{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Pandas for Exploratory Data Analysis I \n",
    "by [@josephofiowa](https://twitter.com/josephofiowa)\n",
    "\n",
    "Pandas is the most prominent Python library for exploratory data analysis (EDA). The functions Pandas supports are integral to understanding, formatting, and preparing our data. Formally, we use Pandas to investigate, wrangle, munge, and clean our data. Pandas is the Swiss Army Knife of data manipulation!\n",
    "\n",
    "\n",
    "We'll have two coding-heavy sessions on Pandas. In this one, we'll use Pandas to:\n",
    " - Read in a dataset\n",
    " - Investigate a dataset's integrity\n",
    " - Filter, sort, and manipulate a DataFrame's series"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## About the Dataset: Adventureworks Cycles\n",
    "\n",
    "<img align=\"right\" src=\"http://lh6.ggpht.com/_XjcDyZkJqHg/TPaaRcaysbI/AAAAAAAAAFo/b1U3q-qbTjY/AdventureWorks%20Logo%5B5%5D.png?imgmax=800\">\n",
    "\n",
    "For today's Pandas exercises, we will be using a dataset developed by Microsoft for training purposes in SQL server, known the [Adventureworks Cycles 2014OLTP Database](https://github.com/Microsoft/sql-server-samples/releases/tag/adventureworks). It is based on a fictitious company called Adventure Works Cycles (AWC), a multinational manufacturer and seller of bicycles and accessories. The company is based in Bothell, Washington, USA and has regional sales offices in several countries. We will be looking at a single table from this database, the Production.Product table, which outlines some of the products this company sells. \n",
    "\n",
    "A full data dictionary can be viewed [here](https://www.sqldatadictionary.com/AdventureWorks2014/).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a closer look at the Production.Product table [data dictionary](https://www.sqldatadictionary.com/AdventureWorks2014/Production.Product.html), which is a description of the fields (columns) in the table (the .csv file we will import below):\n",
    "- **ProductID** - Primary key for Product records.\n",
    "- **Name** - Name of the product.\n",
    "- **ProductNumber** - Unique product identification number.\n",
    "- **MakeFlag** - 0 = Product is purchased, 1 = Product is manufactured in-house.\n",
    "- **FinishedGoodsFlag** - 0 = Product is not a salable item. 1 = Product is salable.\n",
    "- **Color** - Product color.\n",
    "- **SafetyStockLevel** - Minimum inventory quantity.\n",
    "- **ReorderPoint** - Inventory level that triggers a purchase order or work order.\n",
    "- **StandardCost** - Standard cost of the product.\n",
    "- **ListPrice** - Selling price.\n",
    "- **Size** - Product size.\n",
    "- **SizeUnitMeasureCode** - Unit of measure for the Size column.\n",
    "- **WeightUnitMeasureCode** - Unit of measure for the Weight column.\n",
    "- **DaysToManufacture** - Number of days required to manufacture the product.\n",
    "- **ProductLine** - R = Road, M = Mountain, T = Touring, S = Standard\n",
    "- **Class** - H = High, M = Medium, L = Low\n",
    "- **Style** - W = Womens, M = Mens, U = Universal\n",
    "- **ProductSubcategoryID** - Product is a member of this product subcategory. Foreign key to ProductSubCategory.ProductSubCategoryID.\n",
    "- **ProductModelID** - Product is a member of this product model. Foreign key to ProductModel.ProductModelID.\n",
    "- **SellStartDate** - Date the product was available for sale.\n",
    "- **SellEndDate** - Date the product was no longer available for sale.\n",
    "- **DiscontinuedDate** - Date the product was discontinued.\n",
    "- **rowguid** - ROWGUIDCOL number uniquely identifying the record. Used to support a merge replication sample.\n",
    "- **ModifiedDate** - Date and time the record was last updated.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Importing Pandas\n",
    "\n",
    "To [import a library](https://docs.python.org/3/reference/import.html), we write `import` and the library name. For Pandas, is it common to name the library `pd` so that when we reference a function from the Pandas library, we only write `pd` to reference the aliased [namespace](https://docs.python.org/3/tutorial/classes.html#python-scopes-and-namespaces) -- not `pandas`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "I am using pandas Version: 1.2.1.\n",
      "It is installed at: ['C:\\\\Users\\\\Andrew\\\\Anaconda3\\\\lib\\\\site-packages\\\\pandas']\n"
     ]
    }
   ],
   "source": [
    "# we can see the details about the imported package by referencing its private class propertys:\n",
    "print(f'I am using {pd.__name__} \\\n",
    "Version: {pd.__version__}.\\n\\\n",
    "It is installed at: {pd.__path__}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reading in Data\n",
    "\n",
    "Pandas dramatically simplifies the process of reading in data. When we say \"reading in data,\" we mean loading a file into our machine's memory.\n",
    "\n",
    "When you have a CSV, for example, and then you double-click to open it in Microsoft Excel, the open file is \"read into memory.\" You can now manipulate the CSV.\n",
    "\n",
    "When we read data into memory in Python, we are creating an object. We will soon explore this object. _(And, as an aside, when we have a file that is greater than the size of our computer's memory, this is approaching \"Big Data.\")_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because we are working with a CSV, we will use the [read CSV](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) method.<br>A [delimiter](https://en.wikipedia.org/wiki/Delimiter-separated_values) is a character that separates fields (columns) in the imported file. Just because a file says `.csv` does not necessarily mean that a comma is used as the delimiter. In this case, we have a tab character as the delimiter for our columns, so we will be using `sep='\\t'` to tell pandas to 'cut' the columns every time it sees a [tab character in the file](http://vim.wikia.com/wiki/Showing_the_ASCII_value_of_the_current_character)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "prod = pd.read_csv('../data/Production.Product.csv', sep='\\t')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ProductID</th>\n",
       "      <th>Name</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1 rows × 25 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   ProductID             Name ProductNumber  MakeFlag  FinishedGoodsFlag  \\\n",
       "0          1  Adjustable Race       AR-5381         0                  0   \n",
       "\n",
       "  Color  SafetyStockLevel  ReorderPoint  StandardCost  ListPrice  ...  \\\n",
       "0   NaN              1000           750           0.0        0.0  ...   \n",
       "\n",
       "  ProductLine Class Style  ProductSubcategoryID  ProductModelID  \\\n",
       "0         NaN   NaN   NaN                   NaN             NaN   \n",
       "\n",
       "         SellStartDate SellEndDate DiscontinuedDate  \\\n",
       "0  2008-04-30 00:00:00         NaN              NaN   \n",
       "\n",
       "                                  rowguid                   ModifiedDate  \n",
       "0  {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}  2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[1 rows x 25 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod.head(1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Documentation Pause*\n",
    "\n",
    "How did we know how to use `pd.read_csv`? Let's take a look at the [documentation](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html). Note the first argument required (`filepath`).\n",
    "> Take a moment to dissect other arguments and options when reading in data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have just created a data structure called a `DataFrame`. See?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.frame.DataFrame"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(prod)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Inspecting our DataFrame: The basics\n",
    "\n",
    "We'll now perform basic operations on the DataFrame, denoted with comments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ProductID</th>\n",
       "      <th>Name</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{58AE3C20-4F3A-4749-A7D4-D568806CC537}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>800</td>\n",
       "      <td>600</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 25 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   ProductID             Name ProductNumber  MakeFlag  FinishedGoodsFlag  \\\n",
       "0          1  Adjustable Race       AR-5381         0                  0   \n",
       "1          2     Bearing Ball       BA-8327         0                  0   \n",
       "2          3  BB Ball Bearing       BE-2349         1                  0   \n",
       "\n",
       "  Color  SafetyStockLevel  ReorderPoint  StandardCost  ListPrice  ...  \\\n",
       "0   NaN              1000           750           0.0        0.0  ...   \n",
       "1   NaN              1000           750           0.0        0.0  ...   \n",
       "2   NaN               800           600           0.0        0.0  ...   \n",
       "\n",
       "  ProductLine Class Style  ProductSubcategoryID  ProductModelID  \\\n",
       "0         NaN   NaN   NaN                   NaN             NaN   \n",
       "1         NaN   NaN   NaN                   NaN             NaN   \n",
       "2         NaN   NaN   NaN                   NaN             NaN   \n",
       "\n",
       "         SellStartDate SellEndDate DiscontinuedDate  \\\n",
       "0  2008-04-30 00:00:00         NaN              NaN   \n",
       "1  2008-04-30 00:00:00         NaN              NaN   \n",
       "2  2008-04-30 00:00:00         NaN              NaN   \n",
       "\n",
       "                                  rowguid                   ModifiedDate  \n",
       "0  {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}  2014-02-08 10:01:36.827000000  \n",
       "1  {58AE3C20-4F3A-4749-A7D4-D568806CC537}  2014-02-08 10:01:36.827000000  \n",
       "2  {9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}  2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 25 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# print the first and last 3 rows\n",
    "prod.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice that `.head()` is a method (denoted by parantheses), so it takes arguments."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Class Question:** \n",
    "- What do you think changes if we pass a different number `head()` argument?\n",
    "- How would we print the last 5 rows?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(504, 25)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# identify the shape (rows by columns)\n",
    "prod.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we have 504 rows, and 25 columns. This is a tuple, so we can extract the parts using indexing:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "504"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# print the number of rows\n",
    "prod.shape[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using the index\n",
    "An [index](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.html) is an immutable ndarray implementing an ordered, sliceable set. The basic object storing axis labels for all pandas objects. Think of it as a 'row address' for your data frame (table). It is best practice to explicitly set the index of your dataframe, as these are commonly used as [primary keys](https://en.wikipedia.org/wiki/Primary_key) which can be used to [join](https://www.w3schools.com/sql/sql_join.asp) your dataframe to other dataframes.\n",
    "\n",
    "The dataframe can store different types (int, string, datetime), and when importing data will automatically assign a number to each row, starting at zero and counting up. You can overwrite this, which is what we are going to do."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "RangeIndex(start=0, stop=504, step=1)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# displaying the index as it sits (auto-generated upon import)\n",
    "prod.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "None\n"
     ]
    }
   ],
   "source": [
    "# also note that our auto-generated index has no name\n",
    "print(prod.index.name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ProductID</th>\n",
       "      <th>Name</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Adjustable Race</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Bearing Ball</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>BB Ball Bearing</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   ProductID             Name\n",
       "0          1  Adjustable Race\n",
       "1          2     Bearing Ball\n",
       "2          3  BB Ball Bearing"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Here we are looking at three columns;\n",
    "# the one on the left is the index (automatically generated upon import by pandas)\n",
    "# 'ProductID' is our PK (primary key) from our imported table. 'Name' is a data column.\n",
    "# Notice that the generated index starts at zero and our PK starts at 1.\n",
    "pd.DataFrame(prod.head(3)[['ProductID', 'Name']])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Name\n",
       "ProductID                 \n",
       "1          Adjustable Race\n",
       "2             Bearing Ball\n",
       "3          BB Ball Bearing"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Setting the index overwrites the automatically generated index\n",
    "# with our PK, which resided in the 'ProductID' column.\n",
    "prod.set_index('ProductID', inplace=True)\n",
    "prod.head(3)[['Name']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Int64Index([  1,   2,   3,   4, 316, 317, 318, 319, 320, 321,\n",
       "            ...\n",
       "            990, 991, 992, 993, 994, 995, 996, 997, 998, 999],\n",
       "           dtype='int64', name='ProductID', length=504)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Note how our index property has changed as a result\n",
    "prod.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ProductID\n"
     ]
    }
   ],
   "source": [
    "# And our index has also inherited the name of our 'ProductID' column\n",
    "print(prod.index.name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Column headers and datatypes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Name', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color',\n",
       "       'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size',\n",
       "       'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight',\n",
       "       'DaysToManufacture', 'ProductLine', 'Class', 'Style',\n",
       "       'ProductSubcategoryID', 'ProductModelID', 'SellStartDate',\n",
       "       'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# print the columns\n",
    "prod.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>DataType</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Name</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductNumber</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MakeFlag</th>\n",
       "      <td>int64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <td>int64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Color</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <td>int64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ReorderPoint</th>\n",
       "      <td>int64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>StandardCost</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ListPrice</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Size</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SizeUnitMeasureCode</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>WeightUnitMeasureCode</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Weight</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>DaysToManufacture</th>\n",
       "      <td>int64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductLine</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Class</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Style</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductModelID</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SellStartDate</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SellEndDate</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <td>float64</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rowguid</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ModifiedDate</th>\n",
       "      <td>object</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      DataType\n",
       "Name                    object\n",
       "ProductNumber           object\n",
       "MakeFlag                 int64\n",
       "FinishedGoodsFlag        int64\n",
       "Color                   object\n",
       "SafetyStockLevel         int64\n",
       "ReorderPoint             int64\n",
       "StandardCost           float64\n",
       "ListPrice              float64\n",
       "Size                    object\n",
       "SizeUnitMeasureCode     object\n",
       "WeightUnitMeasureCode   object\n",
       "Weight                 float64\n",
       "DaysToManufacture        int64\n",
       "ProductLine             object\n",
       "Class                   object\n",
       "Style                   object\n",
       "ProductSubcategoryID   float64\n",
       "ProductModelID         float64\n",
       "SellStartDate           object\n",
       "SellEndDate             object\n",
       "DiscontinuedDate       float64\n",
       "rowguid                 object\n",
       "ModifiedDate            object"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# examine the datatypes of the columns\n",
    "# note that these were automatically inferred by pandas upon import!\n",
    "pd.DataFrame(prod.dtypes, columns=['DataType'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Class Question:** Why do datatypes matter? What operations could we perform on some datatypes that we could not on others? Note the importance of this in checking dataset integrity."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting a Column\n",
    "\n",
    "We can select columns in two ways. Either we treat the column as an attribute of the DataFrame or we index the DataFrame for a specific element (in this case, the element is a column name)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "If I use SINGLE brackets, pandas returns a <class 'pandas.core.series.Series'> \n",
      "If I use DOUBLE brackets, pandas returns a <class 'pandas.core.frame.DataFrame'>\n"
     ]
    }
   ],
   "source": [
    "print('If I use SINGLE brackets, pandas returns a', \n",
    "      type(prod['Name']),\n",
    "      '\\nIf I use DOUBLE brackets, pandas returns a',\n",
    "      type(prod[['Name']]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ProductID\n",
       "1    Adjustable Race\n",
       "2       Bearing Ball\n",
       "3    BB Ball Bearing\n",
       "Name: Name, dtype: object"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# select the Name column, Series object\n",
    "prod['Name'].head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Name\n",
       "ProductID                 \n",
       "1          Adjustable Race\n",
       "2             Bearing Ball\n",
       "3          BB Ball Bearing"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# select the Name column, DataFrame object\n",
    "prod[['Name']].head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>ProductNumber</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Name ProductNumber\n",
       "ProductID                               \n",
       "1          Adjustable Race       AR-5381\n",
       "2             Bearing Ball       BA-8327\n",
       "3          BB Ball Bearing       BE-2349"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# selecting > 1 column (must use double brackets!)\n",
    "prod[['Name', 'ProductNumber']].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Class Question:** What if we wanted to select a column that has a space in it? Which method from the above two would we use? Why?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## loc and iloc\n",
    "\n",
    "`loc` and `iloc` are ways to select multiple rows and columns _at the same time_. \n",
    "- `loc` uses label-based selection (the index values and column names)\n",
    "- `iloc` used position-based selection (the position of the row/column within the df)\n",
    "- For each of them, you specify the rows you want first, followed by the columns. The row values are required, the columns are not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      Name ProductNumber  MakeFlag  FinishedGoodsFlag Color\n",
       "ProductID                                                                  \n",
       "1          Adjustable Race       AR-5381         0                  0   NaN\n",
       "2             Bearing Ball       BA-8327         0                  0   NaN\n",
       "3          BB Ball Bearing       BE-2349         1                  0   NaN"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# use loc to select columns bewteen 'Name' and `Color, and rows with index value 1-3\n",
    "# Note that both endpoints are included for the row and column ranges; this is slightly different than the Python range() function or list slicing\n",
    "prod.loc[1:3, 'Name':'Color']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>{58AE3C20-4F3A-4749-A7D4-D568806CC537}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>{9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>{ECFED6CB-51FF-49B5-B06C-7D8AC834DB8B}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>316</th>\n",
       "      <td>{E73E9750-603B-4131-89F5-3DD15ED5FF80}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "1          {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}   \n",
       "2          {58AE3C20-4F3A-4749-A7D4-D568806CC537}   \n",
       "3          {9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}   \n",
       "4          {ECFED6CB-51FF-49B5-B06C-7D8AC834DB8B}   \n",
       "316        {E73E9750-603B-4131-89F5-3DD15ED5FF80}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "1          2014-02-08 10:01:36.827000000  \n",
       "2          2014-02-08 10:01:36.827000000  \n",
       "3          2014-02-08 10:01:36.827000000  \n",
       "4          2014-02-08 10:01:36.827000000  \n",
       "316        2014-02-08 10:01:36.827000000  "
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# use iloc to get the first 5 rows and the last two columns\n",
    "# The 0 at the start of the row range is optional\n",
    "# iloc *does* work like list slicing in that the endpoint of the range is excluded\n",
    "prod.iloc[0:5, -2:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Renaming Columns\n",
    "\n",
    "Perhaps we want to rename our columns. There are a few options for doing this.\n",
    "\n",
    "Renaming **specific** columns by using a dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ProductName</th>\n",
       "      <th>Number</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{58AE3C20-4F3A-4749-A7D4-D568806CC537}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>800</td>\n",
       "      <td>600</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "               ProductName   Number  MakeFlag  FinishedGoodsFlag Color  \\\n",
       "ProductID                                                                \n",
       "1          Adjustable Race  AR-5381         0                  0   NaN   \n",
       "2             Bearing Ball  BA-8327         0                  0   NaN   \n",
       "3          BB Ball Bearing  BE-2349         1                  0   NaN   \n",
       "\n",
       "           SafetyStockLevel  ReorderPoint  StandardCost  ListPrice Size  ...  \\\n",
       "ProductID                                                                ...   \n",
       "1                      1000           750           0.0        0.0  NaN  ...   \n",
       "2                      1000           750           0.0        0.0  NaN  ...   \n",
       "3                       800           600           0.0        0.0  NaN  ...   \n",
       "\n",
       "          ProductLine Class  Style  ProductSubcategoryID ProductModelID  \\\n",
       "ProductID                                                                 \n",
       "1                 NaN   NaN    NaN                   NaN            NaN   \n",
       "2                 NaN   NaN    NaN                   NaN            NaN   \n",
       "3                 NaN   NaN    NaN                   NaN            NaN   \n",
       "\n",
       "                 SellStartDate SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                                      \n",
       "1          2008-04-30 00:00:00         NaN               NaN   \n",
       "2          2008-04-30 00:00:00         NaN               NaN   \n",
       "3          2008-04-30 00:00:00         NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "1          {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}   \n",
       "2          {58AE3C20-4F3A-4749-A7D4-D568806CC537}   \n",
       "3          {9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "1          2014-02-08 10:01:36.827000000  \n",
       "2          2014-02-08 10:01:36.827000000  \n",
       "3          2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# rename one or more columns with a dictionary. Note: inplace=False will return a new dataframe,\n",
    "# but leave the original dataframe untouched. Change this to True to modify the original dataframe.\n",
    "prod.rename(columns={'Name': 'ProductName', 'ProductNumber':'Number'}, inplace=False).head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Renaming **ALL** columns with a new list of column names.\n",
    "\n",
    "Note that the `pd.DataFrame.columns` property can be cast to a `list` type. Originally, it's a `pd.core.indexes.base.Index` object:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "My columns look like:\n",
      " Index(['Name', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color',\n",
      "       'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size',\n",
      "       'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight',\n",
      "       'DaysToManufacture', 'ProductLine', 'Class', 'Style',\n",
      "       'ProductSubcategoryID', 'ProductModelID', 'SellStartDate',\n",
      "       'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate'],\n",
      "      dtype='object')\n",
      "\n",
      "And the type is:\n",
      " <class 'pandas.core.indexes.base.Index'>\n"
     ]
    }
   ],
   "source": [
    "print('My columns look like:\\n', prod.columns)\n",
    "print('\\nAnd the type is:\\n', type(prod.columns))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can explicitly cast these to a list object as such, by using the built-in `list()` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Now my columns look like:\n",
      " ['Name', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color', 'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size', 'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight', 'DaysToManufacture', 'ProductLine', 'Class', 'Style', 'ProductSubcategoryID', 'ProductModelID', 'SellStartDate', 'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate']\n",
      "\n",
      "And are of type:\n",
      " <class 'list'>\n"
     ]
    }
   ],
   "source": [
    "print('Now my columns look like:\\n', list(prod.columns))\n",
    "print('\\nAnd are of type:\\n', type(list(prod.columns)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can place these columns into a variable, `cols`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['Name', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color', 'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size', 'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight', 'DaysToManufacture', 'ProductLine', 'Class', 'Style', 'ProductSubcategoryID', 'ProductModelID', 'SellStartDate', 'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate']\n"
     ]
    }
   ],
   "source": [
    "# declare a list of strings - these strings will become the new column names\n",
    "cols = list(prod.columns)\n",
    "print(cols)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use list indexing to mutate the columns we want:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['NewName', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color', 'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size', 'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight', 'DaysToManufacture', 'ProductLine', 'Class', 'Style', 'ProductSubcategoryID', 'ProductModelID', 'SellStartDate', 'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate']\n"
     ]
    }
   ],
   "source": [
    "cols[0] = 'NewName'\n",
    "print(cols)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we can set the `pd.DataFrame.columns` property (this is a settable class property), to the new `cols` list, overwriting the existing columns header names:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Index(['NewName', 'ProductNumber', 'MakeFlag', 'FinishedGoodsFlag', 'Color',\n",
      "       'SafetyStockLevel', 'ReorderPoint', 'StandardCost', 'ListPrice', 'Size',\n",
      "       'SizeUnitMeasureCode', 'WeightUnitMeasureCode', 'Weight',\n",
      "       'DaysToManufacture', 'ProductLine', 'Class', 'Style',\n",
      "       'ProductSubcategoryID', 'ProductModelID', 'SellStartDate',\n",
      "       'SellEndDate', 'DiscontinuedDate', 'rowguid', 'ModifiedDate'],\n",
      "      dtype='object')\n"
     ]
    }
   ],
   "source": [
    "# Note that our first column name has changed from 'Name' to 'NewName'\n",
    "prod.columns = cols\n",
    "print(prod.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   NewName\n",
       "ProductID                 \n",
       "1          Adjustable Race\n",
       "2             Bearing Ball\n",
       "3          BB Ball Bearing"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod.head(3)[['NewName']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Common Column Operations\n",
    "\n",
    "While this is non-comprehensive, these are a few key column-specific data checks.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Descriptive statistics:**  the minimum, first quartile, median, third quartile, and maximum.\n",
    "\n",
    "(And more! The mean too.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Five Number Summary (all assumes numeric data):\n",
    "- **Min:** The smallest value in the column\n",
    "- **Max:** The largest value in the column\n",
    "- **Quartile:** A quartile is one fourth of our data\n",
    "    - **First quartile:** This is the bottom most 25 percent\n",
    "    - **Median:** The middle value. (Line all values biggest to smallest - median is the middle!) Also the 50th percentile\n",
    "    - **Third quartile:** This the the top 75 percentile of our data\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](https://www.mathsisfun.com/data/images/quartiles-a.svg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>StandardCost</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>504.000000</td>\n",
       "      <td>504.000000</td>\n",
       "      <td>504.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>0.474206</td>\n",
       "      <td>535.150794</td>\n",
       "      <td>258.602961</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>0.499830</td>\n",
       "      <td>374.112954</td>\n",
       "      <td>461.632808</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>4.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>100.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>0.000000</td>\n",
       "      <td>500.000000</td>\n",
       "      <td>23.372200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>1000.000000</td>\n",
       "      <td>317.075825</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>1.000000</td>\n",
       "      <td>1000.000000</td>\n",
       "      <td>2171.294200</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         MakeFlag  SafetyStockLevel  StandardCost\n",
       "count  504.000000        504.000000    504.000000\n",
       "mean     0.474206        535.150794    258.602961\n",
       "std      0.499830        374.112954    461.632808\n",
       "min      0.000000          4.000000      0.000000\n",
       "25%      0.000000        100.000000      0.000000\n",
       "50%      0.000000        500.000000     23.372200\n",
       "75%      1.000000       1000.000000    317.075825\n",
       "max      1.000000       1000.000000   2171.294200"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# note - describe *default* only checks numeric datatypes\n",
    "prod[['MakeFlag', 'SafetyStockLevel', 'StandardCost']].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Value Counts:** `pd.Series.value_counts()` count the occurrence of each value within our series."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Black           93\n",
       "Silver          43\n",
       "Red             38\n",
       "Yellow          36\n",
       "Blue            26\n",
       "Multi            8\n",
       "Silver/Black     7\n",
       "White            4\n",
       "Grey             1\n",
       "Name: Color, dtype: int64"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# show the most popular product colors (aggregated by count, descending by default)\n",
    "prod['Color'].value_counts()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Unique values:** Determine the number of distinct values within a given series."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([nan, 'Black', 'Silver', 'Red', 'White', 'Blue', 'Multi', 'Yellow',\n",
       "       'Grey', 'Silver/Black'], dtype=object)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# What are the unique colors for the products?\n",
    "prod['Color'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "9"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# HOW MANY distinct colors are there?\n",
    "prod['Color'].nunique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# We can also include nulls with .nunique() as such:\n",
    "prod['Color'].nunique(dropna=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Filtering on a Single Condition\n",
    "\n",
    "Filtering and sorting are key processes that allow us to drill into the 'nitty gritty' and cross sections of our dataset.\n",
    "\n",
    "To filter, we use a process called **Boolean Filtering**, wherein we define a Boolean condition, and use that Boolean condition to filer on our DataFrame."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall: our given dataset has a column `Color`. Let's see if we can find all products that are `Black`. Let's take a look at the first 10 rows of the dataframe to see how it looks as-is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ProductID\n",
       "1         NaN\n",
       "2         NaN\n",
       "3         NaN\n",
       "4         NaN\n",
       "316       NaN\n",
       "317     Black\n",
       "318     Black\n",
       "319     Black\n",
       "320    Silver\n",
       "321    Silver\n",
       "Name: Color, dtype: object"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod['Color'].head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By applying a `boolean mask` to this dataframe, `== 'Black'`, we can get the following:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ProductID\n",
       "1      False\n",
       "2      False\n",
       "3      False\n",
       "4      False\n",
       "316    False\n",
       "317     True\n",
       "318     True\n",
       "319     True\n",
       "320    False\n",
       "321    False\n",
       "Name: Color, dtype: bool"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod['Color'].head(10) == 'Black'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can use that 'mask' from above, and apply it to our full dataframe. Every time we have a `True` in a row, we return the row. If we have a `False` in that row, we do not return it. The result is a dataframe that only has rows where `Color` is `Black`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>317</th>\n",
       "      <td>LL Crankarm</td>\n",
       "      <td>CA-5965</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>L</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{3C9D10B7-A6B2-4774-9963-C19DCEE72FEA}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>318</th>\n",
       "      <td>ML Crankarm</td>\n",
       "      <td>CA-6738</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>M</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{EABB9A92-FA07-4EAB-8955-F0517B4A4CA7}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>319</th>\n",
       "      <td>HL Crankarm</td>\n",
       "      <td>CA-7457</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{7D3FD384-4F29-484B-86FA-4206E276FE58}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "               NewName ProductNumber  MakeFlag  FinishedGoodsFlag  Color  \\\n",
       "ProductID                                                                  \n",
       "317        LL Crankarm       CA-5965         0                  0  Black   \n",
       "318        ML Crankarm       CA-6738         0                  0  Black   \n",
       "319        HL Crankarm       CA-7457         0                  0  Black   \n",
       "\n",
       "           SafetyStockLevel  ReorderPoint  StandardCost  ListPrice Size  ...  \\\n",
       "ProductID                                                                ...   \n",
       "317                     500           375           0.0        0.0  NaN  ...   \n",
       "318                     500           375           0.0        0.0  NaN  ...   \n",
       "319                     500           375           0.0        0.0  NaN  ...   \n",
       "\n",
       "          ProductLine Class  Style  ProductSubcategoryID ProductModelID  \\\n",
       "ProductID                                                                 \n",
       "317               NaN    L     NaN                   NaN            NaN   \n",
       "318               NaN    M     NaN                   NaN            NaN   \n",
       "319               NaN   NaN    NaN                   NaN            NaN   \n",
       "\n",
       "                 SellStartDate SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                                      \n",
       "317        2008-04-30 00:00:00         NaN               NaN   \n",
       "318        2008-04-30 00:00:00         NaN               NaN   \n",
       "319        2008-04-30 00:00:00         NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "317        {3C9D10B7-A6B2-4774-9963-C19DCEE72FEA}   \n",
       "318        {EABB9A92-FA07-4EAB-8955-F0517B4A4CA7}   \n",
       "319        {7D3FD384-4F29-484B-86FA-4206E276FE58}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "317        2014-02-08 10:01:36.827000000  \n",
       "318        2014-02-08 10:01:36.827000000  \n",
       "319        2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod[prod['Color'] == 'Black'].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's calculate the **average ListPrice** for the **salable products**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Think: What are the component parts of this problem?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>680</th>\n",
       "      <td>HL Road Frame - Black, 58</td>\n",
       "      <td>FR-R92B-58</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>1059.3100</td>\n",
       "      <td>1431.50</td>\n",
       "      <td>58</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>14.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{43DD68D6-14A4-461F-9069-55309D90EA7E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>706</th>\n",
       "      <td>HL Road Frame - Red, 58</td>\n",
       "      <td>FR-R92R-58</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Red</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>1059.3100</td>\n",
       "      <td>1431.50</td>\n",
       "      <td>58</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>14.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{9540FF17-2712-4C90-A3D1-8CE5568B2462}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>707</th>\n",
       "      <td>Sport-100 Helmet, Red</td>\n",
       "      <td>HL-U509-R</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>Red</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>13.0863</td>\n",
       "      <td>34.99</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>S</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>31.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{2E1EF41A-C08A-4FF6-8ADA-BDE58B64A712}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                             NewName ProductNumber  MakeFlag  \\\n",
       "ProductID                                                      \n",
       "680        HL Road Frame - Black, 58    FR-R92B-58         1   \n",
       "706          HL Road Frame - Red, 58    FR-R92R-58         1   \n",
       "707            Sport-100 Helmet, Red     HL-U509-R         0   \n",
       "\n",
       "           FinishedGoodsFlag  Color  SafetyStockLevel  ReorderPoint  \\\n",
       "ProductID                                                             \n",
       "680                        1  Black               500           375   \n",
       "706                        1    Red               500           375   \n",
       "707                        1    Red                 4             3   \n",
       "\n",
       "           StandardCost  ListPrice Size  ... ProductLine Class  Style  \\\n",
       "ProductID                                ...                            \n",
       "680           1059.3100    1431.50   58  ...          R     H      U    \n",
       "706           1059.3100    1431.50   58  ...          R     H      U    \n",
       "707             13.0863      34.99  NaN  ...          S    NaN    NaN   \n",
       "\n",
       "           ProductSubcategoryID ProductModelID        SellStartDate  \\\n",
       "ProductID                                                             \n",
       "680                        14.0            6.0  2008-04-30 00:00:00   \n",
       "706                        14.0            6.0  2008-04-30 00:00:00   \n",
       "707                        31.0           33.0  2011-05-31 00:00:00   \n",
       "\n",
       "          SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                 \n",
       "680               NaN               NaN   \n",
       "706               NaN               NaN   \n",
       "707               NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "680        {43DD68D6-14A4-461F-9069-55309D90EA7E}   \n",
       "706        {9540FF17-2712-4C90-A3D1-8CE5568B2462}   \n",
       "707        {2E1EF41A-C08A-4FF6-8ADA-BDE58B64A712}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "680        2014-02-08 10:01:36.827000000  \n",
       "706        2014-02-08 10:01:36.827000000  \n",
       "707        2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# First, we need to get salable items. Use your data dictionary from the beginning of this lesson.\n",
    "prod[prod['FinishedGoodsFlag'] == 1].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we need to find average list price of those above items. Let's just get the 'ListPrice' column for starters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ProductID\n",
       "680    1431.50\n",
       "706    1431.50\n",
       "707      34.99\n",
       "Name: ListPrice, dtype: float64"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod[prod['FinishedGoodsFlag'] == 1]['ListPrice'].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the average of that column, just take `.mean()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "744.595220338982"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod[prod['FinishedGoodsFlag'] == 1]['ListPrice'].mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can take a shortcut and just use `.describe()` here:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count     295.000000\n",
       "mean      744.595220\n",
       "std       892.563172\n",
       "min         2.290000\n",
       "25%        66.745000\n",
       "50%       337.220000\n",
       "75%      1100.240000\n",
       "max      3578.270000\n",
       "Name: ListPrice, dtype: float64"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prod[prod['FinishedGoodsFlag'] == 1]['ListPrice'].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Sneak peek**: Another handy trick is to use `.hist()` to get a distribution of a continuous variable - in this case, `ListPrice`. We'll cover this more in future lessons:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x1ba1bfe8ac8>"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD6CAYAAABamQdMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAASMElEQVR4nO3dfYxld33f8fenXmwek12zY+rs2tl1tIQ4iII1WG5JEcUJ2CZiXclIi6qyBUurBkPJk8K6SHXyB5KdtqFFTYk28cZLigyuA7VVSJOtMbUi4XXHxg9rFuPFdu3BG+8gYycpksHwzR/3DNyM787DvXPvzPx4v6TRPfd3fveez5y5+9kz5z5MqgpJUlv+wVoHkCStPstdkhpkuUtSgyx3SWqQ5S5JDbLcJalBS5Z7koNJTiY5umD8g0keSvJgkt/tG786yfFu3dvHEVqStLhNy5hzA/BfgE/ODyT5Z8Bu4HVV9VySs7rx84E9wM8DPwX87ySvrqrvL7aBrVu31o4dO4b6BiTpx9Xdd9/9raqaGrRuyXKvqjuS7Fgw/CvAtVX1XDfnZDe+G/h0N/5okuPAhcCXF9vGjh07mJmZWSqKJKlPkv93qnXDnnN/NfBPkxxJ8n+SvLEb3wY80TdvthuTJE3Qck7LnOp2W4CLgDcCNyU5D8iAuQM/3yDJPmAfwLnnnjtkDEnSIMMeuc8Cn62eu4AfAFu78XP65m0Hnhx0B1V1oKqmq2p6amrgKSNJ0pCGLff/AbwVIMmrgdOBbwG3AnuSnJFkJ7ALuGs1gkqSlm/J0zJJbgTeAmxNMgtcAxwEDnYvj/wusLd6Hy/5YJKbgK8CzwNXLfVKGUnS6st6+Mjf6enp8tUykrQySe6uqulB63yHqiQ1yHKXpAZZ7pLUoGFf575u7Nj/+TXb9mPXvmPNti1Ji/HIXZIaZLlLUoMsd0lqkOUuSQ2y3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDLHdJapDlLkkNstwlqUGWuyQ1aMlyT3Iwycnu76UuXPebSSrJ1u56knw8yfEk9ye5YByhJUmLW86R+w3AJQsHk5wD/BLweN/wpcCu7msf8InRI0qSVmrJcq+qO4CnB6z6GPBbQP9f2N4NfLJ67gQ2Jzl7VZJKkpZtqHPuSd4JfLOq7luwahvwRN/12W5MkjRBK/4ze0leCnwEeNug1QPGasAYSfbRO3XDueeeu9IYkqRFDHPk/jPATuC+JI8B24F7kvxDekfq5/TN3Q48OehOqupAVU1X1fTU1NQQMSRJp7Licq+qB6rqrKraUVU76BX6BVX1V8CtwHu6V81cBDxbVSdWN7IkaSnLeSnkjcCXgZ9NMpvkykWmfwF4BDgO/CHw/lVJKUlakSXPuVfVu5dYv6NvuYCrRo8lSRqF71CVpAZZ7pLUIMtdkhpkuUtSgyx3SWqQ5S5JDbLcJalBlrskNchyl6QGWe6S1CDLXZIaZLlLUoMsd0lqkOUuSQ2y3CWpQZa7JDXIcpekBlnuktSg5fwN1YNJTiY52jf275N8Lcn9ST6XZHPfuquTHE/yUJK3jyu4JOnUlnPkfgNwyYKxw8Brq+p1wNeBqwGSnA/sAX6+u81/TXLaqqWVJC3LkuVeVXcATy8Y+4uqer67eiewvVveDXy6qp6rqkeB48CFq5hXkrQMq3HO/X3An3XL24An+tbNdmMvkGRfkpkkM3Nzc6sQQ5I0b6RyT/IR4HngU/NDA6bVoNtW1YGqmq6q6ampqVFiSJIW2DTsDZPsBX4ZuLiq5gt8Fjinb9p24Mnh40mShjHUkXuSS4APA++squ/0rboV2JPkjCQ7gV3AXaPHlCStxJJH7kluBN4CbE0yC1xD79UxZwCHkwDcWVX/uqoeTHIT8FV6p2uuqqrvjyu8JGmwJcu9qt49YPj6ReZ/FPjoKKEkSaPxHaqS1CDLXZIaZLlLUoMsd0lqkOUuSQ2y3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDLHdJapDlLkkNstwlqUGWuyQ1yHKXpAZZ7pLUIMtdkhq0ZLknOZjkZJKjfWNnJjmc5OHucks3niQfT3I8yf1JLhhneEnSYMs5cr8BuGTB2H7gtqraBdzWXQe4FNjVfe0DPrE6MSVJK7FkuVfVHcDTC4Z3A4e65UPA5X3jn6yeO4HNSc5erbCSpOUZ9pz7q6rqBEB3eVY3vg14om/ebDf2Akn2JZlJMjM3NzdkDEnSIKv9hGoGjNWgiVV1oKqmq2p6ampqlWNI0o+3Ycv9qfnTLd3lyW58Fjinb9524Mnh40mShjFsud8K7O2W9wK39I2/p3vVzEXAs/OnbyRJk7NpqQlJbgTeAmxNMgtcA1wL3JTkSuBx4F3d9C8AlwHHge8A7x1DZknSEpYs96p69ylWXTxgbgFXjRpKkjQa36EqSQ2y3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDLHdJapDlLkkNstwlqUGWuyQ1yHKXpAZZ7pLUIMtdkhpkuUtSgyx3SWqQ5S5JDbLcJalBI5V7kl9L8mCSo0luTPLiJDuTHEnycJLPJDl9tcJKkpZn6HJPsg34N8B0Vb0WOA3YA1wHfKyqdgHfBq5cjaCSpOUb9bTMJuAlSTYBLwVOAG8Fbu7WHwIuH3EbkqQVGrrcq+qbwH8AHqdX6s8CdwPPVNXz3bRZYNug2yfZl2Qmyczc3NywMSRJA4xyWmYLsBvYCfwU8DLg0gFTa9Dtq+pAVU1X1fTU1NSwMSRJA4xyWuYXgUeraq6qvgd8FvgnwObuNA3AduDJETNKklZolHJ/HLgoyUuTBLgY+CpwO3BFN2cvcMtoESVJKzXKOfcj9J44vQd4oLuvA8CHgV9Pchx4JXD9KuSUJK3ApqWnnFpVXQNcs2D4EeDCUe5XkjQa36EqSQ2y3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDLHdJapDlLkkNstwlqUGWuyQ1yHKXpAZZ7pLUIMtdkhpkuUtSgyx3SWqQ5S5JDbLcJalBI5V7ks1Jbk7ytSTHkvzjJGcmOZzk4e5yy2qFlSQtz6hH7v8Z+F9V9RrgHwHHgP3AbVW1C7ituy5JmqChyz3JTwBvBq4HqKrvVtUzwG7gUDftEHD5qCElSSszypH7ecAc8MdJvpLkj5K8DHhVVZ0A6C7PGnTjJPuSzCSZmZubGyGGJGmhUcp9E3AB8ImqegPw/1nBKZiqOlBV01U1PTU1NUIMSdJCo5T7LDBbVUe66zfTK/unkpwN0F2eHC2iJGmlhi73qvor4IkkP9sNXQx8FbgV2NuN7QVuGSmhJGnFNo14+w8Cn0pyOvAI8F56/2HclORK4HHgXSNuQ5K0QiOVe1XdC0wPWHXxKPcrSRqN71CVpAZZ7pLUIMtdkhpkuUtSgyx3SWqQ5S5JDbLcJalBlrskNchyl6QGWe6S1CDLXZIaZLlLUoMsd0lqkOUuSQ2y3CWpQZa7JDXIcpekBlnuktSgkcs9yWlJvpLkf3bXdyY5kuThJJ/p/r6qJGmCVuPI/UPAsb7r1wEfq6pdwLeBK1dhG5KkFRip3JNsB94B/FF3PcBbgZu7KYeAy0fZhiRp5UY9cv9PwG8BP+iuvxJ4pqqe767PAtsG3TDJviQzSWbm5uZGjCFJ6jd0uSf5ZeBkVd3dPzxgag26fVUdqKrpqpqempoaNoYkaYBNI9z2TcA7k1wGvBj4CXpH8puTbOqO3rcDT44eU5K0EkMfuVfV1VW1vap2AHuAL1bVvwBuB67opu0Fbhk5pSRpRcbxOvcPA7+e5Di9c/DXj2EbkqRFjHJa5oeq6kvAl7rlR4ALV+N+JUnD8R2qktQgy12SGrQqp2U0WTv2f36tI0zcY9e+Y60jSBuKR+6S1CDLXZIaZLlLUoMsd0lqkOUuSQ2y3CWpQb4UcgQ/ji9JlLQxeOQuSQ2y3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDhi73JOckuT3JsSQPJvlQN35mksNJHu4ut6xeXEnScozyJqbngd+oqnuSvAK4O8lh4F8Bt1XVtUn2A/vp/V1VSSuwVm+S87Pz2zD0kXtVnaiqe7rlvwGOAduA3cChbtoh4PJRQ0qSVmZVzrkn2QG8ATgCvKqqTkDvPwDgrNXYhiRp+UYu9yQvB/4U+NWq+usV3G5fkpkkM3Nzc6PGkCT1Ganck7yIXrF/qqo+2w0/leTsbv3ZwMlBt62qA1U1XVXTU1NTo8SQJC0wyqtlAlwPHKuq3+tbdSuwt1veC9wyfDxJ0jBGebXMm4B/CTyQ5N5u7N8C1wI3JbkSeBx412gRJUkrNXS5V9VfAjnF6ouHvV9J0uh8h6okNchyl6QGWe6S1CDLXZIa5B/IlvRjby3/2P24PsvHI3dJapDlLkkNstwlqUGWuyQ1yHKXpAb5ahltCC2+mkEaJ4/cJalBlrskNchyl6QGWe6S1CDLXZIaZLlLUoMsd0lq0NjKPcklSR5KcjzJ/nFtR5L0QmMp9ySnAb8PXAqcD7w7yfnj2JYk6YXGdeR+IXC8qh6pqu8CnwZ2j2lbkqQFxlXu24An+q7PdmOSpAkY12fLZMBY/b0JyT5gX3f1b5M8NOS2tgLfGvK2k7ZRsm6UnDCBrLluVe5mw+zTXLdxsrJx9uspc474+PrpU60YV7nPAuf0Xd8OPNk/oaoOAAdG3VCSmaqaHvV+JmGjZN0oOWHjZN0oOcGs47AWOcd1Wub/AruS7ExyOrAHuHVM25IkLTCWI/eqej7JB4A/B04DDlbVg+PYliTphcb2ee5V9QXgC+O6/z4jn9qZoI2SdaPkhI2TdaPkBLOOw8RzpqqWniVJ2lD8+AFJatCGLvf19hEHSR5L8kCSe5PMdGNnJjmc5OHucks3niQf77Lfn+SCMWc7mORkkqN9YyvOlmRvN//hJHsnlPO3k3yz26/3Jrmsb93VXc6Hkry9b3zsj40k5yS5PcmxJA8m+VA3vq726yI5191+TfLiJHclua/L+jvd+M4kR7r985nuhRokOaO7frxbv2Op72HMOW9I8mjfPn19Nz75n31Vbcgvek/UfgM4DzgduA84f40zPQZsXTD2u8D+bnk/cF23fBnwZ/TeE3ARcGTM2d4MXAAcHTYbcCbwSHe5pVveMoGcvw385oC553c/9zOAnd3j4bRJPTaAs4ELuuVXAF/vMq2r/bpIznW3X7t98/Ju+UXAkW5f3QTs6cb/APiVbvn9wB90y3uAzyz2PUwg5w3AFQPmT/xnv5GP3DfKRxzsBg51y4eAy/vGP1k9dwKbk5w9rhBVdQfw9IjZ3g4crqqnq+rbwGHgkgnkPJXdwKer6rmqehQ4Tu9xMZHHRlWdqKp7uuW/AY7Reyf2utqvi+Q8lTXbr92++dvu6ou6rwLeCtzcjS/cp/P7+mbg4iRZ5HsYd85TmfjPfiOX+3r8iIMC/iLJ3em9AxfgVVV1Anr/yICzuvH1kH+l2dYy8we6X2cPzp/mWCTPxHN2pwPeQO8Ibt3u1wU5YR3u1ySnJbkXOEmv7L4BPFNVzw/Y7g8zdeufBV45iawLc1bV/D79aLdPP5bkjIU5F+QZW86NXO5LfsTBGnhTVV1A79Mwr0ry5kXmrsf8806Vba0yfwL4GeD1wAngP3bj6yJnkpcDfwr8alX99WJTB4xNLO+AnOtyv1bV96vq9fTe2X4h8HOLbHfNsi7MmeS1wNXAa4A30jvV8uG1yrmRy33JjziYtKp6srs8CXyO3gPzqfnTLd3lyW76esi/0mxrkrmqnur+If0A+EN+9Ov1mudM8iJ6hfmpqvpsN7zu9uugnOt5v3b5ngG+RO8c9eYk8+/L6d/uDzN163+S3mm9iWXty3lJdwqsquo54I9Zw326kct9XX3EQZKXJXnF/DLwNuBol2n+GfC9wC3d8q3Ae7pn0S8Cnp3/VX6CVprtz4G3JdnS/Qr/tm5srBY8F/HP6e3X+Zx7uldM7AR2AXcxocdGd273euBYVf1e36p1tV9PlXM97tckU0k2d8svAX6R3nMEtwNXdNMW7tP5fX0F8MXqPVN5qu9hnDm/1vefeug9L9C/Tyf7s1+NZ2XX6oveM9Bfp3dO7iNrnOU8es/O3wc8OJ+H3vm/24CHu8sz60fPtv9+l/0BYHrM+W6k96v39+gdLVw5TDbgffSenDoOvHdCOf+ky3F/94/k7L75H+lyPgRcOsnHBvAL9H6Fvh+4t/u6bL3t10Vyrrv9CrwO+EqX6Sjw7/r+fd3V7Z//DpzRjb+4u368W3/eUt/DmHN+sdunR4H/xo9eUTPxn73vUJWkBm3k0zKSpFOw3CWpQZa7JDXIcpekBlnuktQgy12SGmS5S1KDLHdJatDfAeOis0EIh1xDAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "prod[prod['FinishedGoodsFlag'] == 1]['ListPrice'].hist(grid=False, bins=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Filtering on Multiple Conditions\n",
    "\n",
    "Here, we will filter on _multiple conditions_. Before, we filtered on rows where Color was Black. We also filtered where FinishedGoodsFlag was equal to 1. Let's see what happens when we filter on *both* simultaneously. \n",
    "\n",
    "The format for multiple conditions is:\n",
    "\n",
    "`df[ (df['col1'] == value1) & (df['col2'] == value2) ]`\n",
    "\n",
    "Or, more simply:\n",
    "\n",
    "`df[ (CONDITION 1) & (CONDITION 2) ]`\n",
    "\n",
    "Which eventually may evaluate to something like:\n",
    "\n",
    "`df[ True & False ]`\n",
    "\n",
    "...on a row-by-row basis. If the end result is `False`, the row is omitted.\n",
    "\n",
    "_Don't forget parentheses in your conditions!_ This is a common mistake."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>680</th>\n",
       "      <td>HL Road Frame - Black, 58</td>\n",
       "      <td>FR-R92B-58</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>1059.3100</td>\n",
       "      <td>1431.50</td>\n",
       "      <td>58</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>14.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{43DD68D6-14A4-461F-9069-55309D90EA7E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>708</th>\n",
       "      <td>Sport-100 Helmet, Black</td>\n",
       "      <td>HL-U509</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>Black</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>13.0863</td>\n",
       "      <td>34.99</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>S</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>31.0</td>\n",
       "      <td>33.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{A25A44FB-C2DE-4268-958F-110B8D7621E2}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>722</th>\n",
       "      <td>LL Road Frame - Black, 58</td>\n",
       "      <td>FR-R38B-58</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Black</td>\n",
       "      <td>500</td>\n",
       "      <td>375</td>\n",
       "      <td>204.6251</td>\n",
       "      <td>337.22</td>\n",
       "      <td>58</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>L</td>\n",
       "      <td>U</td>\n",
       "      <td>14.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{2140F256-F705-4D67-975D-32DE03265838}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                             NewName ProductNumber  MakeFlag  \\\n",
       "ProductID                                                      \n",
       "680        HL Road Frame - Black, 58    FR-R92B-58         1   \n",
       "708          Sport-100 Helmet, Black       HL-U509         0   \n",
       "722        LL Road Frame - Black, 58    FR-R38B-58         1   \n",
       "\n",
       "           FinishedGoodsFlag  Color  SafetyStockLevel  ReorderPoint  \\\n",
       "ProductID                                                             \n",
       "680                        1  Black               500           375   \n",
       "708                        1  Black                 4             3   \n",
       "722                        1  Black               500           375   \n",
       "\n",
       "           StandardCost  ListPrice Size  ... ProductLine Class  Style  \\\n",
       "ProductID                                ...                            \n",
       "680           1059.3100    1431.50   58  ...          R     H      U    \n",
       "708             13.0863      34.99  NaN  ...          S    NaN    NaN   \n",
       "722            204.6251     337.22   58  ...          R     L      U    \n",
       "\n",
       "           ProductSubcategoryID ProductModelID        SellStartDate  \\\n",
       "ProductID                                                             \n",
       "680                        14.0            6.0  2008-04-30 00:00:00   \n",
       "708                        31.0           33.0  2011-05-31 00:00:00   \n",
       "722                        14.0            9.0  2011-05-31 00:00:00   \n",
       "\n",
       "          SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                 \n",
       "680               NaN               NaN   \n",
       "708               NaN               NaN   \n",
       "722               NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "680        {43DD68D6-14A4-461F-9069-55309D90EA7E}   \n",
       "708        {A25A44FB-C2DE-4268-958F-110B8D7621E2}   \n",
       "722        {2140F256-F705-4D67-975D-32DE03265838}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "680        2014-02-08 10:01:36.827000000  \n",
       "708        2014-02-08 10:01:36.827000000  \n",
       "722        2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Let's look at a table where Color is Black, and FinishedGoodsFlag is 1\n",
    "prod[ (prod['Color'] == 'Black') & (prod['FinishedGoodsFlag'] == 1) ].head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{58AE3C20-4F3A-4749-A7D4-D568806CC537}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>800</td>\n",
       "      <td>600</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                   NewName ProductNumber  MakeFlag  FinishedGoodsFlag Color  \\\n",
       "ProductID                                                                     \n",
       "1          Adjustable Race       AR-5381         0                  0   NaN   \n",
       "2             Bearing Ball       BA-8327         0                  0   NaN   \n",
       "3          BB Ball Bearing       BE-2349         1                  0   NaN   \n",
       "\n",
       "           SafetyStockLevel  ReorderPoint  StandardCost  ListPrice Size  ...  \\\n",
       "ProductID                                                                ...   \n",
       "1                      1000           750           0.0        0.0  NaN  ...   \n",
       "2                      1000           750           0.0        0.0  NaN  ...   \n",
       "3                       800           600           0.0        0.0  NaN  ...   \n",
       "\n",
       "          ProductLine Class  Style  ProductSubcategoryID ProductModelID  \\\n",
       "ProductID                                                                 \n",
       "1                 NaN   NaN    NaN                   NaN            NaN   \n",
       "2                 NaN   NaN    NaN                   NaN            NaN   \n",
       "3                 NaN   NaN    NaN                   NaN            NaN   \n",
       "\n",
       "                 SellStartDate SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                                      \n",
       "1          2008-04-30 00:00:00         NaN               NaN   \n",
       "2          2008-04-30 00:00:00         NaN               NaN   \n",
       "3          2008-04-30 00:00:00         NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "1          {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}   \n",
       "2          {58AE3C20-4F3A-4749-A7D4-D568806CC537}   \n",
       "3          {9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "1          2014-02-08 10:01:36.827000000  \n",
       "2          2014-02-08 10:01:36.827000000  \n",
       "3          2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Here we have an example of a list price of greater than 50, \n",
    "# OR a product size that is not equal to 'XL'\n",
    "\n",
    "prod[ (prod['ListPrice'] > 50) | (prod['Size'] != 'XL') ].head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sorting\n",
    "\n",
    "We can sort one column of our DataFrame as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>749</th>\n",
       "      <td>Road-150 Red, 62</td>\n",
       "      <td>BK-R93R-62</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Red</td>\n",
       "      <td>100</td>\n",
       "      <td>75</td>\n",
       "      <td>2171.2942</td>\n",
       "      <td>3578.27</td>\n",
       "      <td>62</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>2.0</td>\n",
       "      <td>25.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>2012-05-29 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{BC621E1F-2553-4FDC-B22E-5E44A9003569}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>750</th>\n",
       "      <td>Road-150 Red, 44</td>\n",
       "      <td>BK-R93R-44</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Red</td>\n",
       "      <td>100</td>\n",
       "      <td>75</td>\n",
       "      <td>2171.2942</td>\n",
       "      <td>3578.27</td>\n",
       "      <td>44</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>2.0</td>\n",
       "      <td>25.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>2012-05-29 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{C19E1136-5DA4-4B40-8758-54A85D7EA494}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>751</th>\n",
       "      <td>Road-150 Red, 48</td>\n",
       "      <td>BK-R93R-48</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Red</td>\n",
       "      <td>100</td>\n",
       "      <td>75</td>\n",
       "      <td>2171.2942</td>\n",
       "      <td>3578.27</td>\n",
       "      <td>48</td>\n",
       "      <td>...</td>\n",
       "      <td>R</td>\n",
       "      <td>H</td>\n",
       "      <td>U</td>\n",
       "      <td>2.0</td>\n",
       "      <td>25.0</td>\n",
       "      <td>2011-05-31 00:00:00</td>\n",
       "      <td>2012-05-29 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{D10B7CC1-455E-435B-A08F-EC5B1C5776E9}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                    NewName ProductNumber  MakeFlag  FinishedGoodsFlag Color  \\\n",
       "ProductID                                                                      \n",
       "749        Road-150 Red, 62    BK-R93R-62         1                  1   Red   \n",
       "750        Road-150 Red, 44    BK-R93R-44         1                  1   Red   \n",
       "751        Road-150 Red, 48    BK-R93R-48         1                  1   Red   \n",
       "\n",
       "           SafetyStockLevel  ReorderPoint  StandardCost  ListPrice Size  ...  \\\n",
       "ProductID                                                                ...   \n",
       "749                     100            75     2171.2942    3578.27   62  ...   \n",
       "750                     100            75     2171.2942    3578.27   44  ...   \n",
       "751                     100            75     2171.2942    3578.27   48  ...   \n",
       "\n",
       "          ProductLine Class  Style  ProductSubcategoryID ProductModelID  \\\n",
       "ProductID                                                                 \n",
       "749                R     H      U                    2.0           25.0   \n",
       "750                R     H      U                    2.0           25.0   \n",
       "751                R     H      U                    2.0           25.0   \n",
       "\n",
       "                 SellStartDate          SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                                               \n",
       "749        2011-05-31 00:00:00  2012-05-29 00:00:00               NaN   \n",
       "750        2011-05-31 00:00:00  2012-05-29 00:00:00               NaN   \n",
       "751        2011-05-31 00:00:00  2012-05-29 00:00:00               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "749        {BC621E1F-2553-4FDC-B22E-5E44A9003569}   \n",
       "750        {C19E1136-5DA4-4B40-8758-54A85D7EA494}   \n",
       "751        {D10B7CC1-455E-435B-A08F-EC5B1C5776E9}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "749        2014-02-08 10:01:36.827000000  \n",
       "750        2014-02-08 10:01:36.827000000  \n",
       "751        2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[3 rows x 24 columns]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# let's sort by standard cost, descending\n",
    "prod.sort_values(by='StandardCost', ascending=False).head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This one is a little more advanced, but it demonstrates a few things:\n",
    "- Conversion of a `numpy.ndarray` object (return type of `pd.Series.unique()`) into a `pd.Series` object\n",
    "- `pd.Series.sort_values` with the `by=` kwarg omitted (if only one column is the operand, `by=` doesn't need specified\n",
    "- Alphabetical sort of a string field, `ascending=True` means A->Z\n",
    "- Inclusion of nulls, `NaN` in a string field (versus omission with a float/int as prior example)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1           Black\n",
       "5            Blue\n",
       "8            Grey\n",
       "6           Multi\n",
       "3             Red\n",
       "2          Silver\n",
       "9    Silver/Black\n",
       "4           White\n",
       "7          Yellow\n",
       "0             NaN\n",
       "dtype: object"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.Series(prod['Color'].unique()).sort_values(ascending=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Independent Exercises\n",
    "\n",
    "Do your best to complete the following prompts. Don't hesitate to look at code we wrote together!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Print the first 4 rows of the whole DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>NewName</th>\n",
       "      <th>ProductNumber</th>\n",
       "      <th>MakeFlag</th>\n",
       "      <th>FinishedGoodsFlag</th>\n",
       "      <th>Color</th>\n",
       "      <th>SafetyStockLevel</th>\n",
       "      <th>ReorderPoint</th>\n",
       "      <th>StandardCost</th>\n",
       "      <th>ListPrice</th>\n",
       "      <th>Size</th>\n",
       "      <th>...</th>\n",
       "      <th>ProductLine</th>\n",
       "      <th>Class</th>\n",
       "      <th>Style</th>\n",
       "      <th>ProductSubcategoryID</th>\n",
       "      <th>ProductModelID</th>\n",
       "      <th>SellStartDate</th>\n",
       "      <th>SellEndDate</th>\n",
       "      <th>DiscontinuedDate</th>\n",
       "      <th>rowguid</th>\n",
       "      <th>ModifiedDate</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Adjustable Race</td>\n",
       "      <td>AR-5381</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{694215B7-08F7-4C0D-ACB1-D734BA44C0C8}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bearing Ball</td>\n",
       "      <td>BA-8327</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1000</td>\n",
       "      <td>750</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{58AE3C20-4F3A-4749-A7D4-D568806CC537}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>BB Ball Bearing</td>\n",
       "      <td>BE-2349</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>800</td>\n",
       "      <td>600</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Headset Ball Bearings</td>\n",
       "      <td>BE-2908</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>800</td>\n",
       "      <td>600</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>...</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2008-04-30 00:00:00</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>{ECFED6CB-51FF-49B5-B06C-7D8AC834DB8B}</td>\n",
       "      <td>2014-02-08 10:01:36.827000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4 rows × 24 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                         NewName ProductNumber  MakeFlag  FinishedGoodsFlag  \\\n",
       "ProductID                                                                     \n",
       "1                Adjustable Race       AR-5381         0                  0   \n",
       "2                   Bearing Ball       BA-8327         0                  0   \n",
       "3                BB Ball Bearing       BE-2349         1                  0   \n",
       "4          Headset Ball Bearings       BE-2908         0                  0   \n",
       "\n",
       "          Color  SafetyStockLevel  ReorderPoint  StandardCost  ListPrice Size  \\\n",
       "ProductID                                                                       \n",
       "1           NaN              1000           750           0.0        0.0  NaN   \n",
       "2           NaN              1000           750           0.0        0.0  NaN   \n",
       "3           NaN               800           600           0.0        0.0  NaN   \n",
       "4           NaN               800           600           0.0        0.0  NaN   \n",
       "\n",
       "           ... ProductLine Class  Style  ProductSubcategoryID ProductModelID  \\\n",
       "ProductID  ...                                                                 \n",
       "1          ...         NaN   NaN    NaN                   NaN            NaN   \n",
       "2          ...         NaN   NaN    NaN                   NaN            NaN   \n",
       "3          ...         NaN   NaN    NaN                   NaN            NaN   \n",
       "4          ...         NaN   NaN    NaN                   NaN            NaN   \n",
       "\n",
       "                 SellStartDate SellEndDate  DiscontinuedDate  \\\n",
       "ProductID                                                      \n",
       "1          2008-04-30 00:00:00         NaN               NaN   \n",
       "2          2008-04-30 00:00:00         NaN               NaN   \n",
       "3          2008-04-30 00:00:00         NaN               NaN   \n",
       "4          2008-04-30 00:00:00         NaN               NaN   \n",
       "\n",
       "                                          rowguid  \\\n",
       "ProductID                                           \n",
       "1          {694215B7-08F7-4C0D-ACB1-D734BA44C0C8}   \n",
       "2          {58AE3C20-4F3A-4749-A7D4-D568806CC537}   \n",
       "3          {9C21AED2-5BFA-4F18-BCB8-F11638DC2E4E}   \n",
       "4          {ECFED6CB-51FF-49B5-B06C-7D8AC834DB8B}   \n",
       "\n",
       "                            ModifiedDate  \n",
       "ProductID                                 \n",
       "1          2014-02-08 10:01:36.827000000  \n",
       "2          2014-02-08 10:01:36.827000000  \n",
       "3          2014-02-08 10:01:36.827000000  \n",
       "4          2014-02-08 10:01:36.827000000  \n",
       "\n",
       "[4 rows x 24 columns]"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod.head(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many rows are in the dataframe? Return the answer as an int."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "504"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod.shape[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many columns? Retrun the answer as an int."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "24"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod.shape[1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many different product lines are there?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod['ProductLine'].nunique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What are the values of these product lines?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([nan, 'R ', 'S ', 'M ', 'T '], dtype=object)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod['ProductLine'].unique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Do the number of values for the product lines match the number you have using `.nunique()`? Why or why not?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "# your answer here\n",
    "# pd.Series.nunique() does not count nulls, seen as nan in a np.ndarray."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Take the output from your previous answer (using `.unique()`). Select the label corresponding to the `Road` product line using list indexing notation. How many characters are in this string?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "R \n",
      "2\n"
     ]
    }
   ],
   "source": [
    "# your answer here\n",
    "print(prod['ProductLine'].unique()[1])\n",
    "print(len(prod['ProductLine'].unique()[1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Do you notice anything odd about this?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "# your answer here\n",
    "# There is trailing whitespace! The horror!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many products are there for the `Road` product line? Don't forget what you just worked on above! Return your answer as an int."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "100"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod[prod['ProductLine'] == 'R '].shape[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many products are there in the `Women's` `Mountain` category? Return your answer as an int. _Hint: Use the data dictionary above!_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "11"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod[(prod['ProductLine'] == 'M ') & (prod['Style'] == 'W ')].shape[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Challenge:** What are the top 3 _most expensive list price_ product that are either in the `Women's` `Mountain` category, _OR_ `Silver` in `Color`? Return your answer as a DataFrame object, with the `ProductID` index, `NewName` relabeled as `Name`, and `ListPrice` columns. Perform the statement in one execution, and do not mutate the source DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>ListPrice</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ProductID</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>774</th>\n",
       "      <td>Mountain-100 Silver, 48</td>\n",
       "      <td>3399.99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>771</th>\n",
       "      <td>Mountain-100 Silver, 38</td>\n",
       "      <td>3399.99</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>772</th>\n",
       "      <td>Mountain-100 Silver, 42</td>\n",
       "      <td>3399.99</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                              Name  ListPrice\n",
       "ProductID                                    \n",
       "774        Mountain-100 Silver, 48    3399.99\n",
       "771        Mountain-100 Silver, 38    3399.99\n",
       "772        Mountain-100 Silver, 42    3399.99"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# your answer here\n",
    "prod[\n",
    "    ((prod['ProductLine'] == 'M ') & (prod['Style'] == 'W ')) | \n",
    "    (prod['Color'] == 'Silver')].sort_values(\n",
    "        by='ListPrice', ascending=False).head(3)[['NewName', 'ListPrice']].rename(columns={'NewName': 'Name'})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recap\n",
    "\n",
    "We covered a lot of ground! It's ok if this takes a while to gel.\n",
    "\n",
    "```python\n",
    "\n",
    "# basic DataFrame operations\n",
    "df.head()\n",
    "df.tail()\n",
    "df.shape\n",
    "df.columns\n",
    "df.Index\n",
    "\n",
    "# selecting columns\n",
    "df.column_name\n",
    "df['column_name']\n",
    "\n",
    "# renaming columns\n",
    "df.rename({'old_name':'new_name'}, inplace=True)\n",
    "df.columns = ['new_column_a', 'new_column_b']\n",
    "\n",
    "# notable columns operations\n",
    "df.describe() # five number summary\n",
    "df['col1'].nunique() # number of unique values\n",
    "df['col1'].value_counts() # number of occurrences of each value in column\n",
    "\n",
    "# filtering\n",
    "df[ df['col1'] < 50 ] # filter column to be less than 50\n",
    "df[ (df['col1'] == value1) & (df['col2'] > value2) ] # filter column where col1 is equal to value1 AND col2 is greater to value 2\n",
    "\n",
    "# sorting\n",
    "df.sort_values(by='column_name', ascending = False) # sort biggest to smallest\n",
    "\n",
    "```\n",
    "\n",
    "\n",
    "It's common to refer back to your own code *all the time.* Don't hesistate to reference this guide! 🐼\n",
    "\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}