{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Pandas for EDA\n", "by [@josephofiowa](https://twitter.com/josephofiowa)\n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pandas Unit Lab\n", "\n", "**Woo!** We've made it to the end of our Pandas Unit. Let's put our skills to the test.\n", "\n", "We're going to explore data from some of the top movies according to IMDB. This is a guided question-and-response lab: some prompts are specific asks, and others are open-ended for you to explore.\n", "\n", "#### Important!!!\n", "- **There are two ways to do this lab!**\n", " - The first way is to read in a dataset that _has already been pulled from the API and cleaned for you_ (`movies_rated.csv`). This is the recommended 'first-pass' way to do this lab.\n", " - _After you have completed the lab using the supplied_ `movies_rated.csv`, you can call the API yourself!\n", " - Calling the API yourself takes time! Be prepared to parse lots of JSON, read docs, etc. 
Consider this an optional take-home exercise.\n", "\n", "In this lab, we will:\n", "- Use `movie_app.py` to obtain relevant movie rating data\n", "- Leverage Pandas to conduct exploratory data analysis, including:\n", " - Assess data integrity\n", " - Create exploratory visualizations\n", " - Produce insights on top actors/actresses across films\n", " \n", "Let's get going!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Dataset\n", "\n", "We'll work with a dataset on the top [IMDB movies](https://www.imdb.com/search/title?count=100&groups=top_1000&sort=user_rating), as rated by IMDB.\n", "\n", "\n", "Specifically, we have a CSV that contains:\n", "- IMDB star rating\n", "- Movie title\n", "- Year\n", "- Content rating\n", "- Genre\n", "- Duration\n", "- Gross\n", "\n", "_[Details available at the above link]_\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import our necessary libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import re\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Read in the dataset\n", "\n", "First, read the dataset, `movies.csv`, into a DataFrame called `movies`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "movies = pd.read_csv('../data/movies.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check the dataset basics\n", "\n", "Let's first explore our dataset to verify we have what we expect." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print the first five rows." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>The Shawshank Redemption</td><td>1994</td><td>R</td><td>Drama</td><td>142</td><td>1963330</td></tr>\n",
"    <tr><th>1</th><td>The Godfather</td><td>1972</td><td>R</td><td>Crime</td><td>175</td><td>28341469</td></tr>\n",
"    <tr><th>2</th><td>The Dark Knight</td><td>2008</td><td>PG-13</td><td>Action</td><td>152</td><td>1344258</td></tr>\n",
"    <tr><th>3</th><td>The Godfather: Part II</td><td>1974</td><td>R</td><td>Crime</td><td>202</td><td>134966411</td></tr>\n",
"    <tr><th>4</th><td>Pulp Fiction</td><td>1994</td><td>R</td><td>Crime</td><td>154</td><td>1935047</td></tr>\n",
"  </tbody>\n",
"</table>\n" ], "text/plain": [ " title year content_rating genre duration \\\n", "0 The Shawshank Redemption 1994 R Drama 142 \n", "1 The Godfather 1972 R Crime 175 \n", "2 The Dark Knight 2008 PG-13 Action 152 \n", "3 The Godfather: Part II 1974 R Crime 202 \n", "4 Pulp Fiction 1994 R Crime 154 \n", "\n", " gross \n", "0 1963330 \n", "1 28341469 \n", "2 1344258 \n", "3 134966411 \n", "4 1935047 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How many rows and columns are in the dataset?" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(79, 6)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What are the column names?" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['title', 'year', 'content_rating', 'genre', 'duration', 'gross'], dtype='object')\n" ] } ], "source": [ "print(movies.columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How many unique genres are there?" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies['genre'].nunique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How many movies are there per genre?" 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Crime 16\n", "Drama 14\n", "Action 11\n", "Adventure 9\n", "Drama 7\n", "Biography 5\n", "Animation 5\n", "Comedy 4\n", "Western 3\n", "Mystery 2\n", "Horror 2\n", "Comedy 1\n", "Name: genre, dtype: int64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies['genre'].value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Only run the below cells if you've obtained an [API key!](http://www.omdbapi.com/apikey.aspx)
Otherwise, proceed to the `Importing movies_rated.csv` section below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Obtain more data (with an API call)!\n", "\n", "- Let's take advantage of our `OmdbAPI` module (stored in `./OmdbAPI.py`, if you'd like to look under the hood) to obtain movie ratings data from the OMDb API. This will enable us to answer the question: **How do other publications' scores compare to IMDB ratings?** Specifically, where do Rotten Tomatoes critics most disagree with IMDB reviews? \n", "- Using the OmdbAPI module, we will obtain the `Internet Movie Database`, `Rotten Tomatoes`, and `Metacritic` ratings for the top-rated IMDB movies. We will store these ratings in new columns in a new `movies_rated` DataFrame. We have also stored the file locally at `./data/movies_rated.csv`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "import OmdbAPI" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# replace e54ad9e7 with your API key\n", "# this may take a minute\n", "movies_rated = OmdbAPI.Omdb(movies, 'e54ad9e7').get_ratings()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>The Shawshank Redemption</td><td>1994</td><td>R</td><td>Drama</td><td>142</td><td>1963330</td><td>9.3/10</td><td>91%</td><td>80/100</td></tr>\n",
"    <tr><th>1</th><td>The Godfather</td><td>1972</td><td>R</td><td>Crime</td><td>175</td><td>28341469</td><td>9.2/10</td><td>98%</td><td>100/100</td></tr>\n",
"    <tr><th>2</th><td>The Dark Knight</td><td>2008</td><td>PG-13</td><td>Action</td><td>152</td><td>1344258</td><td>9.0/10</td><td>94%</td><td>82/100</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " title year content_rating genre duration \\\n", "0 The Shawshank Redemption 1994 R Drama 142 \n", "1 The Godfather 1972 R Crime 175 \n", "2 The Dark Knight 2008 PG-13 Action 152 \n", "\n", " gross Internet Movie Database Rotten Tomatoes Metacritic \n", "0 1963330 9.3/10 91% 80/100 \n", "1 28341469 9.2/10 98% 100/100 \n", "2 1344258 9.0/10 94% 82/100 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just in case there were movies that the API was unable to get, let's drop nulls." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "movies_rated.dropna(inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get the ratings in the same float format using an apply function with some regular expressions. Note the use of .copy() when writing and reading from the same dataframe as a best practice." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>The Shawshank Redemption</td><td>1994</td><td>R</td><td>Drama</td><td>142</td><td>1963330</td><td>9.3</td><td>9.1</td><td>8.0</td></tr>\n",
"    <tr><th>1</th><td>The Godfather</td><td>1972</td><td>R</td><td>Crime</td><td>175</td><td>28341469</td><td>9.2</td><td>9.8</td><td>10.0</td></tr>\n",
"    <tr><th>2</th><td>The Dark Knight</td><td>2008</td><td>PG-13</td><td>Action</td><td>152</td><td>1344258</td><td>9.0</td><td>9.4</td><td>8.2</td></tr>\n",
"  </tbody>\n",
"</table>\n" ], "text/plain": [ " title year content_rating genre duration \\\n", "0 The Shawshank Redemption 1994 R Drama 142 \n", "1 The Godfather 1972 R Crime 175 \n", "2 The Dark Knight 2008 PG-13 Action 152 \n", "\n", " gross Internet Movie Database Rotten Tomatoes Metacritic \n", "0 1963330 9.3 9.1 8.0 \n", "1 28341469 9.2 9.8 10.0 \n", "2 1344258 9.0 9.4 8.2 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Put every rating on the same 0-10 float scale.\n", "# Raw strings (r'...') keep the regex backslashes from being read as string escapes.\n", "# Rotten Tomatoes: '91%' -> 9.1\n", "movies_rated['Rotten Tomatoes'] = movies_rated['Rotten Tomatoes'].copy().apply(lambda x: float(re.match(r'\\d+', x)[0])/10)\n", "# IMDB: '9.3/10' -> 9.3\n", "movies_rated['Internet Movie Database'] = movies_rated['Internet Movie Database'].copy().apply(lambda x: float(re.match(r'(\\S+)/', x)[1]))\n", "# Metacritic: '80/100' -> 8.0\n", "movies_rated['Metacritic'] = movies_rated['Metacritic'].copy().apply(lambda x: float(re.match(r'(\\S+)/', x)[1])/10)\n", "movies_rated.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's write the cleaned result to a local file so we don't have to call the API again and risk exceeding our daily limit." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "movies_rated.to_csv('./movies_rated.csv', index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing `movies_rated.csv`\n", "\n", "If you just called the API in the previous section, you can skip this and proceed to the `exploratory data analysis` section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's read in the cleaned, rated `movies_rated.csv` file, which was included with this repo just in case you couldn't call the API." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>The Shawshank Redemption</td><td>1994</td><td>R</td><td>Drama</td><td>142</td><td>1963330</td><td>9.3</td><td>9.1</td><td>8.0</td></tr>\n",
"    <tr><th>1</th><td>The Godfather</td><td>1972</td><td>R</td><td>Crime</td><td>175</td><td>28341469</td><td>9.2</td><td>9.8</td><td>10.0</td></tr>\n",
"    <tr><th>2</th><td>The Dark Knight</td><td>2008</td><td>PG-13</td><td>Action</td><td>152</td><td>1344258</td><td>9.0</td><td>9.4</td><td>8.2</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " title year content_rating genre duration \\\n", "0 The Shawshank Redemption 1994 R Drama 142 \n", "1 The Godfather 1972 R Crime 175 \n", "2 The Dark Knight 2008 PG-13 Action 152 \n", "\n", " gross Internet Movie Database Rotten Tomatoes Metacritic \n", "0 1963330 9.3 9.1 8.0 \n", "1 28341469 9.2 9.8 10.0 \n", "2 1344258 9.0 9.4 8.2 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated = pd.read_csv('../data/movies_rated.csv')\n", "movies_rated.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check our datatypes. Notice anything potentially problematic?" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "title object\n", "year int64\n", "content_rating object\n", "genre object\n", "duration int64\n", "gross int64\n", "Internet Movie Database float64\n", "Rotten Tomatoes float64\n", "Metacritic float64\n", "dtype: object" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated.dtypes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory data analysis\n", "\n", "Let's transition to asking and answering some questions with our data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What are the top five R-Rated movies?\n", "\n", "*hint: Boolean filters needed! Then sorting!*" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>The Shawshank Redemption</td><td>1994</td><td>R</td><td>Drama</td><td>142</td><td>1963330</td><td>9.3</td><td>9.1</td><td>8.0</td></tr>\n",
"    <tr><th>1</th><td>The Godfather</td><td>1972</td><td>R</td><td>Crime</td><td>175</td><td>28341469</td><td>9.2</td><td>9.8</td><td>10.0</td></tr>\n",
"    <tr><th>3</th><td>The Godfather: Part II</td><td>1974</td><td>R</td><td>Crime</td><td>202</td><td>134966411</td><td>9.0</td><td>9.7</td><td>9.0</td></tr>\n",
"    <tr><th>5</th><td>Schindler's List</td><td>1993</td><td>R</td><td>Biography</td><td>195</td><td>534858444</td><td>8.9</td><td>9.7</td><td>9.3</td></tr>\n",
"    <tr><th>7</th><td>The Good, the Bad and the Ugly</td><td>1966</td><td>R</td><td>Western</td><td>178</td><td>57300000</td><td>8.9</td><td>9.7</td><td>9.0</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " title year content_rating genre \\\n", "0 The Shawshank Redemption 1994 R Drama \n", "1 The Godfather 1972 R Crime \n", "3 The Godfather: Part II 1974 R Crime \n", "5 Schindler's List 1993 R Biography \n", "7 The Good, the Bad and the Ugly 1966 R Western \n", "\n", " duration gross Internet Movie Database Rotten Tomatoes Metacritic \n", "0 142 1963330 9.3 9.1 8.0 \n", "1 175 28341469 9.2 9.8 10.0 \n", "3 202 134966411 9.0 9.7 9.0 \n", "5 195 534858444 8.9 9.7 9.3 \n", "7 178 57300000 8.9 9.7 9.0 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated[movies_rated.content_rating == 'R'].sort_values(by='Internet Movie Database', ascending=False).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the average Rotten Tomato score for the top IMDB films?" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9.087341772151897" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated['Rotten Tomatoes'].mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the Five Number Summary like for top rated films as per IMDB? Is it skewed?" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count 79.000000\n", "mean 8.537975\n", "std 0.222056\n", "min 8.300000\n", "25% 8.400000\n", "50% 8.500000\n", "75% 8.600000\n", "max 9.300000\n", "Name: Internet Movie Database, dtype: float64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated['Internet Movie Database'].describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The average is *slightly* higher than the median, so there's a small positive skew." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create your own question...then answer it!" 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>year</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>year</th><td>1.000000</td><td>0.145930</td><td>-0.107644</td><td>-0.044124</td><td>-0.479430</td><td>-0.487070</td></tr>\n",
"    <tr><th>duration</th><td>0.145930</td><td>1.000000</td><td>0.098006</td><td>0.416829</td><td>-0.088653</td><td>-0.020531</td></tr>\n",
"    <tr><th>gross</th><td>-0.107644</td><td>0.098006</td><td>1.000000</td><td>0.146099</td><td>-0.019891</td><td>-0.038350</td></tr>\n",
"    <tr><th>Internet Movie Database</th><td>-0.044124</td><td>0.416829</td><td>0.146099</td><td>1.000000</td><td>0.062015</td><td>0.261009</td></tr>\n",
"    <tr><th>Rotten Tomatoes</th><td>-0.479430</td><td>-0.088653</td><td>-0.019891</td><td>0.062015</td><td>1.000000</td><td>0.765957</td></tr>\n",
"    <tr><th>Metacritic</th><td>-0.487070</td><td>-0.020531</td><td>-0.038350</td><td>0.261009</td><td>0.765957</td><td>1.000000</td></tr>\n",
"  </tbody>\n",
"</table>\n" ], "text/plain": [ " year duration gross \\\n", "year 1.000000 0.145930 -0.107644 \n", "duration 0.145930 1.000000 0.098006 \n", "gross -0.107644 0.098006 1.000000 \n", "Internet Movie Database -0.044124 0.416829 0.146099 \n", "Rotten Tomatoes -0.479430 -0.088653 -0.019891 \n", "Metacritic -0.487070 -0.020531 -0.038350 \n", "\n", " Internet Movie Database Rotten Tomatoes Metacritic \n", "year -0.044124 -0.479430 -0.487070 \n", "duration 0.416829 -0.088653 -0.020531 \n", "gross 0.146099 -0.019891 -0.038350 \n", "Internet Movie Database 1.000000 0.062015 0.261009 \n", "Rotten Tomatoes 0.062015 1.000000 0.765957 \n", "Metacritic 0.261009 0.765957 1.000000 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Question: how strongly do IMDB ratings correlate with Rotten Tomatoes ratings?\n", "# Select the numeric columns first: corr() on the full frame fails in pandas 2.x because of the text columns.\n", "movies_rated.select_dtypes('number').corr()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Challenge:** Create a DataFrame containing the ratio of each film's IMDB rating to its Rotten Tomatoes rating. Which film has the highest IMDB : Rotten Tomatoes ratio? The lowest?\n", "\n", "*[skip this if you are low on time]*" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>Ratings Ratio</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1.021978</td></tr>\n",
"    <tr><th>1</th><td>0.938776</td></tr>\n",
"    <tr><th>2</th><td>0.957447</td></tr>\n",
"  </tbody>\n",
"</table>\n" ], "text/plain": [ " Ratings Ratio\n", "0 1.021978\n", "1 0.938776\n", "2 0.957447" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# IMDB rating divided by Rotten Tomatoes rating; values above 1 mean IMDB rated the film higher\n", "rr = pd.DataFrame(movies_rated['Internet Movie Database'] / movies_rated['Rotten Tomatoes'], columns=['Ratings Ratio'])\n", "rr.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top 3 ratings ratio movies (rated higher on IMDB than on Rotten Tomatoes)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th><th>Ratings Ratio</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>12</th><td>Forrest Gump</td><td>1994</td><td>PG-13</td><td>Drama</td><td>142</td><td>1401164</td><td>8.8</td><td>7.2</td><td>8.2</td><td>1.222222</td></tr>\n",
"    <tr><th>19</th><td>Interstellar</td><td>2014</td><td>PG-13</td><td>Adventure</td><td>169</td><td>315544750</td><td>8.6</td><td>7.1</td><td>7.4</td><td>1.211268</td></tr>\n",
"    <tr><th>42</th><td>The Intouchables</td><td>2011</td><td>R</td><td>Biography</td><td>112</td><td>1059654</td><td>8.5</td><td>7.4</td><td>5.7</td><td>1.148649</td></tr>\n",
"  </tbody>\n",
"</table>\n" ], "text/plain": [ " title year content_rating genre duration gross \\\n", "12 Forrest Gump 1994 PG-13 Drama 142 1401164 \n", "19 Interstellar 2014 PG-13 Adventure 169 315544750 \n", "42 The Intouchables 2011 R Biography 112 1059654 \n", "\n", " Internet Movie Database Rotten Tomatoes Metacritic Ratings Ratio \n", "12 8.8 7.2 8.2 1.222222 \n", "19 8.6 7.1 7.4 1.211268 \n", "42 8.5 7.4 5.7 1.148649 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated.merge(rr, left_index=True, right_index=True).sort_values('Ratings Ratio', ascending=False).head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bottom 3 ratings ratio movies (rated lower on IMDB than on Rotten Tomatoes)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th><th>Ratings Ratio</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>66</th><td>Toy Story 3</td><td>2010</td><td>R</td><td>Animation</td><td>103</td><td>499468</td><td>8.3</td><td>9.9</td><td>9.2</td><td>0.838384</td></tr>\n",
"    <tr><th>74</th><td>L.A. Confidential</td><td>1997</td><td>R</td><td>Crime</td><td>138</td><td>13182281</td><td>8.3</td><td>9.9</td><td>9.0</td><td>0.838384</td></tr>\n",
"    <tr><th>63</th><td>Toy Story</td><td>1995</td><td>R</td><td>Animation</td><td>81</td><td>83471511</td><td>8.3</td><td>10.0</td><td>9.5</td><td>0.830000</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " title year content_rating genre duration gross \\\n", "66 Toy Story 3 2010 R Animation 103 499468 \n", "74 L.A. Confidential 1997 R Crime 138 13182281 \n", "63 Toy Story 1995 R Animation 81 83471511 \n", "\n", " Internet Movie Database Rotten Tomatoes Metacritic Ratings Ratio \n", "66 8.3 9.9 9.2 0.838384 \n", "74 8.3 9.9 9.0 0.838384 \n", "63 8.3 10.0 9.5 0.830000 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies_rated.merge(rr, left_index=True, right_index=True).sort_values('Ratings Ratio', ascending=False).tail(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory data analysis with visualizations\n", "\n", "For each of these prompts, create a plot to visualize the answer. Consider what plot is *most appropriate* to explore the given prompt.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the relationship between IMDB ratings and Rotten Tomato ratings?" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "movies_rated.plot(kind='scatter', x='Internet Movie Database', y='Rotten Tomatoes')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What is the relationship between IMDB rating and movie duration?" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "movies_rated.plot(kind='scatter', x='duration', y='Internet Movie Database')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How many movies are there in each genre category? (Remember to create a plot here)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "movies_rated['genre'].value_counts().plot(kind='bar', color='dodgerblue')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What does the distribution of Rotten Tomatoes ratings look like?" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAD8CAYAAAB6paOMAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAEKhJREFUeJzt3XuQJWV9xvHvIxhhjQrI4AVYF1IUaigpcbRQEzQihoiCGpNAaQLeNpYaL0lVXJOUmlSlgolRY0xFV0URFRW8oYC64oWkSsAFURcWgwoigrJKSrxFRH/54/TqOMwyPZdzembe76dq6nT36en393Kmefbt7tOdqkKS1K47DV2AJGlYBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcbsPXUAf++67b23YsGHoMiRpVbn00ku/W1VT8623KoJgw4YNbN26degyJGlVSfKNPut5aEiSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhq3Kr5ZLEnjsGHTucu6vWtPPW5ZtzcpjggkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGjS0IkpyW5KYk22Ys+5ckVyX5UpIPJtlrXO1LkvoZ54jg7cCxs5ZtAQ6rqgcB/wO8bIztS5J6GFsQVNWFwM2zln2iqm7rZi8CDhhX+5KkfoY8R/BM4PwB25ckMVAQJPlb4DbgXXewzsYkW5Ns3bFjx+SKk6TGTDwIkpwMPAF4WlXVrtarqs1VNV1V01NTU5MrUJIaM9EnlCU5Fngp8Kiq+vEk25YkzW2cl4+eCXwOODTJ9UmeBbwBuBuwJcnlSd44rvYlSf2MbURQVSfNsfit42pPkrQ4frNYkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklq3ETvPiqpHRs2nbvs27z21OOWfZtyRCBJzTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDVubEGQ5LQkNyXZNmPZPkm2JLm6e917XO1LkvoZ54jg7cCxs5ZtAi6oqkOAC7p5SdKAxhYEVXUhcPOsxScAp3fTpwNPGlf7kqR+Jn2O4F5VdSNA97rfhNuXJM2yYp9HkGQjsBFg/fr1A1cjrX3jeH5Aa1brMxgmPSL4TpL7AHSvN+1qxaraXFXTVTU9NTU1sQIlqTWTDoJzgJO76ZOBD0+4fUnSLOO8fPRM4HPAoUmuT/Is4FTgmCRXA8d085KkAY3tHEFVnbSLt44eV5uSpIXzm8WS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjesVBEkOG3chkqRh9B0
RvDHJJUmel2SvsVYkSZqoXkFQVb8DPA04ENia5N1Jjllso0lekuSKJNuSnJlkj8VuS5K0NL3PEVTV1cDfAS8FHgW8PslVSZ6ykAaT7A+8EJiuqsOA3YATF7INSdLy6XuO4EFJXgtsBx4DPLGqHtBNv3YR7e4O7Jlkd2AdcMMitiFJWgZ9RwRvAC4DDq+q51fVZQBVdQOjUUJvVfUt4NXAdcCNwPer6hOz10uyMcnWJFt37NixkCYkSQvQNwgeD7y7qn4CkOROSdYBVNUZC2kwyd7ACcBBwH2BuyZ5+uz1qmpzVU1X1fTU1NRCmpAkLUDfIPgksOeM+XXdssV4LHBNVe2oqp8BHwAeschtSZKWqG8Q7FFVP9w5002vW2Sb1wFHJlmXJMDRjM49SJIG0DcIfpTkiJ0zSR4C/GQxDVbVxcDZjM45fLmrYfNitiVJWrrde673YuCsJDuv7rkP8CeLbbSqXgG8YrG/L0laPr2CoKo+n+T+wKFAgKu64/uSpFWu74gA4KHAhu53HpyEqnrHWKqSJE1MryBIcgbwW8DlwM+7xQUYBJK0yvUdEUwDD6yqGmcxkqTJ63vV0Dbg3uMsRJI0jL4jgn2BK5NcAvx058KqOn4sVUmSJqZvELxynEVIkobT9/LRzya5H3BIVX2yu8/QbuMtTZI0CX1vQ/0cRt8GflO3aH/gQ+MqSpI0OX1PFj8feCRwC/zyITX7jasoSdLk9D1H8NOqunV0jzjoHijjpaSSJmrDpnOHLmFN6jsi+GySv2H0VLFjgLOAj4yvLEnSpPQNgk3ADkZ3C/1z4DwW+GQySdLK1PeqoV8Ab+5+JElrSN97DV3DHOcEqurgZa9IkjRRC7nX0E57AH8E7LP85UiSJq3XOYKq+t6Mn29V1euAx4y5NknSBPQ9NHTEjNk7MRoh3G0sFUmSJqrvoaF/nTF9G3At8MfLXo0kaeL6XjX0e+MuRJI0jL6Hhv7yjt6vqtcsTzmSpElbyFVDDwXO6eafCFwIfHMcRUmSJmchD6Y5oqp+AJDklcBZVfXscRUmSZqMvreYWA/cOmP+VmDDYhtNsleSs5NclWR7kocvdluSpKXpOyI4A7gkyQcZfcP4ycA7ltDuvwEfq6qnJvkNYN0StiVJWoK+Vw39Y5Lzgd/tFj2jqr6wmAaT3B04Cjil2/at/PpoQ5I0QX1HBDD6V/stVfW2JFNJDqqqaxbR5sGM7mT6tiSHA5cCL6qqH81cKclGYCPA+vXrF9GMtHKM4z7615563LJvU23q+6jKVwAvBV7WLboz8M5Ftrk7cATwn1X1YOBHjG5z/WuqanNVTVfV9NTU1CKbkiTNp+/J4icDxzP6nzZVdQOLv8XE9cD1VXVxN382o2CQJA2gbxDcWlVFdyvqJHddbINV9W3gm0kO7RYdDVy52O1Jkpam7zmC9yV5E7BXkucAz2RpD6n5C+Bd3RVDXweesYRtSZKWoO9VQ6/unlV8C3Ao8PKq2rLYRqvqcn79GQeSpIHMGwRJdgM+XlWPBRb9P39J0so07zmCqvo58OMk95hAPZKkCet7juD/gC8n2UJ35RBAVb1wLFVJkiambxCc2/1IktaYOwyCJOur6rqqOn1SBUmSJmu+cwQf2jmR5P1jrkWSNID5giAzpg8eZyGSpGHMFwS1i2lJ0hox38niw5PcwmhksGc3TTdfVXX3sVYnSRq7OwyCqtptUoVIkobR96ZzkqQ1yiCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaN1gQJNktyReSfHSoGiRJw44IXgRsH7B9SRIDBUGSA4DjgLcM0b4k6VeGGhG8Dvhr4BcDtS9J6sz3hLJll+QJwE1VdWmSR9/BehuBjQDr16+fUHWay4ZN5y77Nq899bhl32ZrxvG5qE1DjAgeCRyf5FrgPcBjkrxz9kpVtbmqpqtqempqatI1SlIzJh4EVfWyqjqgqjYAJwK
fqqqnT7oOSdKI3yOQpMZN/BzBTFX1GeAzQ9YgSa1zRCBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDVu0HsNSctlue/N7/MS1BJHBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMmHgRJDkzy6STbk1yR5EWTrkGS9CtD3HTuNuCvquqyJHcDLk2ypaquHKAWSWrexEcEVXVjVV3WTf8A2A7sP+k6JEkjg54jSLIBeDBw8ZB1SFLLBnseQZLfBN4PvLiqbpnj/Y3ARoD169cvup3lvk+92uDfjVoyyIggyZ0ZhcC7quoDc61TVZurarqqpqempiZboCQ1ZIirhgK8FdheVa+ZdPuSpF83xIjgkcCfAo9Jcnn38/gB6pAkMcA5gqr6byCTbleSNDe/WSxJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDVusOcRqG3e719aORwRSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjRskCJIcm+QrSb6aZNMQNUiSRiYeBEl2A/4D+APggcBJSR446TokSSNDjAgeBny1qr5eVbcC7wFOGKAOSRLDBMH+wDdnzF/fLZMkDWCI5xFkjmV1u5WSjcDGbvaHSb4ya5V9ge8uc21DWmv9gbXXp7XWH1h7fVpr/SGvWlKf7tdnpSGC4HrgwBnzBwA3zF6pqjYDm3e1kSRbq2p6+csbxlrrD6y9Pq21/sDa69Na6w9Mpk9DHBr6PHBIkoOS/AZwInDOAHVIkhhgRFBVtyV5AfBxYDfgtKq6YtJ1SJJGBnlmcVWdB5y3xM3s8rDRKrXW+gNrr09rrT+w9vq01voDE+hTqm53nlaS1BBvMSFJjVvRQZDk0CSXz/i5JcmLZ62TJK/vblfxpSRHDFXvfHr259FJvj9jnZcPVW8fSV6S5Iok25KcmWSPWe/fJcl7u8/n4iQbhqm0vx59OiXJjhmf0bOHqrWPJC/q+nLF7L+37v1Vsw/t1KNPK34/SnJakpuSbJuxbJ8kW5Jc3b3uvYvfPblb5+okJy+5mKpaFT+MTix/G7jfrOWPB85n9P2EI4GLh651if15NPDRoevr2Yf9gWuAPbv59wGnzFrnecAbu+kTgfcOXfcy9OkU4A1D19qzP4cB24B1jM4JfhI4ZNY6q2of6tmnFb8fAUcBRwDbZiz7Z2BTN70JeNUcv7cP8PXude9ueu+l1LKiRwSzHA18raq+MWv5CcA7auQiYK8k95l8eQu2q/6sNrsDeybZndGOOfs7IScAp3fTZwNHJ5nrS4UryXx9Wk0eAFxUVT+uqtuAzwJPnrXOatuH+vRpxauqC4GbZy2eub+cDjxpjl/9fWBLVd1cVf8LbAGOXUotqykITgTOnGP5ar1lxa76A/DwJF9Mcn6S355kUQtRVd8CXg1cB9wIfL+qPjFrtV9+Pt1O+33gnpOscyF69gngD7vDKGcnOXCO91eKbcBRSe6ZZB2jf/3Prne17UN9+gSrZD+a5V5VdSNA97rfHOss++e1KoKg++LZ8cBZc709x7IVfSnUPP25jNHhosOBfwc+NMnaFqI7fnkCcBBwX+CuSZ4+e7U5fnXFfj49+/QRYENVPYjRYYnTWaGqajvwKkb/avwY8EXgtlmrrarPqGefVs1+tAjL/nmtiiBgdMvqy6rqO3O81+uWFSvMLvtTVbdU1Q+76fOAOyfZd9IF9vRY4Jqq2lFVPwM+ADxi1jq//Hy6Qy334PbD4ZVk3j5V1feq6qfd7JuBh0y4xgWpqrdW1RFVdRSj//ZXz1pl1e1D8/Vple1HM31n52G57vWmOdZZ9s9rtQTBSez6MMo5wJ91Vz4cyWgof+PkSluUXfYnyb13HkNP8jBGn9H3JljbQlwHHJlkXVf
z0cD2WeucA+y8quGpwKeqO+O1Qs3bp1nHz4+f/f5Kk2S/7nU98BRu/7e36vah+fq0yvajmWbuLycDH55jnY8Dj0uydzeCfVy3bPGGPnPe48z6OkYf4D1mLHsu8NxuOowedPM14MvA9NA1L7E/LwCuYDTcvQh4xNA1z9OfvweuYnTc9gzgLsA/AMd37+/B6BDYV4FLgIOHrnkZ+vRPMz6jTwP3H7rmefrzX8CVXb1Hz/E3t6r2oZ59WvH7EaPwuhH4GaN/5T+L0fmzCxiNcC4A9unWnQbeMuN3n9ntU18FnrHUWvxmsSQ1brUcGpIkjYlBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4/4fbTG3bcqrlZgAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "movies_rated['Rotten Tomatoes'].plot(kind='hist', bins=15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bonus\n", "\n", "There are many things left unexplored! Consider investigating something about gross revenue and genres." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEJCAYAAACZjSCSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAAERVJREFUeJzt3XuMZnV9x/H3h4tyUURkoBsuXTUEIaYKXdGW1lYQgzfARqzEGmKo2NRaCE0sElNskyY2qaJNWxXBuuIFBUSoteqK4CWxwi5QQcGCCLguZdcLAdRKwW//eM6aKe7unFn2PGee+b1fyeQ558x5nvPJZHc+c26/k6pCktSuncYOIEkal0UgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJatwuYwfoY999962VK1eOHUOSZsq6det+UFVzC603E0WwcuVK1q5dO3YMSZopSe7qs56HhiSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEzcWfxY/Hr796xn3fXGTv28yRpbO4RSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGDToMdZI7gQeAR4CHq2pVkn2AjwMrgTuBV1XVj4fMIUnaumnsEbygqp5dVau6+bOBq6rqEOCqbl6SNJIxDg2dCKzuplcDJ42QQZLUGboICvh8knVJTu+W7V9V9wB0r/sNnEGStA1DP6ry6KrakGQ/YE2SW/u+sSuO0wEOPvjgofJJUvMG3SOoqg3d60bgcuAo4N4kKwC6141bee/5VbWqqlbNzc0NGVOSmjZYESTZM8kTN08DLwJuBq4ETu1WOxW4YqgMkqSFDXloaH/g8iSbt/PRqvpskuuATyQ5DbgbOHnADJKkBQxWBFV1B/CsLSz/IXDsUNuVJC2OdxZLUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDVu8CJIsnOSG5J8upt/apKvJ7ktyceTPG7oDJKkrZvGHsEZwC3z5v8OOK+qDgF+DJw2hQySpK0YtAiSHAi8FLigmw9wDHBpt8pq4KQhM0iStm3oPYJ3AW8GftHNPwW4r6oe7ubXAwds6Y1JTk+yNsnaTZs2DRxTkto1WBEkeRmwsarWzV+8hVVrS++vqvOralVVrZqbmxskoyQJdhnws48GTkjyEmA3YC8mewh7J9ml2ys4ENgwYAZJ0gIG2yOoqrdU1YFVtRJ4NfDFqnoNcDXwym61U4ErhsogSVrYGPcR/CV
wVpLbmZwzuHCEDJKkzpCHhn6pqq4Brumm7wCOmsZ2JUkL885iSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDWuVxEkeebQQSRJ4+i7R/DeJNcm+dMkew+aSJI0Vb2KoKp+B3gNcBCwNslHkxw3aDJJ0lT0PkdQVbcBb2XyYJnfA/4hya1J/mCocJKk4fU9R/AbSc4DbgGOAV5eVYd10+cNmE+SNLC+Tyj7R+D9wDlV9bPNC6tqQ5K3DpJMkjQVfYvgJcDPquoRgCQ7AbtV1U+r6qLB0kmSBtf3HMEXgN3nze/RLZMkzbi+RbBbVT24eaab3mOYSJKkaepbBD9JcuTmmSS/CfxsG+tLkmZE33MEZwKXJNnQza8A/nCYSJKkaepVBFV1XZJnAIcCAW6tqv8dNJkkaSr67hEAPAdY2b3niCRU1YcGSSVJmppeRZDkIuDpwI3AI93iAiwCSZpxffcIVgGHV1UNGUaSNH19rxq6Gfi1IYNIksbRd49gX+BbSa4Ffr55YVWdsLU3JNkN+DLw+G47l1bVuUmeClwM7ANcD7y2qh7azvySpMeobxG8bTs+++fAMVX1YJJdga8m+XfgLOC8qro4yXuB04D3bMfnS5J2gL7PI/gScCewazd9HZO/5rf1npp3N/Ku3VcxGbH00m75auCkxceWJO0ofYehfj2TX97v6xYdAHyqx/t2TnIjsBFYA3wHuK+qHu5WWd99liRpJH1PFr8ROBq4H375kJr9FnpTVT1SVc8GDgSOAg7b0mpbem+S05OsTbJ206ZNPWNKkharbxH8fP4J3SS7sJVf4FtSVfcB1wDPA/bu3g+TgtiwlfecX1WrqmrV3Nxc301JkhapbxF8Kck5wO7ds4ovAf51W29IMrf5QfdJdgdeyOQJZ1cDr+xWOxW4YnuCS5J2jL5FcDawCbgJeAPwGSbPL96WFcDVSb7B5OTymqr6NJNnHp+V5HbgKcCF2xNckrRj9B107hdMHlX5/r4fXFXfAI7YwvI7mJwvkCQtAX3HGvouWzgnUFVP2+GJJElTtZixhjbbDTiZyZ3BkqQZ1/eGsh/O+/p+Vb2LyY1hkqQZ1/fQ0JHzZndisofwxEESSZKmqu+hoXfMm36YyXATr9rhaSRJU9f3qqEXDB1EkjSOvoeGztrW96vqnTsmjiRp2hZz1dBzgCu7+ZczedbA94YIJUmansU8mObIqnoAIMnbgEuq6o+HCiZJmo6+Q0wcDMx/ithDwModnkaSNHV99wguAq5NcjmTO4xfAXxosFSSpKnpe9XQ33aPmfzdbtHrquqG4WJJkqal76EhgD2A+6vq3cD67iH0kqQZ1/dRlecyGT76Ld2iXYEPDxVKkjQ9ffcIXgGcAPwEoKo24BATkrQs9C2Ch6qq6IaiTrLncJEkSdPUtwg+keR9TJ43/HrgCyziITWSpKWr71VDf989q/h+4FDgr6pqzaDJJElTsWARJNkZ+FxVvRDwl78kLTMLHhqqqkeAnyZ50hTySJKmrO+dxf8D3JRkDd2VQwBV9eeDpJIkTU3fIvi37kuStMxsswiSHFxVd1fV6mkFkiRN10LnCD61eSLJZQNnkSSNYKEiyLzppw0ZRJI0joWKoLYyLUlaJhY6WfysJPcz2TPYvZumm6+q2mvQdJKkwW2zCKpq52kFkSSNYzHPI5AkLUODFUGSg5JcneSWJN9Mcka3fJ8ka5Lc1r0+eagMkqSFDblH8DDwF1V1GPA84I1JDgfOBq6qqkOAq7p5SdJIBiuCqrqnqq7vph8AbgEOAE4ENt+gtho4aagMkqSFTeUcQZKVwBHA14H9q+oemJQFsN9W3nN6krVJ1m7atGkaMSWpSYMXQZInAJcBZ1bV/Qutv1lVnV9Vq6pq1dzc3HABJalxgxZBkl2ZlMBHquqT3eJ7k6zovr8C2DhkBknStg151VCAC4Fbquqd8751JXBqN30qcMV
QGSRJC+s7DPX2OBp4LZPnGNzYLTsHeDuTZyCfBtwNnDxgBknSAgYrgqr6Kv9/0Lr5jh1qu5KkxfHOYklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDVusCJI8oEkG5PcPG/ZPknWJLmte33yUNuXJPUz5B7BB4HjH7XsbOCqqjoEuKqblySNaLAiqKovAz961OITgdXd9GrgpKG2L0nqZ9rnCPavqnsAutf9prx9SdKjLNmTxUlOT7I2ydpNmzaNHUeSlq1pF8G9SVYAdK8bt7ZiVZ1fVauqatXc3NzUAkpSa6ZdBFcCp3bTpwJXTHn7kqRHGfLy0Y8BXwMOTbI+yWnA24HjktwGHNfNS5JGtMtQH1xVp2zlW8cOtU1J0uIt2ZPFkqTpsAgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxgw0xof5+/d079vPuOmPHfp6k5c09AklqnEUgSY2zCCSpcZ4j0II8hyEtb+4RSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOO8jWKQdfU39EGYh41LnvRNqiXsEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXGjXD6a5Hjg3cDOwAVV9fYxcmj5WOqXzA6Rr8VLUpf6Zb1LPd/WTH2PIMnOwD8BLwYOB05Jcvi0c0iSJsY4NHQUcHtV3VFVDwEXAyeOkEOSxDhFcADwvXnz67tlkqQRjHGOIFtYVr+yUnI6cHo3+2CSb2/n9vYFfrCd7x3TLObulTlnTiHJ4sziz5qcOZu5WUI/70X8Wxwl8w74v3Jon5XGKIL1wEHz5g8ENjx6pao6Hzj/sW4sydqqWvVYP2faZjH3LGYGc0/bLOaexcwwyd1nvTEODV0HHJLkqUkeB7wauHKEHJIkRtgjqKqHk/wZ8Dkml49+oKq+Oe0ckqSJUe4jqKrPAJ+Z0uYe8+Glkcxi7lnMDOaetlnMPYuZoWfuVP3KeVpJUkMcYkKSGresiyDJ8Um+neT2JGePnaePJB9IsjHJzWNn6SvJQUmuTnJLkm8mmYnBD5LsluTaJP/Z5f7rsTP1lWTnJDck+fTYWfpKcmeSm5Lc2PdqlqUgyd5JLk1ya/dv/LfGzrSQJId2P+fNX/cnW78YddkeGuqGsvgv4Dgml6xeB5xSVd8aNdgCkjwfeBD4UFU9c+w8fSRZAayoquuTPBFYB5w0Az/rAHtW1YNJdgW+CpxRVf8xcrQFJTkLWAXsVVUvGztPH0nuBFZV1ZK4h6CvJKuBr1TVBd2VjntU1X1j5+qr+134feC5VXXXltZZznsEMzmURVV9GfjR2DkWo6ruqarru+kHgFuYgbvFa+LBbnbX7mvJ/2WU5EDgpcAFY2dZ7pLsBTwfuBCgqh6apRLoHAt8Z2slAMu7CBzKYgRJVgJHAF8fN0k/3SGWG4GNwJqqmoXc7wLeDPxi7CCLVMDnk6zrRg6YBU8DNgH/0h2KuyDJnmOHWqRXAx/b1grLuQh6DWWhHSfJE4DLgDOr6v6x8/RRVY9U1bOZ3OF+VJIlfTguycuAjVW1buws2+HoqjqSycjDb+wOgy51uwBHAu+pqiOAnwAzcb4RoDuUdQJwybbWW85F0GsoC+0Y3TH2y4CPVNUnx86zWN3u/jXA8SNHWcjRwAnd8faLgWOSfHjcSP1U1YbudSNwOZPDt0vdemD9vD3FS5kUw6x4MXB9Vd27rZWWcxE4lMWUdCddLwRuqap3jp2nryRzSfbupncHXgjcOm6qbauqt1TVgVW1ksm/6S9W1R+NHGtBSfbsLiSgO7TyImDJXxlXVf8NfC/J5sHbjgWW9EUQj3IKCxwWgpHuLJ6GWR3KIsn
HgN8H9k2yHji3qi4cN9WCjgZeC9zUHW8HOKe7g3wpWwGs7q6q2An4RFXNzOWYM2Z/4PLJ3wzsAny0qj47bqTe3gR8pPuD8g7gdSPn6SXJHkyumnzDgusu18tHJUn9LOdDQ5KkHiwCSWqcRSBJjbMIJKlxFoEkLTGLGXwyycHdoI83JPlGkpcsdnsWgSQtPR+k/82Nb2Vy6fMRTO4t+efFbswikKQlZkuDTyZ5epLPdmM1fSXJMzavDuzVTT+J7RhBYdneUCZJy8z5wJ9U1W1JnsvkL/9jgLcxGczvTcCeTO6QXxSLQJKWuG5Ax98GLunuzgZ4fPd6CvDBqnpH99Cci5I8s6p6j05rEUjS0rcTcF83Uu6jnUZ3PqGqvpZkN2BfJkOr9/5wSdIS1g3r/t0kJ8NkoMckz+q+fTeTwfBIchiwG5NnKPTmWEOStMTMH3wSuBc4F/gi8B4mgyXuClxcVX+T5HDg/cATmJw4fnNVfX5R27MIJKltHhqSpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNe7/ALs5arW5iNykAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "movies_rated['gross'].plot(kind='hist', bins=15, color='dodgerblue')" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>title</th><th>year</th><th>content_rating</th><th>genre</th><th>duration</th><th>gross</th><th>Internet Movie Database</th><th>Rotten Tomatoes</th><th>Metacritic</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>17</th><td>One Flew Over the Cuckoo's Nest</td><td>1975</td><td>R</td><td>Drama</td><td>133</td><td>665845272</td><td>8.7</td><td>9.4</td><td>8.0</td></tr>\n",
"    <tr><th>5</th><td>Schindler's List</td><td>1993</td><td>R</td><td>Biography</td><td>195</td><td>534858444</td><td>8.9</td><td>9.7</td><td>9.3</td></tr>\n",
"    <tr><th>13</th><td>Fight Club</td><td>1999</td><td>R</td><td>Drama</td><td>139</td><td>377845905</td><td>8.8</td><td>7.9</td><td>6.6</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " title year content_rating genre \\\n", "17 One Flew Over the Cuckoo's Nest 1975 R Drama \n", "5 Schindler's List 1993 R Biography \n", "13 Fight Club 1999 R Drama \n", "\n", " duration gross Internet Movie Database Rotten Tomatoes Metacritic \n", "17 133 665845272 8.7 9.4 8.0 \n", "5 195 534858444 8.9 9.7 9.3 \n", "13 139 377845905 8.8 7.9 6.6 " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# top 3 grossing films\n", "movies_rated.sort_values(by='gross', ascending=False).head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 2 }