<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Plotting with Pandas</title>
<meta name="description" content="">
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black-translucent" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
<!-- For syntax highlighting -->
<link rel="stylesheet" href="../../../../lib/css/zenburn.css">
<link rel="stylesheet" href="../../../../lib/css/prism.css">
<link rel="stylesheet" href="../../../../css/reveal.css">
<link rel="stylesheet" href="../../../../css/theme/ga-title.css" id="theme">
<!--[if lt IE 9]>
<script src="lib/js/html5shiv.js"></script>
<![endif]-->
<link rel="stylesheet" type="text/css" href="https://s3.amazonaws.com/python-ga/proxima-nova/fonts.css" />
</head>
<body class="language-javascript">
<div class="reveal">
<!-- Any section element inside of this container is displayed as a slide -->
<div class="slides">
<!--
---
title: Plotting with Pandas
type: lesson
duration: "1:00"
---
-->
<section id="section" class="level2 separator">
<h2><img src="http://nagale.com/ga-python/images/GA_Cog_Medium_White_RGB.png" /></h2>
<h1>
Plotting with Pandas
</h1>
<!--
## Overview
This lesson covers plotting with pandas, which serves as a wrapper for matplotlib.
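A minimal sketch of the wrapper relationship described above, assuming matplotlib and pandas are installed. The toy DataFrame and its column names are hypothetical and are not one of the lesson's datasets. The first chart is built with matplotlib directly; the second is a single pandas call that hands back a matplotlib Axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, not one of the lesson's datasets
df = pd.DataFrame(
    {"year": [2015, 2016, 2017, 2018], "price": [1.38, 1.34, 1.52, 1.35]}
)

# Plain matplotlib: create the figure and axes, then draw and label by hand
fig, ax_mpl = plt.subplots()
ax_mpl.plot(df["year"], df["price"])
ax_mpl.set_xlabel("year")
ax_mpl.set_ylabel("price")
ax_mpl.set_title("Price by year")

# pandas: one call that builds the same kind of chart through matplotlib
ax_pd = df.plot(x="year", y="price", title="Price by year")
```

Because `ax_pd` is an ordinary matplotlib Axes, any matplotlib method (for example `ax_pd.set_ylabel("price")`) can still be applied to fine-tune a chart that pandas created.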
## Learning Objectives
In this lesson, students will:
- Use pandas to plot from three different datasets
## Duration
60 minutes
## Suggested Agenda
| Time | Activity | Purpose |
|-------------|----------|---------|
| 0:00 - 0:03 | Welcome | |
| 0:03 - 0:15 | Slides | |
| 0:15 - 0:17 | NOTE: Switch to Notebook | |
| 0:17 - 0:25 | Line Plots | |
| 0:25 - 0:35 | Bar Plots and Histograms | |
| 0:35 - 0:44 | Scatter Plots | |
| 0:44 - 0:58 | Independent Exercises | |
| 0:58 - 1:00 | Summary | |
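The plot types in the agenda map onto the `kind` argument of pandas `.plot()` roughly as sketched here. This is an illustrative example only: the DataFrame, its column names, and its values are hypothetical stand-ins for the lesson's datasets. Each chart is given its own axes explicitly, since a pandas Series plot otherwise draws onto whichever axes is currently active.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in a notebook the charts render inline
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data standing in for the lesson's datasets
df = pd.DataFrame(
    {
        "year": [2015, 2016, 2017, 2018, 2019, 2020],
        "region": ["West", "West", "South", "South", "East", "East"],
        "price": [1.38, 1.34, 1.52, 1.35, 1.41, 1.29],
        "volume": [120, 135, 110, 140, 128, 150],
    }
)

# One figure with a 2x2 grid so each chart gets its own axes
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Line plot: a value over time
df.plot(x="year", y="price", kind="line", ax=axes[0, 0], title="Line")

# Bar plot: one bar per category, here the mean price per region
df.groupby("region")["price"].mean().plot(kind="bar", ax=axes[0, 1], title="Bar")

# Histogram: the distribution of a single numeric column
df["price"].plot(kind="hist", bins=5, ax=axes[1, 0], title="Histogram")

# Scatter plot: one continuous column against another
df.plot(x="volume", y="price", kind="scatter", ax=axes[1, 1], title="Scatter")

fig.tight_layout()
```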
## Materials and Preparation
- Send out the presentation link.
- Students will need the data sets and notebook. Consider having a zip file of all notebooks and data sets for the rest of the unit that you hand out at the beginning of this lesson. Alternatively, link them directly in GitHub - remember that they haven't learned GitHub, so you'll need to help them download the files.
## Differentiation and Extensions
- If students are excelling in the first half, consider deeper discussions surrounding five number summaries, data integrity, off-the-cuff filters and sorts
- If students are struggling, work on the code more heavily than the **Class Questions** portions. Make the Independent Exercises be Collective Exercises (as a class)
## In Class: Materials
- Projector
- Internet connection
- Jupyter Notebooks
- Python3
-->
<hr />
<aside class="notes">
<p><strong>Talking Points</strong>: This lesson introduces the Pandas library and the beginnings of Exploratory Data Analysis. The majority of the lesson should be spent going through code whether that is via Jupyter Slides or a Jupyter Notebook demonstration.</p>
<p>To present this content, begin with <code>04-plotting-with-pandas.ipynb</code> to introduce Pandas as a library and data integrity. Transition to the Jupyter Notebook to introduce reading in data, column manipulation, filtering and sorting; conclude with exercises.</p>
<p><strong>Teaching Tips</strong>:</p>
<ul>
<li>There are <strong>Class Questions</strong> throughout the notebook. Spend as much or as little time on them as your class's pace allows.</li>
<li>There is no <strong>Independent Exercise</strong> at the end of this lesson. Finding time for students to work entirely independently is aspirational, so consider a guided code-along or pair programming instead. Use this time to have students set their own challenges.</li>
<li>Pause after the learning objectives and level-set on what students will get out of the lesson.</li>
</ul>
</aside>
<hr />
</section>
<section id="a-note-on-delivery" class="level2">
<h2>A Note on Delivery</h2>
<ul>
<li>This unit's lessons will take place in <a href="http://jupyter.org/">Jupyter notebooks</a>
<ul>
<li>Slides will be an introduction to the lesson (no code, just overview)</li>
<li>Then, we'll open a notebook and start coding!</li>
</ul></li>
</ul>
<aside class="notes">
<p><strong>Teaching Tips</strong>:</p>
<ul>
<li>We could have made this into a speaker note, but it's helpful to get it out there so everybody's on the same page.</li>
<li>No repl.it for this unit, as we'll be in notebooks.</li>
</ul>
</aside>
<hr />
</section>
<section id="plotting-with-pandas" class="level2">
<h2>Plotting with Pandas</h2>
<ul>
<li>Pandas <code>.plot()</code> functionality is effectively a wrapper for <a href="https://matplotlib.org/">matplotlib</a></li>
<li>Matplotlib is a charting library for Python and scientific computing</li>
<li>It's considered the de facto standard for charting locally
<ul>
<li>It's best for scientific papers, EDA, and general introspection of data</li>
<li>It's not so great for production-level charts that are embedded in applications (check out <a href="https://d3js.org/">d3.js</a>)</li>
</ul></li>
</ul>
<aside class="notes">
<p><strong>Talking Points</strong>:</p>
<ul>
<li>Talk briefly about where charts are interpreted, and why different tools may be advantageous</li>
</ul>
</aside>
<hr />
</section>
<section id="so-pandas-and-matplotlib" class="level2">
<h2>So, Pandas and Matplotlib</h2>
<p>What's a wrapper?</p>
<ul>
<li>A program that <em>abstracts</em> another program to modify its interface</li>
</ul>
<p>In other words:</p>
<ul>
<li>Pandas <code>.plot()</code> functionality references matplotlib behind the scenes</li>
<li>Matplotlib has a reputation for being fairly complex
<ul>
<li>Even for fairly simple charts, you will frequently write loops</li>
<li>A fairly plain chart can be 20-30 lines of code</li>
</ul></li>
<li>Pandas helps us here: most charts can be produced with 1-2 lines of code
<ul>
<li>Some functionality is reduced, but <em>effort is minimized in most cases</em></li>
</ul></li>
</ul>
<aside class="notes">
<p><strong>Talking Points</strong>:</p>
<ul>
<li>Encourage students to learn matplotlib on their own time if they wish</li>
<li>Many data science shops use matplotlib as a standard
<ul>
<li>It's a bit older and a little hokey, but it's well supported, open source, and generally gets the job done</li>
</ul></li>
<li>Take some time to talk about the balance between package complexity and utility overall - sometimes a good answer delivered on time beats a perfect answer delivered late</li>
</ul>
</aside>
<hr />
</section>
<section id="talk-data-to-me" class="level2">
<h2>Talk Data to Me</h2>
<p>We'll be using three datasets for this lesson:</p>
<ul>
<li>Football Records: International football results from 1872 to 2018</li>
<li>Avocado Prices: Historical data on avocado prices and sales volume in multiple US markets</li>
<li>Chocolate Bar Ratings: Expert ratings of over 1,700 chocolate bars</li>
</ul>
<p>All datasets were downloaded from Kaggle.com. We'll discover that the right visualization, done properly, can often replace a bit of fancy machine learning.</p>
<aside class="notes">
<p><strong>Talking Points</strong>:</p>
<ul>
<li>We'll be walking through the datasets during the lesson. Feel free to refer to the links there if you wish.</li>
</ul>
</aside>
<hr />
</section>
<section id="chart-types" class="level2">
<h2>Chart Types</h2>
<p>We'll be covering the following chart types during this lesson:</p>
<ul>
<li>Time series line charts</li>
<li>Categorical bar charts</li>
<li>Histograms of single columns</li>
<li>Histograms of entire data frames</li>
<li>Scatter plots (continuous vs continuous)</li>
<li>Scatter matrices (multiple scatter plots in a grid)</li>
<li>Scatter plots with class colors for data points</li>
</ul>
<aside class="notes">
<p><strong>Teaching Tips</strong>:</p>
<ul>
<li>This is the tip of the iceberg for plots, and that's okay</li>
<li>Assure students that the above charts have been selected specifically to cover the majority of cases they'll encounter</li>
<li>Take a minute to talk about the common use cases for each of these, as well as the data types they all prefer</li>
</ul>
</aside>
<hr />
</section>
<section id="lets-go" class="level2">
<h2>Let's Go!</h2>
<ul>
<li>Open up your dataset!</li>
</ul>
<aside class="notes">
<p><strong>Teaching Tips</strong>:</p>
<ul>
<li>Have students assist one another, and walk around the room to ensure everyone gets to the notebook successfully.</li>
<li>Make sure all students can open and run their notebooks. It's only the second time they've done so!</li>
<li>The presentation is also at the top of the notebook, so students can refer back to everything in one place later. Jump down to <code>Importing Pandas</code>.</li>
</ul>
</aside>
<hr />
</section>
</div>
<footer><span class='slide-number'></span></footer>
</div>
<script src="../../../../lib/js/head.min.js"></script>
<script src="../../../../js/reveal.js"></script>
<script>
var dependencies = [
{ src: '../../../../lib/js/classList.js', condition: function() { return !document.body.classList; } },
{ src: '../../../../plugin/markdown/showdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: '../../../../plugin/markdown/markdown.js', condition: function() { return !!document.querySelector( '[data-markdown]' ); } },
{ src: '../../../../plugin/prism/prism.js', async: true, callback: function() { /*hljs.initHighlightingOnLoad();*/ } },
{ src: '../../../../plugin/zoom-js/zoom.js', async: true, condition: function() { return !!document.body.classList; } }
];
if (Reveal.getQueryHash().instructor === 1) {
dependencies.push({ src: '../../../../plugin/notes/notes.js', async: true, condition: function() { return !!document.body.classList; } });
}
// Full list of configuration options available here:
// https://github.com/hakimel/reveal.js#configuration
Reveal.initialize({
controls: true,
progress: true,
history: true,
center: false,
slideNumber: true,
// available themes are in /css/theme
theme: Reveal.getQueryHash().theme || 'default',
// default/cube/page/concave/zoom/linear/fade/none
transition: Reveal.getQueryHash().transition || 'slide',
// Optional libraries used to extend on reveal.js
dependencies: dependencies
});
if (Reveal.getQueryHash().instructor === 1) {
Reveal.configure(dependencies.push({ src: '../../../../plugin/notes/notes.js', async: true, condition: function() { return !!document.body.classList; } }));
}
Reveal.addEventListener('ready', function() {
if (Reveal.getCurrentSlide().classList.contains('separator-subhead')) {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga-subhead.css');
} else if (Reveal.getCurrentSlide().classList.contains('separator')) {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga-title.css')
} else {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga.css');
}
});
Reveal.addEventListener('slidechanged', function(e) {
if (Reveal.getCurrentSlide().classList.contains('separator-subhead')) {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga-subhead.css');
} else if (Reveal.getCurrentSlide().classList.contains('separator')) {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga-title.css')
} else {
document.getElementById('theme').setAttribute('href', '../../../../css/theme/ga.css');
}
});
</script>
</body>
</html>