You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

953 lines
27 KiB

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "1e738e0a-c8ba-4e3d-97d9-f622972990f9"
},
"source": [
"<!--\n",
"If you're creating new material, please share w the community! We'll review it and perform any maintenance. Simply:\n",
"1. Fork your new repo over to `data-staging`\n",
"2. Create an issue and tag `jeffboykin`\n",
"3. *Optional*: Comment on what you made, where you think it should go, or how it should be used!\n",
"\n",
"If you have questions or input on existing materials, get in touch: \n",
"1. Log an issue to this repo to alert us of a problem.\n",
"2. Suggest an edit yourself by forking this repo, making edits, and submitting a pull request with your changes back to our master branch.\n",
"3. Reach out to the data team on Slack and share your thoughts!\n",
"-->\n",
"\n",
"<img src=\"http://imgur.com/1ZcRyrc.png\" style=\"float: left; margin: 20px; height: 55px\">\n",
"\n",
"# Intro to SQLite\n",
"_Author(s): Matt Speck (DC)_"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "e81b4b8b-b40b-4571-9a3f-0d56dbf49978"
},
"source": [
"### LEARNING OBJECTIVES\n",
"*After this lesson, you will be able to:*\n",
"- Connect to a local or remote database using Python or Pandas\n",
"- Perform queries using SELECT\n",
"- Perform simple aggregations COUNT, MAX/MIN/SUM"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "92796084-2dac-40c1-80b6-a343a4daf210"
},
"source": [
"### LESSON GUIDE\n",
"\n",
"- [Opening: Review basic SQL commands](#opening)\n",
"- [Interacting with SQLite from Python](#demo)\n",
" - [Adding data](#adding_data)\n",
" - [Adding data from a .csv file](#csv)\n",
" - [Pandas connector](#pandas)\n",
" - [Writing data into a database](#writing)\n",
" - [Reading data from a database](#reading)\n",
"- [SQL Syntax](#syntax)\n",
" - [SELECT](#select)\n",
" - [WHERE](#where)\n",
" - [Aggregations](#aggregations)\n",
"- [Querying a Database Practice](#ind-practice)\n",
"- [Conclusion](#conclusion)\n",
"- [Additional Resources](#resources)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# Imports cell\n",
"\n",
"import sqlite3\n",
"import pandas as pd\n",
"from pandas.io import sql\n",
"from numpy import genfromtxt"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "dd6114cc-f0b0-4c6f-950f-b02de5e321e3"
},
"source": [
"<a name=\"opening\"></a>\n",
"## Opening (5 mins)\n",
"We have seen how to connect to a local sqlite database and to a remote postgresql database.\n",
"\n",
"**Check:** What SQL commands have we learnt so far?\n",
"\n",
"<details>\n",
"- CREATE \n",
"- INSERT \n",
"- DELETE \n",
"- UPDATE \n",
"- SELECT\n",
"</details>\n",
"\n",
"**Check:** What different commands have we learnt for SQLite and PostgreSQL?\n",
"\n",
"<details>\n",
"- how to list schema and tables\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "82ed3d84-4634-4a58-8ba4-2b68f54e40cf"
},
"source": [
"<a name=\"demo\"></a>\n",
"## Interacting with SQLite from Python (30 min)\n",
"\n",
"### Intro: The `sqlite3` package\n",
"\n",
"_Here, the instructor would go over what the SQLite package is, what the class will be using it, and why it is better than just using the command line utility. The instructor should also give some idea of the commands that will be run in this loop of 'I Do, We Do, You Do' aka 'Intro, Demo, Activity.' A short example is provided below._\n",
"\n",
"The command line utility can be useful for basic SQL tasks, but since we're using python for the rest of code it will often be easier to access sqlite directly from within python. We can use the python [`sqlite3`](https://docs.python.org/2.7/library/sqlite3.html) package for just this purpose.\n",
"\n",
"#### Open a connection to an SQLite database file. As before, if the file does not already exist it will automatically be created."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2017-12-18T13:20:24.890098Z",
"start_time": "2017-12-18T13:20:24.877271Z"
},
"collapsed": true,
"focus": false,
"id": "7c1f3f9a-8894-4078-865a-3fd5d48bbeb1"
},
"outputs": [],
"source": [
"sqlite_db = 'test_db.sqlite'\n",
"conn = sqlite3.connect(sqlite_db) \n",
"c = conn.cursor()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "4bcf2724-d441-47af-bd9d-3075037b223d"
},
"source": [
"The syntax to create a table is similar to the console, only now we use the `execute` method of the cursor object `c` that we just created:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2017-12-18T13:20:27.493202Z",
"start_time": "2017-12-18T13:20:27.484446Z"
},
"collapsed": true,
"focus": false,
"id": "3773dca3-dc31-4642-9541-618cf7017ca2"
},
"outputs": [],
"source": [
"c.execute('CREATE TABLE houses (field1 INTEGER PRIMARY KEY, sqft INTEGER, bdrms INTEGER, age INTEGER, price INTEGER);')\n",
"\n",
"# Save (commit) the changes\n",
"conn.commit()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "74cc1a01-9181-4709-9b01-703fea8a5ae1"
},
"source": [
"With the database saved the table should now be viewable using SQLite Manager.\n",
"<a name=\"adding_data\"></a>\n",
"### Demo: Adding data\n",
"\n",
"_Here, the instructor will walk through various methods for adding data in SQLite. In order to make this interactive, the instructor would have this code in their own version of the notebook, and blank cells in the version available to students, so that they can write out the code together._\n",
"\n",
"Since we're back in python, we can now use regular programming techniques in conjunction with the sqlite connection. In particular, the cursor's `execute()` method supports value substitutionusing the `?` character, which makes adding multiple records a bit easier. See the [docs](https://docs.python.org/2.7/library/sqlite3.html) for more details."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true,
"focus": false,
"id": "8f829088-4164-4dd9-9bed-7ecab6d21688"
},
"outputs": [],
"source": [
"last_sale = (None, 4000, 5, 22, 619000)\n",
"c.execute('INSERT INTO houses VALUES (?,?,?,?,?)',last_sale)\n",
"\n",
"# Remember to commit the changes\n",
"conn.commit()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "b2aa87aa-ca5d-44f8-a148-c33909f433cb"
},
"source": [
"Notice that in this syntax we use the python `None` value, rather than `NULL`, to trigger SQLite to auto-increment the Primary Key. \n",
"\n",
"There is a related cursor method `executemany()` which takes an array of tuples and loops through them, substituting one tuple at a time."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true,
"focus": false,
"id": "fc3d6e7e-4cb0-4ae4-bbdf-312dba74d139"
},
"outputs": [],
"source": [
"recent_sales = [\n",
" (None, 2390, 4, 34, 319000),\n",
" (None, 1870, 3, 14, 289000),\n",
" (None, 1505, 3, 90, 269000),\n",
"]\n",
"\n",
"c.executemany('INSERT INTO houses VALUES (?, ?, ?, ?, ?)', recent_sales)\n",
"\n",
"conn.commit()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "d026b9e5-7c9b-49e0-bc0d-22b5038629da"
},
"source": [
"<a name=\"csv\"></a>\n",
"#### Adding data from a csv file\n",
"Next let's load our housing.csv data into an array, and then `INSERT` those records into the database. In this example we'll use the numpy `genfromtxt` function to read the file and parse the contents. "
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true,
"focus": false,
"id": "636476fc-167d-40c4-8ed7-bf4182255678"
},
"outputs": [],
"source": [
"# import into nparray of ints, then convert to list of lists\n",
"data = (genfromtxt('datasets/housing-data.csv', dtype='i8', \n",
" delimiter=',', skip_header=1)).tolist()\n",
"\n",
"# append a None value to beginning of each sub-list\n",
"for d in data:\n",
" d.insert(0, None)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"focus": false,
"id": "81ccf460-0166-45d2-a5cb-b3c6824eba4e"
},
"outputs": [
{
"data": {
"text/plain": [
"[[None, 2104, 3, 70, 399900],\n",
" [None, 1600, 3, 28, 329900],\n",
" [None, 2400, 3, 44, 369000]]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data[0:3]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true,
"focus": false,
"id": "a2e47e19-782b-4ce1-8a2e-7c30ea0df3f1"
},
"outputs": [],
"source": [
"# loop through data, running an INSERT on each record (i.e. sublist)\n",
"for d in data:\n",
" c.execute('INSERT INTO houses VALUES (?, ?, ?, ?, ?)', d)\n",
"\n",
"conn.commit()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "936d3c79-c8f7-4250-aa1d-b9b31712de67"
},
"source": [
"A reason for this example - remember that all elements in a numpy array must be the same data type, so if we want to 'add a None' to each row, we need to work around this. Lists can contain mixed types, so that is one approach.\n",
"\n",
"Still, in this case the value we're adding is the same for all records, so we could have simply used a 'None' in the INSERT statement directly."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Activity: Activity Name\n",
"\n",
"_Here, an instructor would add an activity that students would complete on their own or with classmates. Instructions should have a clear deliverable as well as an amount of time to complete the activity. An example of an activity prompt is below._\n",
"\n",
"Load in the data from `sacramento-housing.csv` and then run a `SELECT` query to show all the houses with four bedrooms. (7 minutes)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2017-12-18T20:33:55.793249Z",
"start_time": "2017-12-18T20:33:55.757817Z"
},
"collapsed": true,
"focus": false,
"id": "ef0c7699-3aa9-4680-b4f7-4051f429fb68"
},
"outputs": [],
"source": [
"## INSERT CODE HERE"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "b52c9aba-41e7-42d2-8aa4-ab02d12157c1"
},
"source": [
"<a name=\"pandas\"></a>\n",
"### Intro: Pandas connector\n",
"\n",
"While databases provide many analytical capabilities, often it's useful to pull the data back into Python for more flexible programming. Large, fixed operations would be more efficient in a database, but Pandas allows for interactive processing.\n",
"\n",
"For example, if you want to aggregate nightly log-ins or sales to present a report or dashboard, this operation is likely not changing and operating on a large dataset. This can run very efficiently in a database rather than by connecting to it with Python.\n",
"\n",
"However, if we want to investigate login or sales data further and ask more interactive questions, then Python would be more practical."
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "b5295e43-86f9-467e-bf1e-2005fc92c2d3"
},
"source": [
"Pandas can connect to most relational databases. In this demonstration, we will create and connect to a SQLite database.\n",
"\n",
"SQLite creates portable SQL databases saved in a single file. These databases are stored in a very efficient manner and allow fast querying, making them ideal for small databases or databases that need to be moved across machines."
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "c213982c-bacd-42e5-a170-e2013daf85fb"
},
"source": [
"<a name=\"writing\"></a>\n",
"### Demo: Writing data into a database\n",
"\n",
"Data in Pandas can be loaded into a relational database. For the most part, Pandas can use column information to infer the schema for the table it creates. For the next demo we will use the Rossmann stores dataset."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"focus": false,
"id": "1df8eabc-c92e-427e-a806-d4388670614e"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sqft</th>\n",
" <th>bdrms</th>\n",
" <th>age</th>\n",
" <th>price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2104</td>\n",
" <td>3</td>\n",
" <td>70</td>\n",
" <td>399900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1600</td>\n",
" <td>3</td>\n",
" <td>28</td>\n",
" <td>329900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2400</td>\n",
" <td>3</td>\n",
" <td>44</td>\n",
" <td>369000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1416</td>\n",
" <td>2</td>\n",
" <td>49</td>\n",
" <td>232000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3000</td>\n",
" <td>4</td>\n",
" <td>75</td>\n",
" <td>539900</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" sqft bdrms age price\n",
"0 2104 3 70 399900\n",
"1 1600 3 28 329900\n",
"2 2400 3 44 369000\n",
"3 1416 2 49 232000\n",
"4 3000 4 75 539900"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = pd.read_csv('datasets/housing-data.csv', low_memory=False)\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "20f948cc-2246-4089-9372-7364764275d2"
},
"source": [
"Data is moved to the database through the `to_sql` command, similar to the `to_csv` command.\n",
"\n",
"`to_sql` takes as arguments:\n",
" - `name`, the table name to create\n",
" - `con`, a connection to a database\n",
" - `index`, whether to input the index column\n",
" - `schema`, if we want to write a custom schema for the new table\n",
" - `if_exists`, what to do if the table already exists. We can overwrite it, add to it, or fail"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true,
"focus": false,
"id": "214bc50f-94a2-4436-a5c5-579a4489239a"
},
"outputs": [],
"source": [
"data.to_sql('houses_pandas',\n",
" con=conn,\n",
" if_exists='replace',\n",
" index=False)"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "db75a3e1-738d-4d8e-8366-002cc91ef35d"
},
"source": [
"<a name=\"reading\"></a>\n",
"### Demo: Reading data from a database\n",
"\n",
"If we already have data in our database, we can use Pandas to query it. Querying is done through the `read_sql` command in the `sql` module."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"focus": false,
"id": "b3da186e-8401-4ad8-b06d-e00d3c497aea"
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>sqft</th>\n",
" <th>bdrms</th>\n",
" <th>age</th>\n",
" <th>price</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2104</td>\n",
" <td>3</td>\n",
" <td>70</td>\n",
" <td>399900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1600</td>\n",
" <td>3</td>\n",
" <td>28</td>\n",
" <td>329900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>2400</td>\n",
" <td>3</td>\n",
" <td>44</td>\n",
" <td>369000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1416</td>\n",
" <td>2</td>\n",
" <td>49</td>\n",
" <td>232000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>3000</td>\n",
" <td>4</td>\n",
" <td>75</td>\n",
" <td>539900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>1985</td>\n",
" <td>4</td>\n",
" <td>61</td>\n",
" <td>299900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>1534</td>\n",
" <td>3</td>\n",
" <td>12</td>\n",
" <td>314900</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>1427</td>\n",
" <td>3</td>\n",
" <td>57</td>\n",
" <td>198999</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>1380</td>\n",
" <td>3</td>\n",
" <td>14</td>\n",
" <td>212000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>1494</td>\n",
" <td>3</td>\n",
" <td>15</td>\n",
" <td>242500</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" sqft bdrms age price\n",
"0 2104 3 70 399900\n",
"1 1600 3 28 329900\n",
"2 2400 3 44 369000\n",
"3 1416 2 49 232000\n",
"4 3000 4 75 539900\n",
"5 1985 4 61 299900\n",
"6 1534 3 12 314900\n",
"7 1427 3 57 198999\n",
"8 1380 3 14 212000\n",
"9 1494 3 15 242500"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sql.read_sql('select * from houses_pandas limit 10', con=conn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Activity: \n",
"\n",
"Write the sacramento housing dataset from the last demo to our `test_db` database. You'll be using the same `conn` that's in the demo. (7 mins)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"## Empty cell for code"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a name=\"syntax\"></a>\n",
"## SQL Syntax\n",
"\n",
"### Intro: SQL Operators\n",
"> Instructor Note: Each of the following can be demoed in pandas using the data we've setup above. A demo and check are included for each, but it is up to the instructor whether to do these simultaneously or go through them one at a time.\n",
"\n",
"<a id='select'></a>\n",
"#### SELECT\n",
"Every query should start with `SELECT`. `SELECT` is followed by the names of the columns in the output.\n",
"\n",
"`SELECT` is always paired with `FROM`, and `FROM` identifies the table to retrieve data from.\n",
"\n",
"```sql\n",
"SELECT\n",
"<columns>\n",
"FROM\n",
"<table>\n",
"```\n",
"\n",
"`SELECT *` denotes returns *all* of the columns.\n",
"\n",
"Housing Data example:\n",
"```sql\n",
"SELECT\n",
"sqft, bdrms\n",
"FROM houses_pandas;\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demo: Write a query that returns the `sqft`, `bdrms` and `price`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"## Empty cell for code"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id='where'></a>\n",
"\n",
"#### WHERE\n",
"`WHERE` is used to filter table to a specific criteria and follows the `FROM` clause.\n",
"\n",
"```sql\n",
"SELECT\n",
"<columns>\n",
"FROM\n",
"<table>\n",
"WHERE\n",
"<condition>\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demo: Amend the previous SQL query to only output observations where the house has two bedrooms and the price is less than 250000"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"## Empty cell for code"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "7b842ff7-f8e7-404a-ac0b-164fcc756dd5"
},
"source": [
"<a id='aggregations'></a>\n",
"\n",
"### AGGREGATIONS\n",
"\n",
"Aggregations (or aggregate functions) are functions where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning or measurement such as a set, a bag or a list.\n",
"\n",
"Examples of aggregate functions:\n",
"\n",
"- Average (i.e., arithmetic mean)\n",
"- Count\n",
"- Maximum\n",
"- Minimum\n",
"- Median\n",
"- Mode\n",
"- Sum\n",
"\n",
"In SQL they are performed in a `SELECT` statement as follows.\n",
"\n",
"```sql\n",
"SELECT COUNT(price)\n",
"FROM houses_pandas;\n",
"```\n",
"\n",
"```sql\n",
"SELECT AVG(sqft), MIN(price), MAX(price)\n",
"FROM houses_pandas\n",
"WHERE bdrms = 2;\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Demo:\n",
"\n",
"_Some activity where the students practice implementing different aggregation functions on a SQL table._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"## Empty cell for code"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "f0124a1b-e821-423d-89f8-4427808ff3ad"
},
"source": [
"<a name=\"ind-practice\"></a>\n",
"## Activity: Querying a Database (20 minutes)\n",
"\n",
"Practice querying the SQLite database we've created using any of the methods you've learned so far:\n",
"\n",
"- console connection\n",
"- python `sqlite3` package\n",
"- pandas\n",
"\n",
"Practice querying the PostgreSQL database you can find at:\n",
"\n",
" url: dsi.c20gkj5cvu3l.us-east-1.rds.amazonaws.com\n",
" port: 5432\n",
" database: dsi\n",
" user: dsi_student\n",
" password: gastudents\n",
" \n",
"using:\n",
"- console connection\n",
"- python `sqlalchemy` package\n",
"- pandas\n",
"\n",
"\n",
"Questions:\n",
"1. What's the average price per room for 1 bedroom apartments?\n",
"1. What's the average price per room for 2 bedrooms apartments?\n",
"1. What's the most frequent apartment size (in terms of bedrooms)?\n",
"1. How many are there of that apartment kind?\n",
"1. What fraction of the total number are of that kind?\n",
"1. How old is the oldest 3 bedrooms apartment?\n",
"1. How old is the youngest apartment?\n",
"1. What's the average age for the whole dataset?\n",
"1. What's the average age for each bedroom size?\n",
"\n",
"Try to answer all these in SQL.\n",
"\n",
"If you finish, try completing the first sections of [SQL zoo](http://www.sqlzoo.net)."
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "1e0bd86b-2da5-482c-b372-aef67c3d105b"
},
"source": [
"<a name=\"conclusion\"></a>\n",
"## Conclusion\n",
"We have learned how to:\n",
"\n",
"- Connect to a local database\n",
"- Write data to a database\n",
"- Read data from a database\n",
"- Execute `SELECT` statements\n",
"- Use aggregation functions across multiple rows"
]
},
{
"cell_type": "markdown",
"metadata": {
"focus": false,
"id": "9ac7f29a-4436-49f3-abef-c840e8c43ece"
},
"source": [
"<a id='resources'></a>\n",
"\n",
"## ADDITIONAL RESOURCES\n",
"\n",
"- [SQLite3 Home](http://www.sqlite.org) \n",
"- [SQLite - Python Tutorial](http://sebastianraschka.com/Articles/2014_sqlite_in_python_tutorial.html) \n",
"- [SQL Zoo](http://www.sqlzoo.net)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 1
}