<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Lab: EDA and Feature Engineering in Pandas

---

Welcome!

#### Pandas: Final Practice Problems

In this homework, you're going to write code for a few problems on two datasets:

* The [iris](https://www.kaggle.com/uciml/iris) dataset - a dataset of flowers whose species can be classified from measurements of their sepals and petals
* The [NCAA March Madness](https://www.kaggle.com/c/mens-machine-learning-competition-2018) dataset - a collection of ranks for teams in the March Madness sportsball competition

You'll practice the following programming concepts we've covered in class:

* Basic EDA with Pandas
* Using the `.apply()` method to create new feature columns and mutate existing columns
* Broadcasting, or applying math transformations to entire columns at once
* Dropping columns
* And much, much more!
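
As a quick refresher, the concepts listed above can be sketched on a tiny made-up DataFrame. The column names and values below are hypothetical and are not taken from either lab dataset; this is only a minimal illustration of the techniques, not a solution to any problem.

```python
import pandas as pd

# Hypothetical toy data; names and values are invented for illustration only.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "length_cm": [1.0, 2.5, 4.0],
        "width_cm": [0.5, 1.0, 2.0],
    }
)

# Basic EDA: dimensions, column types, and summary statistics.
print(df.shape)
print(df.dtypes)
print(df.describe())

# Broadcasting: arithmetic is applied to whole columns at once, no loop needed.
df["area_cm2"] = df["length_cm"] * df["width_cm"]

# .apply(): run a function on each value to create a new feature column.
df["size_label"] = df["length_cm"].apply(lambda x: "long" if x > 2.0 else "short")

# .apply() can also overwrite (mutate) an existing column.
df["name"] = df["name"].apply(str.upper)

# Dropping a column that is no longer needed.
df = df.drop(columns=["width_cm"])

print(df)
```

Note that `drop` returns a new DataFrame by default, which is why the result is assigned back to `df`.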