# Introduction to Data Science

## Lesson Objectives

1. Intros
1. What is Data Science?

## Intros

1. Here's a bit about me
1. This class can be about networking, too! Tell us about yourself!
   - What is your name?
   - What brings you to GA?
   - What are your current activities?

## What is Data Science?

What is it, exactly?

- A set of tools and techniques used to extract useful information from data.
- An interdisciplinary, problem-solving-oriented subject.

What does it consist of?

- Programming skills
- Math and statistics knowledge
- Business sense
- Domain knowledge
- Communication skills

## Your Turn: Qualities Of A Data Scientist And You

Let's talk through the following questions in groups:

1. What do you think are the most important qualities for a data scientist?
2. Can you think of any other qualities or skills we have not mentioned?
3. What is your field of expertise?
4. Do you use tools such as Excel, Stata, R, or Python?
5. Where do you sit at the intersection of these skills?

## Possible Answers: Qualities Of A Data Scientist And You

- Ask good questions:
  - What is required?
  - How are results evaluated? (measures of success)
  - What do we currently know? (existing data)
  - What has happened? (descriptive analytics)
  - What will happen (if)? (predictive analytics)
  - What should we do to achieve what we require? (insight)
- Define and test a hypothesis and run experiments.
- Scrape and sample business-relevant data.
- Manipulate, sanitize, and wrangle data.
- Visualize data.
- Understand data relationships.
- Tell the machine how to learn from data.
- Create data products that deliver actionable insight.
- Tell relevant business stories from data.

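To make the difference between descriptive and predictive analytics concrete, here is a minimal sketch in Python. The sales figures are made up for illustration, and the helper names `describe` and `predict_next` are hypothetical. The forecast simply assumes the average month-over-month change continues, which is only one of many possible predictive approaches.

```python
from statistics import mean

# Made-up monthly sales figures, for illustration only.
monthly_sales = [100, 110, 120, 130]


def describe(values):
    """Descriptive analytics: summarize what has already happened."""
    return {"total": sum(values), "average": mean(values), "best": max(values)}


def predict_next(values):
    """Predictive analytics: naive forecast of the next value.

    Assumes the average change between consecutive months continues.
    """
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return values[-1] + mean(changes)


summary = describe(monthly_sales)
forecast = predict_next(monthly_sales)
print(summary, forecast)
```

The descriptive step only looks backward at the data in hand, while the predictive step has to add an assumption about the future, which is why its result should always be checked against new data.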
## Self Assessment on Data Science Skills

For a given class size:

- How many people will rate themselves strongest in Programming Skills?
- How many people will rate themselves strongest in Math and Statistics Knowledge?
- How many people will rate themselves strongest in Business Sense?
- How many people will rate themselves strongest in Domain Knowledge?
- How many people will rate themselves strongest in Communication Skills?

1. Create a table for the qualities of a data scientist, then rate yourself on each skill on a scale from 1 to 10.
1. We will then use the data to show simple statistics in action as part of the data science workflow.

| Skill | Rating (1-10) |
| --- | --- |
| Programming Skills | |
| Math and Statistics Knowledge | |
| Business Sense | |
| Domain Knowledge | |
| Communication Skills | |

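Once the class ratings are collected, the "simple statistics" step could look something like the sketch below. The ratings are invented for a hypothetical class of four students, and the function names `summarize` and `strongest_skill_counts` are assumptions for illustration, not part of the lesson materials.

```python
from statistics import mean, median

# Hypothetical self-ratings (1-10); each list position is one student.
ratings = {
    "Programming Skills": [7, 4, 8, 5],
    "Math and Statistics Knowledge": [6, 9, 5, 4],
    "Business Sense": [5, 6, 7, 8],
    "Domain Knowledge": [8, 5, 6, 9],
    "Communication Skills": [6, 7, 9, 7],
}


def summarize(ratings):
    """Return {skill: (mean, median)} across all students."""
    return {skill: (mean(vals), median(vals)) for skill, vals in ratings.items()}


def strongest_skill_counts(ratings):
    """Count how many students rated each skill as their strongest.

    Ties go to whichever skill is listed first.
    """
    skills = list(ratings)
    n_students = len(ratings[skills[0]])
    counts = {skill: 0 for skill in skills}
    for i in range(n_students):
        best = max(skills, key=lambda s: ratings[s][i])
        counts[best] += 1
    return counts


summary = summarize(ratings)
counts = strongest_skill_counts(ratings)
for skill in ratings:
    avg, mid = summary[skill]
    print(f"{skill}: mean={avg}, median={mid}, strongest for {counts[skill]}")
```

Comparing the counts with the guesses made before collecting the ratings is a small example of checking a prediction against data.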
## The Data Science Workflow

1. Identify the problem
   - What are we trying to do?
1. Acquire the data
   - Get the data in its raw form, for example by:
     - scraping the data from a website
     - downloading a file
     - reading a book or article
1. Parse the data
   - Format the data so that it is consistent
1. Mine the data
   - Collect information from the data
1. Refine the data
   - Clean the data up
   - Discard outliers, etc.
1. Build a data model
   - Figure out a formula that represents what we are trying to learn
1. Present the results
   - Visualize the results
1. Deploy and validate
   - Create a site
   - Publish findings

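The workflow steps can be sketched end to end on a toy problem. The example below assumes a made-up question ("do more study hours go with higher scores?") and a made-up six-row dataset; the function names and the choice of a least-squares line as the "formula" are illustrative assumptions, not the only way to carry out each step. Deployment is left out.

```python
import csv
import io

# Acquire: raw text as it might arrive in a downloaded file (made-up values).
RAW = """name,hours_studied,score
Ana,2,65
Ben,4,74
Cy,,70
Di,6,88
Eve,8,95
Flo,5,900
"""


def parse(raw):
    """Parse: give every record the same shape, with None for missing numbers."""
    def to_number(text):
        return float(text) if text.strip() else None

    return [
        {"name": rec["name"],
         "hours": to_number(rec["hours_studied"]),
         "score": to_number(rec["score"])}
        for rec in csv.DictReader(io.StringIO(raw))
    ]


def mine(rows):
    """Mine: collect basic facts, which here expose a gap and an odd value."""
    scores = [r["score"] for r in rows if r["score"] is not None]
    return {
        "rows": len(rows),
        "missing_hours": sum(r["hours"] is None for r in rows),
        "max_score": max(scores),
    }


def refine(rows, max_score=100):
    """Refine: drop incomplete records and scores outside 0..max_score."""
    return [
        r for r in rows
        if r["hours"] is not None
        and r["score"] is not None
        and 0 <= r["score"] <= max_score
    ]


def build_model(rows):
    """Model: least-squares line, score = slope * hours + intercept."""
    xs = [r["hours"] for r in rows]
    ys = [r["score"] for r in rows]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def present(slope, intercept):
    """Present: state the result in plain language."""
    return f"Each extra hour adds about {slope:.1f} points (baseline {intercept:.1f})."


rows = parse(RAW)
facts = mine(rows)
clean = refine(rows)
slope, intercept = build_model(clean)
print(present(slope, intercept))
```

Mining before refining mirrors the order of the list: the summary facts reveal the missing value and the impossible score, which then tell us what the refine step needs to remove.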