Learn Data Science the Hard Way
So you want to be a Data Scientist? The good news is that there are tons of great resources out there to learn from. The bad? None is comprehensive, and choosing the best can be completely overwhelming. I created this list to help you stay focused on learning what’s important, the easiest way possible.
But it won’t be easy…
Data Science combines Statistics, Programming, Machine Learning, and Visualization, amongst other disciplines. Simply put, there is a lot to learn. I took every course and read every book on this list, and it took me approximately 210 hours, over a few months.
Ready to dive in? Great! I would love to hear about your experience learning Data Science, and answer any questions. Tweet this post below and let me know how it’s going.
Finally, good luck, and have a lot of fun. I certainly did.
1. Immerse Yourself
We start with some light reading and listening. You can’t spend all your time reading textbooks and taking courses. Get these books and podcasts, and read or listen to them throughout your studies.
12 hr | $29
Read The Signal and the Noise by Nate Silver. A fun introduction to Data Science that will teach you how to think like a data scientist.
9 hr | $17
Read Naked Statistics by Charles Wheelan. An easy introduction to statistics, without getting too deep into the maths.
free
Subscribe to the Data Skeptic podcast. Features conversations with data science experts, as well as great mini episodes that teach the basics.
free
Subscribe to the Partially Derivative podcast. A weekly discussion of Data Science news.
free
Subscribe to the Data Science Weekly newsletter. Data Science news in your inbox, weekly.
2. Learn Python
Programming is a key part of Data Science. There’s an ongoing debate about whether you should learn R or Python first. It’s better to pick one than to spend your time debating which is best. Choose Python.
6 hr | free
Do the Learning Python mission at Dataquest. You’ll learn Python interactively while playing with real data.
If you’re new to programming you may need a more thorough introduction. In that case:
40 hr | $30
Read Learn Python the Hard Way. A great introduction to programming using Python.
Otherwise, you’ll pick it up quickly using:
1 hr | free
Read Learn Python in Y minutes. This is a really fast way to learn Python if you’re already a programmer.
3. Learn the Big Picture
There are a lot of aspects to Data Science. In this unit you’ll focus on learning how they all fit together. Get a little breadth in your diet.
10 hr | $32
Read Data Science from Scratch. This is a fantastic book that introduces you to Data Science, using Python.
5 hr | free
Take the Data Analysis and Data Visualization missions at Dataquest. These will teach you about numpy, pandas, and matplotlib, three crucial tools for your toolbelt.
4. Learn Statistics
Statistics is the foundation for much of Data Science. It is the tool we use to rigorously reason about the world using data.
7 hr | free
Take Udacity’s Intro to Descriptive Statistics course. This course seems overly simplistic at times, but it’s a good refresher on descriptive statistics. Tip: Set the playback speed to 1.5x.
10 hr | free
Take Udacity’s Intro to Inferential Statistics course. This course is also a little simple. It’s still worth going through to get a strong grip on hypothesis testing, which is critical in Data Science.
40 hr | $79
Read All of Statistics. If you really want to master statistics, this is the book for you. Don’t get too bogged down in the details, but give it a good read and use it as a reference for the rest of your career.
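To give a taste of what hypothesis testing looks like in practice, here is a minimal sketch of a permutation test in plain Python. It is not taken from any of the courses or books above. The function name `permutation_test` and the page-load numbers are made up for illustration. The idea: if the two groups really came from the same distribution, shuffling the group labels should produce a difference in means as large as the observed one fairly often. The p-value estimates how often that happens.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same answer
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values, then split them into two fake groups
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical page load times in seconds for two site designs (invented data)
old_design = [12.1, 11.8, 12.5, 12.9, 12.3, 12.7, 12.0, 12.4]
new_design = [11.2, 11.5, 11.0, 11.9, 11.4, 11.1, 11.6, 11.3]

p_value = permutation_test(old_design, new_design)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be due to chance alone. The courses above cover the classical versions of this idea, such as t-tests.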
5. Learn Machine Learning
Machine Learning is a hot topic, and a big driver of the recent flood of interest in Data Science. It’s also a very deep field.
20 hr | free
Take Udacity’s Introduction to Machine Learning course. This is a very practical, hands-on course. You learn how to apply machine learning algorithms using the sklearn Python package.
30 hr | free
Take Coursera’s Machine Learning course. This is a more theoretically rigorous course. It is fantastically done.
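If you want a feel for what a machine learning algorithm actually does before starting the courses, here is a minimal sketch of a k-nearest-neighbors classifier using only the standard library. It is an illustration, not material from either course. The function `knn_predict` and the toy data are invented for this example. In real work you would more likely reach for a library implementation such as sklearn’s `KNeighborsClassifier`.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort the training examples by Euclidean distance to the query point
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Invented toy data: two well-separated clusters of 2-D points
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (1.0, 1.0)))
print(knn_predict(points, labels, (4.0, 4.0)))
```

The model here is just the stored training data, and a prediction is a vote among the closest examples. Most of what the courses teach is how to choose, tune, and evaluate algorithms like this one on data that is much messier.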
6. Practice
Now use your skills and go out and do some actual data science!
8 hr | free
Complete a Kaggle competition. Kaggle provides the data, you provide the science. Try some of their “Knowledge” competitions to get some practice.
12 hr | free
Do your own analysis. Find a real dataset on Data.gov, perform a real analysis, and publish your findings online.