Starter Pack for learning Data Science

So, you want to become a data scientist, or maybe you already are one and want to expand your toolbox. You have come to the right place. The aim of this page is to provide a comprehensive learning path for people new to Python for data analysis. This path gives an overview of the steps you need to learn to use Python for data analysis. If you already have some background, or don't need all of the components, feel free to adapt the path to your own needs and let us know what changes you made.

Step 0: Warming up

Before starting your journey, the first question to answer is:
Why use Python?
or
How would Python be useful?
Watch the first 30 minutes of this talk by Jeremy, founder of DataRobot, at PyCon 2014, Ukraine to get an idea of how useful Python can be.

Step 1: Setting up your machine

Now that you have decided, it is time to set up your machine. The easiest way to proceed is to download Anaconda from Continuum.io. It comes packaged with most of the things you will ever need. The major downside of taking this route is that you have to wait for Continuum to update their packages, even when an update to the underlying libraries is already available. If you are a beginner, that should hardly matter.
If you face any difficulties during installation, you can find more detailed instructions for various operating systems here.
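
Once the installation finishes, a quick check along these lines can confirm that the main libraries are importable. This is only a sketch: the helper name installed_versions and the list of package names are assumptions made for illustration and are not part of Anaconda itself.

```python
import importlib

# Import names of the libraries used in the later steps
# (this list is an assumption chosen for illustration).
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "sklearn"]


def installed_versions(names):
    """Return {import name: version string, or None if it cannot be imported}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Most scientific packages expose __version__; fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:12s} {version if version else 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED would need to be added before moving on to the step that uses it.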

Step 2: Learn the basics of Python language

You should start by understanding the basics of the language, its libraries and its data structures. The free interactive Python tutorial by DataCamp is one of the best places to start your journey. This 4-hour coding course focuses on how to get started with Python for data science, and by the end you should be comfortable with the basic concepts of the language.
Specifically learn: Lists, Tuples, Dictionaries, List comprehensions, Dictionary comprehensions 
Alternate resources: If interactive coding is not your style of learning, you can also look at The Google Class for Python. It is a 2 day class series and also covers some of the parts discussed later.
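
As a small taste of the constructs listed above, here is a minimal sketch. The variable names and values are made up for illustration.

```python
# A list is ordered and mutable.
scores = [72, 88, 95, 61]
scores.append(79)

# A tuple is ordered and immutable; handy for small fixed records.
point = (3, 4)
x, y = point  # tuple unpacking

# A dictionary maps keys to values.
ages = {"alice": 31, "bob": 25}
ages["carol"] = 42

# List comprehension: build a new list from an existing iterable.
passed = [s for s in scores if s >= 70]

# Dictionary comprehension: build a new dictionary the same way.
name_lengths = {name: len(name) for name in ages}

print(passed)        # [72, 88, 95, 79]
print(name_lengths)  # {'alice': 5, 'bob': 3, 'carol': 5}
```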

Step 3: Learn Regular Expressions in Python

You will need to use them a lot for data cleansing, especially if you are working on text data. The best way to learn Regular expressions is to go through the Google class and keep this cheat sheet handy.
Assignment: Do the baby names exercise
If you still need more practice, follow this tutorial for text cleaning. It will challenge you on various steps involved in data wrangling.
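
To give a feel for the kind of text cleaning that regular expressions make easy, here is a minimal sketch using Python's built-in re module. The sample string and the patterns are assumptions chosen for illustration; the e-mail pattern in particular is deliberately simple and not a full validator.

```python
import re

# A made-up messy string of the kind found in raw text data.
raw = "  Contact:   John.Doe@Example.com,  phone +1 (555) 010-9999!!  "

# Collapse runs of whitespace into single spaces and trim the ends.
clean = re.sub(r"\s+", " ", raw).strip()

# Find something that looks like an e-mail address and lowercase it.
email_match = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", clean)
email = email_match.group(0).lower() if email_match else None

# Keep only the digits from the part of the text after the word "phone".
digits = re.sub(r"\D", "", clean.split("phone")[1])

print(clean)   # Contact: John.Doe@Example.com, phone +1 (555) 010-9999!!
print(email)   # john.doe@example.com
print(digits)  # 15550109999
```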

Step 4: Learn Scientific libraries in Python – NumPy, SciPy, Matplotlib and Pandas

This is where the fun begins! Here is a brief introduction to the various libraries. Let's start practicing some common operations.

  • Work through the NumPy tutorial thoroughly, especially the part on NumPy arrays. This will form a good foundation for things to come.
  • Next, look at the SciPy tutorials. Go through the introduction and the basics, and do the remaining ones based on your needs.
  • If you guessed Matplotlib tutorials next, you are wrong! They are too comprehensive for our needs here. Instead, look at this IPython notebook up to line 68 (i.e. up to animations).
  • Finally, let us look at Pandas. Pandas provides DataFrame functionality (like R) for Python. This is also where you should spend a good amount of time practicing. Pandas will become your most effective tool for all mid-size data analysis. Start with the short introduction, 10 minutes to pandas. Then move on to a more detailed tutorial on pandas.
  • Check out DataCamp’s course on Pandas Foundations
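
To make the NumPy and Pandas items above concrete, here is a minimal sketch of a few common operations. The data values and column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: arrays support element-wise arithmetic and fast aggregation.
heights_cm = np.array([160.0, 172.5, 181.0, 168.0])
heights_m = heights_cm / 100         # element-wise division
tall = heights_cm[heights_cm > 170]  # a boolean mask selects matching elements
mean_height = heights_cm.mean()      # 170.375

# pandas: a DataFrame is a table with labelled columns.
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],
    "sales": [120, 80, 200, 150],
})
df["sales_k"] = df["sales"] / 1000                 # derive a new column
total_by_city = df.groupby("city")["sales"].sum()  # aggregate per group

print(tall)
print(total_by_city)
```

Boolean masking and groupby are worth practicing early, since most data cleaning and summarising work builds on them.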
Additional Resources:
  • If you need a book on Pandas and NumPy, try “Python for Data Analysis” by Wes McKinney.
  • There are a lot of tutorials as part of Pandas documentation. You can have a look at them here
Assignment: Solve this assignment from the CS109 course from Harvard.

Step 5: Learn Scikit-learn and Machine Learning

Now we come to the meat of this entire process. Scikit-learn is the most useful library in Python for machine learning. Here is a brief overview of the library. Go through lecture 10 to lecture 18 of the CS109 course from Harvard. You will go through an overview of machine learning, supervised learning algorithms such as regression, decision trees and ensemble modeling, and unsupervised learning algorithms such as clustering. Follow up the individual lectures with the assignments from those lectures.
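
To show what the supervised and unsupervised workflows look like in Scikit-learn, here is a minimal sketch on the Iris dataset that ships with the library. The choice of dataset, model settings and split size is an assumption for illustration and is not taken from the CS109 lectures.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Supervised learning: hold out a test set, fit on the rest, then score.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)
accuracy = tree.score(X_test, y_test)  # fraction of correct test predictions

# Unsupervised learning: group the same samples without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

print(f"Decision tree test accuracy: {accuracy:.2f}")
print(f"Clusters found: {sorted(set(cluster_labels.tolist()))}")
```

Most Scikit-learn estimators follow the same fit, predict and score pattern, so swapping in a different model usually means changing only the line that creates it.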

Additional Resources:
Assignment: Try out this challenge on Kaggle

Step 7: Practice, Practice and Practice!

Congrats, you made it! 
You now have all the technical skills you need. From here it is a matter of practice, and what better place to practice than competing with fellow data scientists on Kaggle? Go, dive into one of the live competitions currently running on Kaggle and try out everything you have learnt!

Step 8: Deep Learning 

Now that you have learnt most of the machine learning techniques, it is time to give Deep Learning a shot. There is a good chance that you already know what Deep Learning is, but if you still need a brief introduction, here it is.
The most comprehensive resource is deeplearning.net. You will find everything there: lectures, datasets, challenges and tutorials. You can also try the course from Geoff Hinton to understand the basics of Neural Networks.
Thank you!
All the best!
