Links
First things first
Data Science
- Top 10 DS courses # http://bigdata-madesimple.com/review-of-top-10-online-data-science-courses/
- https://www.kaggle.com/wiki/Tutorials
- https://github.com/ujjwalkarn/Machine-Learning-Tutorials
- http://www.kdnuggets.com/2015/06/top-20-python-machine-learning-open-source-projects.html
- http://brettromero.com/wordpress/data-science-a-kaggle-walkthrough-introduction/ # Kaggle walkthrough
- https://github.com/jmschrei/pomegranate # pomegranate
Git
- Git: How to set up a remote git branch # http://www.gitguys.com/topics/adding-and-removing-remote-branches/
Python
Statistics
Quick Shortcuts
IPython notes for learning
Lots of quick & interesting slides
- https://speakerdeck.com/jakevdp # Statistics for Hackers
- https://www.youtube.com/watch?v=nCPf8zDJ0d0 # Introduction to Deep Learning
Data Scientist Workbench:
It’s a free all-in-one solution for people interested in performing data analysis. The Data Scientist Workbench includes:
- OpenRefine to clean up messy data.
- Jupyter notebooks supporting Python, R, and Scala (with access to Apache Spark for Big Data processing).
- Apache Zeppelin notebooks.
- RStudio in your browser.
https://my.datascientistworkbench.com/
Quick slides on NLP - Natural Language Processing
- https://www.cse.iitb.ac.in/~neelamadhavg09/docs/dependency_parsing.pdf # Articles on semantic text-parsing - dependency parsing.
Kaggle Tips:
Related reading:
Part 1 of this blog post series: Orientation
Part 2b: Ranking and regression metrics
Part 3: Validation and offline testing
Part 4: Hyperparameter tuning
Part 5: A/B testing
Tom Fawcett’s 2006 Pattern Recognition Letters paper, “An Introduction to ROC Analysis.”
Chapter 7 of Data Science for Business discusses the use of Expected Value as a useful classification metric, especially in cases of skewed data sets.
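Quick sketch of the expected-value idea from the last item, as a reminder of why it helps on skewed data. This is only an illustration: the confusion-matrix counts and the per-outcome values below are made-up numbers, and the helper names `expected_value` and `accuracy` are my own, not taken from the book. Rows are actual classes (positive, negative), columns are predicted classes (positive, negative).

```python
def expected_value(counts, values):
    """Average value per instance: sum of count * value over all cells, divided by N."""
    if len(counts) != len(values) or any(
        len(c_row) != len(v_row) for c_row, v_row in zip(counts, values)
    ):
        raise ValueError("counts and values must have the same shape")
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    weighted = sum(
        c * v
        for c_row, v_row in zip(counts, values)
        for c, v in zip(c_row, v_row)
    )
    return weighted / total


def accuracy(counts):
    """Fraction of instances on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in counts)
    correct = sum(counts[i][i] for i in range(len(counts)))
    return correct / total


# Hypothetical skewed data set: 50 positives, 950 negatives.
# Layout: [[TP, FN], [FP, TN]]
model_counts = [[30, 20], [70, 880]]    # a model that finds some positives
baseline_counts = [[0, 50], [0, 950]]   # always predicts "negative"

# Hypothetical value of each outcome: a true positive earns 99,
# a false positive costs 1, everything else is worth 0.
values = [[99, 0], [-1, 0]]

for name, counts in [("model", model_counts), ("baseline", baseline_counts)]:
    print(name, "accuracy:", accuracy(counts), "expected value:", expected_value(counts, values))
```

With these made-up numbers the always-negative baseline has the higher accuracy, but the model has the higher expected value per instance, which is the case the chapter is concerned with.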
Research Articles
- A Data Complexity Analysis of Comparative Advantages of Decision Forest Constructors
- https://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
Note: This post was updated on April 16, 2015. Thanks to @aatallah for demystifying the origin of the name “ROC curve,” and to Joe McCarthy for the helpful references.
PDF/Slides Generator for presentations
- http://www.tug.org/mactex/ # This is the Mac distribution of LaTeX (MacTeX). Beamer, a LaTeX document class for slides, is worth trying.