Guide to Data Science Competitions

Summer is finally here, and so are the long-form virtual hackathons. Unlike a traditional hackathon, which focuses on what you can build in one place within a limited time span, a virtual hackathon typically gives you a month or more to work from wherever you like.

Those of us who love data are not left behind: there are a number of data science competitions to choose from this summer. Whether it’s a new Kaggle challenge (these are posted year-round) or the data science component of Challenge Post’s Summer Jam Series, there are plenty of opportunities to spend the summer sharpening or showing off your skills.

The Landscape: Which Competitions are Which?

  • Kaggle
    Kaggle competitions have corporate sponsors who want specific business questions answered using their sample data. In return, winners are rewarded handsomely, but you have to win first.
  • Summer Jam
    Challenge Post’s Summer Jam Open Data Mashup runs in June and focuses on mashing up multiple open data sets (use the Data Search Engine to find some great options). Competitors are not asked to answer a specific question, so this competition is well suited for beautiful experiments in visualizing data.
  • DrivenData
    Like Kaggle, DrivenData competitions have a sponsor with a specific research question and specific sample data. DrivenData sponsors, however, tend to be more focused on social impact.

Over the past months we’ve shared many great links on winning data science competitions through our mailing list. If you missed them, here’s a list of the best resources, advice, and tutorials:

Choosing Your Weapons
DATA SCIENCE WARS: R VS. PYTHON
http://101.datascience.community/2015/05/12/data-science-wars-r-vs-python/

3 Must-Ask Questions Before Choosing That Machine Learning Algorithm!
http://www.analyticbridge.com/profiles/blogs/wait-why-are-you-using-that-algorithm

Dictionary of Algorithms and Data Structures
http://xlinux.nist.gov/dads/

Fast Non-Standard Data Structures for Python
http://kmike.ru/python-data-structures/

A list of assorted tools and such mentioned and used During DSSG 2014
https://hackpad.com/A-list-of-assorted-tools-and-such-mentioned-and-used-During-DSSG-2014-wl5QgF3LsSU

Data Science Resources
https://github.com/jonathan-bower/DataScienceResources

12 Best Free Ebooks for Machine Learning
http://designimag.com/best-free-machine-learning-ebooks/

Top 10 data mining algorithms in plain English
http://rayli.net/blog/data/top-10-data-mining-algorithms-in-plain-english/

Python Shortcuts
The Top Mistakes Developers Make When Using Python for Big Data Analytics
https://www.airpair.com/python/posts/top-mistakes-python-big-data-analytics

11 Python Libraries You Might Not Know
http://blog.yhathq.com/posts/11-python-libraries-you-might-not-know.html

IPython Notebook Gallery (includes pandas cheat sheet)
http://nb.bianp.net/sort/views/

Visualizations
D3.js Step by Step
http://zeroviscosity.com/category/d3-js-step-by-step

For inspiration, browse this index of text visualization techniques
http://textvis.lnu.se/

Gestalt Principles for Data Visualization
http://emeeks.github.io/gestaltdataviz/section1.html

Advice From Past Competitors
Machine learning best practices we’ve learned from hundreds of competitions – Ben Hamner of Kaggle
https://www.youtube.com/watch?v=9Zag7uhjdYo

LESSONS LEARNED FROM THE HUNT FOR PROHIBITED CONTENT ON KAGGLE
http://mlwave.com/lessons-from-avito-prohibited-content-kaggle/

What I Learned From The Kaggle Criteo Data Science Odyssey
https://medium.com/@chris_bour/what-i-learned-from-the-kaggle-criteo-data-science-odyssey-b7d1ba980e6

6 Tricks I Learned From The OTTO Kaggle Challenge
https://medium.com/@chris_bour/6-tricks-i-learned-from-the-otto-kaggle-challenge-a9299378cd61

How to use R, H2O, and Domino for a Kaggle competition
http://blog.dominodatalab.com/using-r-h2o-and-domino-for-a-kaggle-competition/

Competing in a data science contest without reading the data
http://blog.mrtz.org/2015/03/09/competition.html

KAGGLE ENSEMBLING GUIDE
http://mlwave.com/kaggle-ensembling-guide/
