Summer is finally here, and so are the long-form virtual hackathons. Unlike a traditional hackathon, which focuses on what you can build in one place within a short, fixed time span, a virtual hackathon typically gives you a month or more to work from wherever you like.
Those of us who love data are not left behind: there are a number of data science competitions to choose from this summer. Whether it’s a new Kaggle challenge (these are posted year-round) or the data science component of Challenge Post’s Summer Jam Series, there are plenty of opportunities to spend the summer sharpening or showing off your skills.
The Landscape: Which Competitions are Which?
- Kaggle
Kaggle competitions have corporate sponsors who want specific business questions answered using their sample data. In return, winners are rewarded handsomely, but you have to win first.
- Summer Jam
Challenge Post’s Summer Jam Open Data Mashup runs in June and focuses on mashing up multiple open data sets (use the Data Search Engine to find some great options). Competitors are not asked to answer a specific question, so this competition is well suited for beautiful experiments in visualizing data.
- DrivenData
Like Kaggle, DrivenData competitions have a sponsor with a specific research question and specific sample data. DrivenData sponsors, however, tend to be more focused on social impact.
Over the past months we’ve shared many great links on winning data science competitions through our mailing list. If you missed them, here’s a list of the best resources, advice, and tutorials:
Choosing Your Weapons
DATA SCIENCE WARS: R VS. PYTHON
3 Must-Ask Questions Before Choosing That Machine Learning Algorithm!
Dictionary of Algorithms and Data Structures
Fast Non-Standard Data Structures for Python
A list of assorted tools and such mentioned and used During DSSG 2014
Data Science Resources
12 Best Free Ebooks for Machine Learning
Top 10 data mining algorithms in plain English
The Top Mistakes Developers Make When Using Python for Big Data Analytics
11 Python Libraries You Might Not Know
iPython Notebook Gallery (includes pandas cheat sheet)
D3.js Step by Step
For inspiration, check this index of visualization types for visualizing text
Gestalt Principles for Data Visualization
Advice From Past Competitors
Machine learning best practices we’ve learned from hundreds of competitions – Ben Hamner of Kaggle
LESSONS LEARNED FROM THE HUNT FOR PROHIBITED CONTENT ON KAGGLE
What I Learned From The Kaggle Criteo Data Science Odyssey
6 Tricks I Learned From The OTTO Kaggle Challenge
How to use R, H2O, and Domino for a Kaggle competition
Competing in a data science contest without reading the data
KAGGLE ENSEMBLING GUIDE