So tomorrow we take Exversion for its first major test drive with hackers at The Alley’s Publishing Hackathon. While we’re still in alpha and working out our fair share of bugs, we’ve preloaded a number of datasets that should be interesting to build on top of.
METADATA FOR SELECTED PERSEUS BOOKS GROUP TITLES
Organizing sponsor Perseus was originally planning on making this data available to hackers through an Excel spreadsheet. We convinced them that an API would work a little better.
QUALITATIVE LITERARY ANALYSIS
Ages ago I built a platform filled with tools that tracked reader reaction and produced analytics to help writers workshop their writing. This is a dataset I used to build out some of those algorithms. It includes the readability scores, sentence structure, and word counts of many bestsellers.
BOOK CROSSING LOCATION BASED BOOK REVIEWS
Probably the most complicated of the datasets we cleaned and loaded. This is a scrape of BookCrossing that tags book reviews with the locations and ages of reviewers. I’m really interested in seeing whether there are any trends in what books people tend to like in certain places.
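If you want a starting point for hunting down those location trends, here is a minimal Python sketch. It assumes each review has already been pulled down as a dictionary; the field names (`location`, `title`, `rating`) and the sample rows are made up for illustration and are not the actual schema or contents of the dataset. It groups ratings by location and picks the best-rated title in each place.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows for illustration only. The field names are assumptions,
# not the real column names of the dataset.
sample_reviews = [
    {"location": "Portland, OR", "title": "Book A", "rating": 9},
    {"location": "Portland, OR", "title": "Book A", "rating": 7},
    {"location": "Portland, OR", "title": "Book B", "rating": 4},
    {"location": "Austin, TX", "title": "Book B", "rating": 8},
    {"location": "Austin, TX", "title": "Book A", "rating": 5},
]


def top_book_by_location(reviews):
    """Return {location: (title, average_rating)} for the best-rated title."""
    # location -> title -> list of ratings
    ratings = defaultdict(lambda: defaultdict(list))
    for review in reviews:
        ratings[review["location"]][review["title"]].append(review["rating"])

    result = {}
    for location, books in ratings.items():
        averages = {title: mean(scores) for title, scores in books.items()}
        best = max(averages, key=averages.get)
        result[location] = (best, averages[best])
    return result


favorites = top_book_by_location(sample_reviews)
for location, (title, avg) in sorted(favorites.items()):
    print(f"{location}: {title} (average rating {avg:.1f})")
```

With real data you would probably want to require a minimum number of reviews per title before trusting an average, since a single glowing review can otherwise win a whole city.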
AWARD WINNING BOOKS
Data on who made the “Best of” lists for 2011 and 2012. I was surprised by how many books on the lists I had never heard of … which should make it good fodder for recommendations.
And lastly, for those of you with a more mischievous edge to your hacking, I managed to dig up two datasets on banned books:
BANNED AND CHALLENGED BOOKS 1990-2012
BOOKS BANNED IN TEXAS PRISONS BY THE TDCJ
Happy Hacking 🙂
P.S. For those of you who love to hack in Rails, here’s a link to a Ruby library on GitHub that will allow you to access our API.