Data is The Colonial State of Tech

When we started Exversion we wrote a list of all the problems we had dealt with working with data. Top of the list was actually finding the data we needed for a given project.

Like most people, when we need something online our first stop is Google, but Google’s algorithms are designed to deliver content. Their spam prevention techniques often penalize data repositories because their listings don’t contain enough text to be seen as valuable and worthwhile content.

This interpretation of value as a matter of prose also penalizes stock image sites and file repositories. For that sort of stuff you have to know where to go before you can even touch a search box.

Stupid Ideas We Had: Let’s Build a Search Engine
We knew if we could find a way to solve this problem for others we would have one foot in the door with the rest of what we were building.

When you’re working on a startup there are certain obvious pitfalls that everyone knows they should avoid and yet nevertheless almost everyone falls into. It was important that whatever we built complemented our product vision. We didn’t want our efforts at improving accessibility taking away from what we were passionate about building (duh), but our first approach here was stupid because we underestimated how much work it is to build a real search engine.

That seems even dumber in retrospect, but we truly believed that with all the fancy new open source tools (like ElasticSearch) search would be simple.

It wasn’t simple and after a month or so I scrapped the project to go back to what we were passionate about.

Total Data Request Live
Two months ago I was sitting in a client meeting listening to a consultant go on and on about how people try to search for data, when it hit me. What if instead of a search engine, we built more of a StackOverflow system where people could request the data they were looking for and have it fulfilled by the community?

We had been working on Exversion long enough by that point to realize that a major problem for us was how heavily fragmented the data community really is. To be honest, there is no such thing as “the data community”. It is the colonial state of tech: several tribes with no common language, process, or experience roughly fenced in by an arbitrary border.

Building tools for people who work with data is different from building tools for people who write code. There are slight differences in culture across generations, races and genders in the coding community, but nowhere near the variance of the so-called “data community”.

Rather early on, an advisor looked at this problem and aptly summed it up: “It doesn’t matter how great your technology is, you’re going to have to figure out a way to pull the community together first.”

Easier said than done. The typical advice for building community comes down to a handful of limp “cross-your-fingers” solutions: comments, gamification, social sharing. We needed something better.

This week we launched the first step in that direction by adding Data Requests. It’s a simple system: you post what data you need, what sources you trust, and what you want to use the data for. People can comment on a request, push the request up by saying they need the same data, or submit an Exversion repository for review. Then the requester can select the best answer from the list and close the request.
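For readers who think in code, the request lifecycle described above can be sketched roughly as follows. This is only an illustration, not how Exversion implements it: the class name, field names, and method names are all hypothetical, and the rule that a closed request rejects new submissions is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DataRequest:
    """Hypothetical model of a single data request (all names are illustrative)."""
    description: str             # what data you need
    trusted_sources: List[str]   # what sources you trust
    intended_use: str            # what you want to use the data for
    comments: List[str] = field(default_factory=list)
    supporters: Set[str] = field(default_factory=set)   # users who need the same data
    submissions: List[str] = field(default_factory=list)  # repositories offered for review
    accepted: Optional[str] = None                        # best answer, once chosen

    @property
    def is_open(self) -> bool:
        # A request stays open until the requester picks an answer.
        return self.accepted is None

    def comment(self, text: str) -> None:
        self.comments.append(text)

    def me_too(self, user: str) -> None:
        # Using a set means the same user cannot push a request up twice.
        self.supporters.add(user)

    def submit(self, repository: str) -> None:
        # Assumption: closed requests no longer take submissions.
        if not self.is_open:
            raise ValueError("request is already closed")
        self.submissions.append(repository)

    def accept(self, repository: str) -> None:
        # Only a repository that was actually submitted can be selected.
        if repository not in self.submissions:
            raise ValueError("repository was not submitted to this request")
        self.accepted = repository
```

A request would be created with its three descriptive fields, collect comments, supporters, and submissions while open, and close once `accept` is called with one of the submitted repositories.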

We Make Data Sexy
When I opened up the feature to my friends a few days ago I ended up with a couple of emails from people who do not identify as part of the “data community” or even play a technical role. They had long wish lists of data they were trying to find for various reasons and were excited by the idea that maybe there might be a place they could go to just to get some guidance on where to even START looking.

So we’re excited about launching this officially and working on promoting it as much as possible. While the differences in experience, perspective, process, and technical literacy that keep the data community from actually being a community may be a disadvantage for us, they are also an incredible opportunity. What kind of innovation might happen if all these different parts actually worked together?

