The Year of Putting Data to Work

December 22, 2014 by Nate Williams

It’s been a banner year for the data ecosystem.

In terms of sheer data volume… well, frankly, it’s been off the charts:

More data has been created in just the last two years than in the entire previous history of the human race, according to the Scandinavian research group SINTEF. A quick search of the term “Big Data” yields a tangle of statistics, some as superlative as the term they attempt to define.

One statistic in particular caught my attention: By 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet, according to the annual IDC Digital Universe study. At that point, the world will be looking at digital knowledge in the neighborhood of 44 zettabytes, or 44 trillion gigabytes, up from just 4.4 zettabytes today. (While the sheer scale of this expanding universe is impressive, it’s worth recalling that we sent astronauts to the moon and back using computers with only 2 kilobytes of RAM.)

Individual consumers are creating the bulk of this new data and that’s great for the businesses that provide the platforms that rule these resources.

But the dizzying production of data isn’t limited to people breaking YouTube. It includes all kinds of important data, from the output of machines and software applications to street-level data collected from sensors on light poles.

Organizations are finding innovative uses for all this new data and putting data to work in completely new ways. For example, a single O’Reilly article from October lists over 25 interesting “big idea” projects, from cognitive augmentation to AI and deep learning to sensor/network convergence to data pipelines. Great stuff.


However, as breathtaking as the growing volume of data and related innovation is, one of the things I’m most excited about is the wide range of new open data repositories that were made available on the Web this year.

Open data is a public resource that increases in value the more it’s used – a source of “positive externalities”. For example, open data generates significant social benefits in many ways, from better government transparency to new jobs and scientific breakthroughs. And it’s projected to deliver some pretty decent economic benefits as well. According to a McKinsey study, open data has the potential to add over $3 trillion annually in total value to the global economy (…yeah, that’s trillion, with a “T”).

Here are a few of the developments with open data that caught my eye in 2014:

Civic Data Projects

The growing openness of civic data certainly holds tremendous potential for social good. We’ve had the opportunity to plug into the Chicago open data community this year and it’s been an eye-popping experience. The folks at Open Gov Hack Night and other open data developers here produced many fantastic new apps, like mRelief. And with efforts by the Smart Chicago Collaborative, there’s even a community-driven “Civic User Test Group” to support this work as well.

In addition to cities, state governments in the US are opening their data too, as with the beautiful Vermont Insights data portal that launched last month. So far, 10 states have created open data policies and 24 states now offer data portals.

Healthcare Data

Health data is fraught with promise and privacy peril. But when data is shared in a responsible way, there’s great potential – from identifying the next flu outbreak to driving down unnecessary medical procedures. Recently, New York became the first state in the US to make personal health data available to patients through a data portal. In addition, the California Department of Public Health has launched an open data portal aimed at addressing public health issues.

Scientific & Research Data

Last but not least is the large volume of research data from scientific organizations and government agencies. As this wealth of data becomes a public resource, it enables scientists from around the globe to collaborate more easily and accelerates the pace of scientific discovery.

Some of the interesting data portals coming from the research community this past year include the CERN open data portal, a US data portal focusing on climate change, and the new USGS repository for sediment data, which is a nice addition to the many open geographic data sets available.

. . .

If you like data, 2014 has been a great year. And, I’m certain it’s only gonna get better…