Is Skinny Data Harder Than Big Data?

Posted on May 24, 2013 | Written By

John Lynn is the Founder of the blog network which currently consists of 10 blogs containing over 8000 articles with John having written over 4000 of the articles himself. These EMR and Healthcare IT related articles have been viewed over 16 million times. John also manages Healthcare IT Central and Healthcare IT Today, the leading career Health IT job board and blog. John is co-founder of and John is highly involved in social media, and in addition to his blogs can also be found on Twitter: @techguy and @ehrandhit and LinkedIn.

On my post about Improving the Quality of EHR data for Healthcare Analytics, Glenn made a really great comment that I think is worth highlighting.

Power to change outcomes starts with liberating the data. Then transforming all that data into information and finally into knowledge. Ok – Sorry, that’s probably blindingly obvious. But skinny-data is a good metaphor because you don’t need to liberate ALL the data. And in fact the skinny metaphor covers what I refer to as the data becoming information part (filter out the noise). Selective liberation and combination into a skinny warehouse or skinny data platform is also manageable. And then build on top of that the analytics that release the knowledge to enable better outcomes. Now …if only all those behemoth mandated products would loosen up on their data controls…

His simple comment “filter out the noise” made me realize that skinny data might actually be much harder to do than big data. If you ask someone to just aggregate all the data, that is generally a pretty easy task. Once you start selecting the data that really matters, it becomes much harder. This is likely why so many Enterprise Data Warehouses sit there basically idle. Knowing which data is useful, making sure it is collected in a useful way, and then putting that data to use is much harder than just aggregating all the data.

Dana Sellers commented on this in this Hospital EHR and Healthcare Analytics video interview I did (the whole video has some great insights). She said that data governance is going to be an important challenge going forward. She defined data governance as making sure that you’re collecting the data in a way that you know what that data really means and how it can be used in the future. That’s a powerful concept and one that most people haven’t dug into very much. They’re going to have to if they want to start using their data for good.