On my post about Improving the Quality of EHR data for Healthcare Analytics, Glenn made a really great comment that I think is worth highlighting.
Power to change outcomes starts with liberating the data. Then transforming all that data into information and finally into knowledge. Ok – Sorry, that’s probably blindingly obvious. But skinny-data is a good metaphor because you don’t need to liberate ALL the data. And in fact the skinny metaphor covers what I refer to as the data becoming information part (filter out the noise). Selective liberation and combination into a skinny warehouse or skinny data platform is also manageable. And then build on top of that the analytics that release the knowledge to enable better outcomes. Now …if only all those behemoth mandated products would loosen up on their data controls…
His simple comment “filter out the noise” made me realize that skinny data might actually be much harder to do than big data. If you ask someone to just aggregate all the data, that is generally a pretty easy task. Once you start selecting the data that really matters, it becomes much harder. This is likely why so many Enterprise Data Warehouses sit there basically idle. Knowing which data is useful, making sure it is collected in a useful way, and then putting that data to use is much harder than just aggregating all the data.
Dana Sellers commented on this in the Hospital EHR and Healthcare Analytics video interview I did (the whole video has some great insights). She said that data governance is going to be an important challenge going forward, and she defined data governance as making sure that you’re collecting the data in a way that lets you know what that data really means and how it can be used in the future. That’s a powerful concept and one that most people haven’t dug into very much. They’re going to have to if they want to start using their data for good.