Mayo Developing Tools To Extract Medical Data From All EMRs

Posted on July 17, 2011 | Written By

Katherine Rourke is a healthcare journalist who has written about the industry for 30 years. Her work has appeared in all of the leading healthcare industry publications, and she's served as editor in chief of several healthcare B2B sites.

Here’s some interesting and potentially important news. According to recent reports, Mayo Clinic investigators are putting the finishing touches on a suite of tools that can identify and sort medical data contained in any electronic medical record.

Mayo investigators are working under a federal grant, the $60 million Strategic Health IT Advanced Research Projects (SHARP) program, which is funded by the ONC.

According to a piece in Government HealthIT, the researchers have used natural language processing tools to isolate health data from about 30 digital medical records of patients with diabetes. So far, so good. When the extracted data is run through specialized systems developed with IBM’s Watson Research Center, the 30 patient records “explode” into 134 *billion* individual pieces of information, Government HealthIT reports.

Unfortunately, none of my sources explains what specific pieces of data make up this total, which sounds extremely high to me. If we’re talking about just 30 patients, it’s hard for me to imagine that the mundane details of care add up to even several thousand data points, unless you’re dealing with decades of care. (Perhaps the information involved includes the coding needed to extract the data. Readers, can you clarify this for me?)
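To put that total in perspective, here is a rough back-of-the-envelope sketch in Python. The 134 billion and 30 figures come from the Government HealthIT report; the 30 years of care per patient is purely my own assumption for illustration, chosen as a generous "decades of care" scenario, and the function names are made up for this example.

```python
# Figures as reported by Government HealthIT
TOTAL_PIECES = 134_000_000_000  # 134 billion pieces of information
NUM_RECORDS = 30                # patient records processed

# Hypothetical assumption (not from any source): 30 years of care per patient
ASSUMED_YEARS_OF_CARE = 30
DAYS_PER_YEAR = 365


def pieces_per_record(total: int, records: int) -> float:
    """Average number of data pieces attributed to each patient record."""
    return total / records


def pieces_per_day(per_record: float, years: int) -> float:
    """Average pieces per patient per day, spread over the assumed years of care."""
    return per_record / (years * DAYS_PER_YEAR)


per_record = pieces_per_record(TOTAL_PIECES, NUM_RECORDS)
per_day = pieces_per_day(per_record, ASSUMED_YEARS_OF_CARE)

print(f"Pieces per record: {per_record:,.0f}")
print(f"Pieces per patient per day over {ASSUMED_YEARS_OF_CARE} years: {per_day:,.0f}")
```

That works out to roughly 4.5 billion pieces per record, or on the order of 400,000 pieces per patient per day even under the 30-year assumption. If the reported total is accurate, it presumably counts something far more granular than ordinary clinical entries.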

While I can’t vouch for how realistic the Mayo researchers’ claims are, I have to think that if they’re on target, something very big is in the works. After all, to date I’ve heard little of tools that can effectively and fluidly extract clinical data from an entire EMR-based patient chart regardless of format or data organization. Concepts like natural language processing are far from new, but until now they don’t seem to have been up to the job.

Not only would such capabilities allow virtually any set of institutions to share data, a giant leap in and of itself, they would also allow providers to perform unprecedented levels of clinical analysis and ultimately improve care.

On the other hand, it’s not clear how practical this approach will be. If it takes only 30 records to generate that much data, just imagine how much data a single mid-sized hospital would have to wrangle! If I’m reading things right, this technology may remain stuck at the research stage, as it’s hard to imagine most institutions could manage terabytes of new data.

Still, there’s clearly much to learn here. I’m eager to find out whether Mayo’s SHARP technology turns out to be usable in everyday clinical life.