Growing up in the ’80s, I was pretty savvy with video games. Navigating Mario through a mushroom maze of minions to defeat Bowser and rescue Princess Peach was straightforward. Once you got the patterns and timing down, it was easy to beat the game. Fast forward thirty years, and I am lost trying to understand the Minecraft craze.
My kids watch Minecraft videos online, so when we got the game I thought it would be simple and straightforward, like the games of my youth. However, when we turned it on, it was a free-play game with no real objective. There was no flag to capture and no princess to free. I didn’t know where to start, and I was pretty sure there was no end, just an endless pixelated playland.
This is how many of us feel about big data. Our health organizations collect a seemingly endless amount of data, but much like Minecraft, where do you start to find meaning in all that information? IBM estimates that up to 80% of the world’s data is unstructured, meaning that it does not follow a pre-defined data model. Patient comments and provider notes fall into the category of unstructured data. The additional challenge in healthcare is that most health data is spread across multiple systems with no clear 360-degree view.
One of the goals in Minecraft is to mine the resources you need to survive the game. In healthcare, we rely on data mining to uncover the data we need for predictive modeling or other computer-based algorithms. The problem with this kind of data mining is that it is rule-based. The responses have to be contained neatly within pre-defined boxes for the computer to read and analyze the data. This analysis does not take unstructured data into consideration. Let’s look at a traditional patient feedback example.
A health organization measures patient satisfaction using a five-question survey with an optional text field. The five questions are all answered on a scale of one to five. The optional text field is for patient comments. When a computer analyzes the five questions, it can perform an accurate analysis because the data is structured in a way that is easy for a computer to quantify. However, the optional text field is not measured, and often the organization cannot even review it because of the volume and frequency of responses.
The problem with this strategy is that while the organization is getting feedback on the five areas that it wants to measure, those five areas may not be indicative of the problem with the patient experience. The most important part of the survey to the patient is the comment box, because that is where they believe they are communicating their issue directly to the organization. Rarely do these issues fit neatly within a pre-determined question set. Most often this unstructured feedback gets lost or is never followed up on because traditional computing cannot quantify unstructured data.
Machine learning is changing this. Machine learning can take unstructured data and classify it in a way that yields the kind of actionable insights we would traditionally get from predictive modeling with structured data. For example, patient comments can be analyzed for sentiment. This means that an unstructured comment can be classified as positive, negative, or neutral, making it easy for organizations to sort comments and take action on those that need immediate attention. Drs. Ziad Obermeyer and Ezekiel Emanuel, both at Harvard Medical School, “likened machine learning in medicine to a doctor in residency who seeks to learn rules from data. In contrast, they compare computer-based algorithms to a medical student who applies general principles to new patients.”
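To make the idea of classifying a comment as positive, negative, or neutral a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the training comments are invented, the function names (`tokenize`, `train`, `classify`) are hypothetical, and the technique is a tiny Naive Bayes classifier written with the standard library. It is not the method or the tooling any particular organization uses, and a real system would need far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, hand-labeled comments invented purely for illustration.
TRAINING = [
    ("The nurse was kind and explained everything clearly", "positive"),
    ("Friendly staff and a very clean room", "positive"),
    ("I waited three hours and nobody answered my questions", "negative"),
    ("The front desk was rude and the billing was confusing", "negative"),
    ("My appointment was on Tuesday at the main clinic", "neutral"),
    ("I was seen for a routine follow-up visit", "neutral"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each word appears under each sentiment label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("The staff was rude and I waited hours", model))
```

The point of the sketch is that the classifier learns its rules from labeled examples rather than from pre-defined survey boxes, so a free-text comment can be sorted into a bucket that someone can act on.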
Sentiment analysis is just the first step toward deriving actionable insights from unstructured data. At Care Experience, we are collaborating with IBM Watson to apply natural language processing to patient comments. By applying these next-generation analytics, we can find recurring themes and trends that were not previously visible.
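As a rough illustration of what "recurring themes" can mean in practice, the sketch below counts how many separate comments mention each term. Everything in it is an assumption made for the example: the comments are invented, the stopword list is a small hand-picked set, and `recurring_terms` is a hypothetical helper. It does not use or represent any IBM Watson API, and real natural language processing goes well beyond word counts.

```python
import re
from collections import Counter

# Small, hand-picked stopword list (an assumption for this example).
STOPWORDS = {
    "the", "was", "and", "at", "but", "a", "in", "not",
    "could", "near", "to", "of", "i", "is", "it", "for",
}

# Hypothetical patient comments invented for illustration.
COMMENTS = [
    "Parking was impossible and the wait was long",
    "Long wait at check-in, but the doctor was great",
    "Could not find parking near the entrance",
    "The wait time keeps getting longer",
]


def recurring_terms(comments, min_comments=2):
    """Return (term, number of comments mentioning it), most common first."""
    doc_freq = Counter()
    for comment in comments:
        # Use a set so a term counts once per comment, however often repeated.
        terms = set(re.findall(r"[a-z]+", comment.lower())) - STOPWORDS
        doc_freq.update(terms)
    recurring = [(t, n) for t, n in doc_freq.items() if n >= min_comments]
    # Sort by count (descending), then alphabetically for a stable order.
    return sorted(recurring, key=lambda item: (-item[1], item[0]))


for term, count in recurring_terms(COMMENTS):
    print(f"{term}: mentioned in {count} comments")
```

Even this crude count surfaces wait times and parking as issues that a five-question survey might never ask about.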