Aetna is taking on insurance fraud with machine learning

Data science and AI could give investigators the edge in the battle against healthcare fraud, waste, and abuse.

Data science is big business in the healthcare world. From radiology to risk management to precision medicine, if there’s structured data, you’re bound to find machine learning and other tools being deployed to scrutinize it.

While clinical applications get most of the headlines, a quiet revolution is taking place in the healthcare payor world. Vast seas of data, once handled by manual number-crunching and light automation, are increasingly being fed into machine learning systems for scrutiny.

Most of these initiatives are still in their infancy, little more than proofs of concept or research-focused operations. But a few healthcare companies are running headlong into applying data science to everyday business – and Aetna’s Analytics and Behavior Change organization might just be at the forefront.

Putting data science to work

Formerly known as Aetna Data Science, the Analytics and Behavior Change organization focuses on applying data science methods to business problems. It partners with different areas of the larger Aetna family to put machine learning and other tools to work.

One of those areas? Uncovering healthcare fraud, waste, and abuse.

“Two years ago, our analytics organization was approached by Aetna’s SIU (Special Investigation Unit),” explains Aleksandar Lazarevic, a senior director at Aetna’s analytics organization and the person heading up the machine learning fraud program.

SIU is responsible for identifying and investigating insurance fraud. Its investigators scrutinize the claims and billing process to uncover the tell-tale signs of overbilling, false coding, and other shady or wasteful practices. (And the cost is significant – it’s estimated that as much as 3% of the total US healthcare expenditure is lost to fraudulent or wasteful practices.)

Tracking down fraud means going through reams and reams of data, encompassing both claims and medical info. SIU previously used a vendor-designed analytics dashboard to pore through the millions of healthcare transactions in Aetna’s systems – but Lazarevic notes that the analytics solution was relatively unsophisticated, and SIU struggled to make good use of it.

Enter machine learning. Where the old dashboard could only spot fairly simple patterns of fraud, an intelligent system powered by machine learning models could be vastly more flexible – intuitive, even.

Building the tools to detect fraud

But before machine learning could be brought to bear, all that data needed to be collected and aggregated – a process Lazarevic refers to as “building a data pipeline.”

Due to the intrinsic complexity of healthcare data (often summed up in the data world as the “4Vs”: volume, variety, veracity, and velocity), this is one of the most time-consuming tasks in any healthcare analytics problem. “Despite all the challenges,” says Lazarevic, “we were able to bring all the data from many different sources across Aetna relatively quickly.”

Once the data was gathered, Lazarevic’s team began building a series of models. The initial ones were based on the signs of fraud SIU was already familiar with – providers submitting an unusually large number of claims for specific procedures, for instance, or seeing an abnormally large number of patients per day.
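
To make the idea concrete, here is a minimal Python sketch of a rule of that kind: counting the distinct patients each provider sees per day and flagging any count above a threshold. It is an illustration only, not Aetna's implementation; the record layout, the sample claims, the function names, and the threshold of 3 are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical claim records: (provider_id, service_date, patient_id, procedure_code).
# The layout and the values are invented for illustration.
claims = [
    ("P1", "2024-01-02", "a", "99213"),
    ("P1", "2024-01-02", "b", "99213"),
    ("P1", "2024-01-03", "a", "99214"),
]
# Provider P2 sees five different patients on the same day...
claims += [("P2", "2024-01-02", f"p{i}", "99215") for i in range(5)]
# ...and bills one of them twice, which should not inflate the patient count.
claims.append(("P2", "2024-01-02", "p0", "99215"))


def patients_per_day(claims):
    """Count the distinct patients each provider saw on each day."""
    seen = defaultdict(set)
    for provider, day, patient, _code in claims:
        seen[(provider, day)].add(patient)
    return {key: len(patients) for key, patients in seen.items()}


def flag_high_volume(claims, max_patients_per_day):
    """Return sorted (provider, day, count) tuples that exceed the threshold."""
    counts = patients_per_day(claims)
    return sorted(
        (provider, day, n)
        for (provider, day), n in counts.items()
        if n > max_patients_per_day
    )


# The threshold is an arbitrary assumption chosen to fit the toy data.
flagged = flag_high_volume(claims, max_patients_per_day=3)
```

With the toy data above, only provider P2 on 2024-01-02 crosses the threshold. A real rule would presumably set thresholds per specialty and per procedure rather than use one global number.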

Of course, machine learning isn’t limited to simple pattern recognition – so the next step was deploying supervised machine learning models to look for variations of known fraud patterns. These might be new types of fraud that are similar, but not identical, to existing patterns – the sort of thing a traditional analytics dashboard would struggle to detect.
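
As a rough illustration of how a supervised model can catch near-variants of a known pattern, the sketch below uses a tiny k-nearest-neighbors classifier written with the Python standard library. The article does not say which algorithms Aetna uses, so the technique, the two features, the labels, and every number here are assumptions chosen only to show the idea.

```python
from collections import Counter
from math import dist

# Hypothetical labeled history: (claims_per_day, average_billed_amount) -> label.
# Both the features and the labels are invented for illustration.
labeled = [
    ((10.0, 100.0), "ok"),
    ((12.0, 110.0), "ok"),
    ((11.0, 95.0), "ok"),
    ((60.0, 400.0), "fraud"),
    ((55.0, 420.0), "fraud"),
    ((58.0, 390.0), "fraud"),
]


def knn_predict(labeled, point, k=3):
    """Label a new point by majority vote among its k closest labeled examples.

    Distances are plain Euclidean on unscaled features, which is good enough
    for this toy data but would need feature scaling in practice.
    """
    nearest = sorted(labeled, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


# A provider that matches no past case exactly but sits close to known fraud.
variant = knn_predict(labeled, (50.0, 380.0))
# A provider that looks like the ordinary cases.
typical = knn_predict(labeled, (13.0, 105.0))
```

The first point is not identical to any labeled fraud case, yet it lands nearest to them and is labeled accordingly, which is the behavior a fixed-threshold dashboard tends to miss.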

But it’s the third type of model that is the most fascinating, and the one that best illustrates the promise of machine learning. These were anomaly detection models, ones “designed to detect emerging or new types of fraud,” explains Lazarevic.

In layman’s terms, an anomaly detection model is a sort of AI-powered “hunch.” It might not be clear why a particular pattern is relevant, but the data stands out all the same. In the case of fraud, it might be some combination of practice type, billing data, and other medical data that the machine learning system thinks warrants further investigation.
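
One simple way to picture such a “hunch” is an outlier score that needs no labeled fraud cases at all. The Python sketch below uses the modified z-score, based on the median and the median absolute deviation, with the conventional 0.6745 scaling constant and 3.5 cutoff. This is a swapped-in textbook technique for illustration, not the model Aetna describes; it looks at a single made-up feature, whereas the models in the article combine several.

```python
from statistics import median


def modified_z_scores(values):
    """Score each value by its distance from the median, in units of MAD."""
    values = list(values)
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread in the data: nothing can stand out.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def find_anomalies(billed_by_provider, cutoff=3.5):
    """Return provider ids whose billing is far from the group's median."""
    providers = sorted(billed_by_provider)
    scores = modified_z_scores(billed_by_provider[p] for p in providers)
    return [p for p, score in zip(providers, scores) if abs(score) > cutoff]


# Hypothetical average billed amount per claim for five similar practices.
billed = {"P1": 100, "P2": 110, "P3": 95, "P4": 105, "P5": 400}
anomalies = find_anomalies(billed)
```

Here only P5 stands out. Nothing in the data says P5 is committing fraud; the score only says the provider is unlike its peers, which is why a human investigation still has to follow.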

With the models built, the final step in the process was incorporating the new processes into the fraud team’s workflow. “We didn’t build only the data pipeline – bringing all the data needed for fraud detection together on a daily basis,” says Lazarevic. “We were also able to build a tool that automatically sends all our findings to SIU.”

Lazarevic compares the findings to “leads,” individual anomalies or patterns that are sent to the SIU team to be assigned to an individual investigator. Each lead also includes a reason code that indicates why it was flagged by the machine learning system – something that’s essential for investigators tasked with following up and performing a full investigation.
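
A lead of this kind can be pictured as a small record that carries its reason code along with it. The Python sketch below is purely hypothetical: the field names, the reason codes, the investigator names, and the round-robin assignment are assumptions for illustration, since the article does not describe how Aetna's tool structures or routes its leads.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Lead:
    """One flagged finding sent on for investigation (hypothetical layout)."""
    provider_id: str
    reason_code: str  # hypothetical code naming why the provider was flagged
    detail: str       # short human-readable starting point for the investigator


def assign_leads(leads, investigators):
    """Pair each lead with an investigator in simple round-robin order."""
    if not investigators:
        raise ValueError("at least one investigator is required")
    return [(lead, who) for lead, who in zip(leads, cycle(investigators))]


# Made-up leads and investigator names.
leads = [
    Lead("P2", "HIGH_DAILY_VOLUME", "5 patients on 2024-01-02"),
    Lead("P5", "BILLING_ANOMALY", "average billed amount far above peers"),
    Lead("P7", "KNOWN_PATTERN_VARIANT", "resembles earlier upcoding cases"),
]
assigned = assign_leads(leads, ["inv_a", "inv_b"])
```

Keeping the reason code on the record itself means whoever picks up the lead sees why it was raised without having to go back to the model that produced it.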

“If you’re working with a fraud pattern you’re already familiar with, you know exactly what to look for,” explains Lazarevic. “But if you detect something completely new with analytical schemes, you don’t know where to start looking. You don’t know why this provider is suspicious, and providing a starting point is extremely useful.”

What the future could hold

What’s next for the machine learning program at Aetna? Lazarevic says that they’re hoping to make the fraud detection process more proactive, identifying patterns of fraud before they reach critical mass.

“In order to prove something is fraud, you need several months of investigation time,” explains Lazarevic. “During this time, you’re still receiving claims from this provider, you are still losing money.”

But that’s only an issue if the goal is officially proving that something is fraud. If the goal is business intelligence rather than formal proof of fraud, things can move much faster: removing offending providers from the network, for instance, or altering policies to reduce the number of fraudulent claims.

And that’s just the beginning. Outside the realm of fraud, Lazarevic notes that machine learning tools like the ones at Aetna could be used to identify sources of waste or sub-optimal medical practices, as well as spot areas where the provider network is lacking.

Whatever the future of Aetna’s program, it’s clear that machine learning will be more than just a flash in the pan in the world of healthcare. While the intricacies of data access and interoperability are still being ironed out, more and more resources are being devoted to machine learning and predictive systems across the larger industry, and the trend shows no sign of slowing.

Hopefully, that will ultimately mean better efficiency, reduced costs, and a better healthcare experience for everyone.