Veda solutions

Perfecting Provider Directory AI Modeling

Q&A with Bob Lindner on why sustainably-fed AI models are the path forward

As an AI company whose models are trained on our own proprietary data, we took notice of the New York Times article “When A.I.’s Output Is a Threat to A.I. Itself.” Illustrating exactly what happens when you make a copy of a copy, the article lays out the problems that arise when AI-created outputs are fed back in as AI inputs, again and again.

Veda focuses on having the right sources and the right training data to solve provider data challenges. A data processing system is only as good as the data it’s trained on; if the training data becomes stale, or is a copy of a copy, inaccurate outputs will likely result.

We asked Veda’s Chief Science & Technology Officer, Bob Lindner, PhD, for his thoughts on AI-model training, AI inputs, and what happens if you rely too heavily on one source.

At Veda, we use what we call “sustainably-fed models.” This means we use hundreds of thousands of input sources to feed our provider directory models. However, there is one kind of source we don’t use: payer-provided directories.

Health plans spend millions of dollars’ worth of effort building their provider directories. If we lifted that data directly into Veda’s AI learning model, we would permanently depend on ongoing spending from the payers.

We aim to build accurate provider directories that allow the payers to stop expensive administrative efforts. A system that depends on payer-collected data isn’t useful in the long term as that data will go away.

The models will begin ingesting data that was generated by models, and you will experience quality decay just as the New York Times article describes.

We use sustainably sourced inputs that won’t be contaminated or affected by the model outputs.
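
To make the "copy of a copy" effect concrete, here is a minimal sketch, not Veda's system. It assumes a toy "model" that can only reproduce records it has already seen, so each generation is drawn at random, with replacement, from the previous generation's output. The function name `resample_generations` and all numbers are made up for illustration.

```python
import random

def resample_generations(n_records=1000, generations=10, seed=0):
    """Return how many distinct original records survive in each generation."""
    rng = random.Random(seed)
    data = list(range(n_records))  # generation 0: independent first-party records
    distinct_counts = [len(set(data))]
    for _ in range(generations):
        # The next "model" sees only the previous generation's output.
        data = [rng.choice(data) for _ in range(n_records)]
        distinct_counts.append(len(set(data)))
    return distinct_counts

counts = resample_generations()
print(counts)
```

Because nothing new ever enters the loop, the count of distinct records can only stay flat or fall from one generation to the next. Independent inputs are what break that cycle.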

Veda does the work and collects first-party sources that stand independently without requiring the payer directories as inputs.

Beyond the data integrity problems, if you are using payers’ directories to power directory cleaning for other payers, you are effectively lifting the hard work from payer 1 and using it to help payer 2, potentially running into data sharing agreement problems. This is another risk of cavalier machine learning applications—unauthorized use of the data powering them.

Imagine we make chocolate and we are telling Hershey that they should just sell our chocolate because it’s way better than their own. We tell them, “You could save a lot of money by not making it yourselves anymore.”

However, we make our chocolate by buying a ton of Hershey’s chocolate, remelting it with some new ingredients, and casting it into a different shape.

In the beginning, everything is fine. Hershey loves the new bar and they’re saving money because we’re doing the manufacturing. Eventually, they turn off their own production. Now, with the production turned off, we can’t make our chocolate either. The model falls apart and in the end, no one has any chocolate. A real recipe for disaster.

AI for Amateurs: Questions Answered by Veda’s AI Experts

Everything You Always Wanted to Know About AI But Were Afraid to Ask


It was impossible to miss: 2023 was the year AI debuted to the masses. AI capabilities have raised questions in every industry, including healthcare. Your organization will likely find itself navigating the risks and rewards associated with healthcare AI in the coming year.

But let’s start with a question you’re too afraid to ask at the company meeting: What is AI? Like, really. We’ve found a lot of false information out there, and we’re here as a trustworthy source you can pull information from.

Why is Veda a Trusted Source?

As pioneers who have used AI technology since our founding, we’re passionate advocates for AI and want to ensure everyone else feels comfortable with it too.

Want our credentials? Our technology and data science team has 80 years of collective AI experience. Veda co-founder and Chief Science and Technology Officer, Bob Lindner, is the author of five technology patents on AI, entity resolution, and machine learning. Bob also has over 16 years of experience writing and publishing scientific and academic papers in the artificial intelligence field.

Backed by extensive experience and science, we’re the AI experts.

What is AI?

OFFICIAL ANSWER: Artificial intelligence is a field of study that focuses on how machines can solve complex problems that usually involve human intelligence.

AI is not one specific tool. It is a field of study. With AI’s computing power, computers can make decisions and predictions, and take actions. An algorithm recommending which movie you should watch next is an AI action.

VEDA’S TAKE: So why does this matter? Why is AI important? By taking over manual and often error-prone tasks, AI can free up human resources. Freeing people up so they have time to do the things they do best: that’s the power of AI.

What is machine learning?

OFFICIAL ANSWER: Machine learning is a sub-field of AI that focuses on algorithms that train models to make predictions on new data without being explicitly programmed. In other words, the machine learns the way humans do: through experience.

Note: In recent years, some organizations have begun using the terms artificial intelligence and machine learning interchangeably.

Instead of learning step by step, computers using machine learning can learn through trial and error and lots of practice. What does machine learning practice on? Lots and lots of data. The data can be things like images, video, audio, and text. When fed loads of data, machine learning will recognize patterns and make predictions based on these patterns.
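
As a small illustration of learning from examples rather than hand-written rules, the sketch below trains a nearest-centroid classifier in plain Python. It is a teaching toy, not a description of Veda's models. The two features (number of digits in a text field, share of characters that are letters) and the training rows are invented for this example.

```python
from statistics import mean

def train(examples):
    """Learn one average point (centroid) per label from (features, label) pairs."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Return the label whose centroid is closest to the new point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Made-up training data: (digit count, share of letters) -> field type
examples = [
    ((10, 0.0), "phone"),
    ((11, 0.0), "phone"),
    ((10, 0.1), "phone"),
    ((3, 0.7), "address"),
    ((5, 0.6), "address"),
    ((4, 0.8), "address"),
]
model = train(examples)
print(predict(model, (9, 0.05)))  # a field the model has never seen
```

Nobody wrote a rule saying "phone numbers have about ten digits." The pattern came from the examples, and more or better examples would change what the model learns.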

VEDA’S TAKE: Veda uses machine learning, and therefore AI, for the healthcare industry every day. For what exactly? To power our provider information. Veda uses machine learning to:

  • Determine correct addresses and phone numbers
  • Transform provider rosters from one format to another
  • Simulate an experience a member may have when booking an appointment

With a patented training data approach, our machine learning can make predictions on a wide variety of new data that was never part of its training set.

Feeling good about AI and machine learning? Further your AI understanding with these blogs:

Supervised vs. Unsupervised Machine Learning: Using The Right Tool For The Job

It’s Veda’s philosophy that any new technology tools we utilize are not meant to wholly replace human engagement. We believe that technology should help elevate humanity. With a focus on performing meaningful work, people can achieve their highest value. Technology should help people help people.

As a Data Science as a Service (DSaaS) company, Veda leverages scientific principles within our unique AI and machine learning systems to perpetually clean, correct, and monitor evolving provider and facility data. A lot of companies claim to offer accurate provider data, but none are as committed to using science to solve deeply entrenched provider data problems.

Dr. Bob Lindner describes supervised vs. semi-supervised learning

At Veda, we use AI systems that include natural language processing (NLP), supervised learning, and unsupervised learning components. These can be leveraged to solve a wide array of payer data challenges, depending on which tool is right for the job. No matter what, we start by understanding the problem, not by applying a method.

The Roles of Supervised and Unsupervised Learning

Supervised Learning

Veda utilizes supervised learning because it doesn’t require “perfect” sources of data: it can make use of the good parts of any data source and learns to ignore errors. With supervised learning, a data scientist watches over the model and helps train it on healthcare nuances, industry-specific language, and similar details to make the model a good one.

We’ve measured the accuracy of individual sources of data for years, including attestation, and haven’t found any at 90% or above yet. Supervised learning is incredibly accurate with the data we have access to today. The benefit is the highest accuracy, delivered by an approach that is flexible and not dependent on any single data source.
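
One way to picture how labeled examples let a system lean on the good parts of imperfect sources is the sketch below. It is an assumption-laden toy, not Veda's patented method. It estimates each source's accuracy from a handful of human-verified records, then combines new claims with a log-odds weighted vote. The source names `a`, `b`, `c`, the phone numbers, and both function names are hypothetical.

```python
import math
from collections import defaultdict

def estimate_source_accuracy(labeled_records):
    """labeled_records: list of (claims, truth); claims maps source -> value."""
    hits, totals = defaultdict(int), defaultdict(int)
    for claims, truth in labeled_records:
        for source, value in claims.items():
            totals[source] += 1
            hits[source] += value == truth
    # Add-one smoothing keeps estimates away from exactly 0 or 1.
    return {s: (hits[s] + 1) / (totals[s] + 2) for s in totals}

def weighted_vote(claims, source_accuracy):
    """Pick the value with the highest total log-odds weight."""
    scores = defaultdict(float)
    for source, value in claims.items():
        p = source_accuracy.get(source, 0.5)  # unknown sources get zero weight
        scores[value] += math.log(p / (1 - p))
    return max(scores, key=scores.get)

# Made-up verified records: what each source claimed vs. the checked answer
labeled = [
    ({"a": "555-0100", "b": "555-0100", "c": "555-0199"}, "555-0100"),
    ({"a": "555-0101", "b": "555-0177", "c": "555-0188"}, "555-0101"),
    ({"a": "555-0102", "b": "555-0102", "c": "555-0166"}, "555-0102"),
    ({"a": "555-0103", "b": "555-0155", "c": "555-0155"}, "555-0103"),
]
source_accuracy = estimate_source_accuracy(labeled)
print(source_accuracy)
print(weighted_vote({"a": "555-0104", "b": "555-0144", "c": "555-0144"}, source_accuracy))
```

In the final call, two sources agree with each other and one disagrees, yet the vote goes to the lone source because the labeled records showed it to be the most reliable. No source had to be perfect, and none was trusted blindly.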

Unsupervised Learning

We also use unsupervised learning for offline data exploration and research, both to learn more about a dataset and to help design better machine learning features for supervised learning systems. That’s because unsupervised learning separates big collections of data into groups on its own. The benefit of unsupervised learning is its ability to pick out patterns in the data.

Some will claim that unsupervised learning alone is superior to supervised learning because of the lack of human intervention. However, many common clustering algorithms require the user to specify upfront how many groups they want the data separated into. So no matter what data is being grouped, one has to state the exact number of groups ahead of time, and the data is sorted into that number of groups regardless of whether the groups match up well with the data. This requires the user to have upfront knowledge of exactly what labels and groups they need.
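
The point about choosing the number of groups upfront can be seen in a bare-bones k-means sketch. K-means is used here only as one well-known example of such an algorithm; the text above does not name a specific one. The data values and the function name `kmeans_1d` are made up.

```python
import random

def kmeans_1d(values, k, iterations=20, seed=0):
    """Plain k-means on a list of numbers. The caller must choose k."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # start from k randomly chosen data points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

# Nine numbers that visibly fall into three clumps (around 1, 5, and 9)
values = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
for k in (2, 3, 4):
    groups = kmeans_1d(values, k)
    print(k, [sorted(g) for g in groups])
```

Whatever `k` is passed in, the function hands back exactly that many groups. Asking for two or four groups still produces an answer, even though the data has three obvious clumps, and nothing in the algorithm flags the mismatch.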

But the biggest pitfall of unsupervised learning is that there’s no labeled training data, which means there’s no actual measurement of how well it’s working and placing the items into the correct groups. And with no way of knowing how well it’s working, it’s impossible to depend on unsupervised learning as a primary method for accuracy.
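
To show what labeled data buys you, here is a tiny accuracy check. It assumes a set of human-verified answers exists to compare against, which is precisely what a purely unsupervised setup lacks. The labels and the function name `accuracy` are illustrative only.

```python
def accuracy(predictions, true_labels):
    """Share of predictions that match human-verified labels."""
    if not true_labels or len(predictions) != len(true_labels):
        raise ValueError("need equally sized, non-empty predictions and labels")
    correct = sum(p == t for p, t in zip(predictions, true_labels))
    return correct / len(true_labels)

predictions = ["current", "outdated", "current", "current"]
true_labels = ["current", "outdated", "outdated", "current"]
print(accuracy(predictions, true_labels))  # 3 of 4 predictions match
```

Without the `true_labels` list there is nothing to pass as the second argument, so no number like this can be computed.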

The Right Tool For the Job

Supervised and unsupervised learning are tools, and just as you wouldn’t remodel your kitchen with only a saw, you shouldn’t use only one kind of machine learning model.

Veda’s technology and approach to data challenges are fundamentally different from those of other provider data technology companies. We focus on fully automating both the static information and the more challenging temporal information about a provider: data that changes at varied rates over time, like practice address, phone, and group affiliation. Our patented systems do not require manual outreach to providers; rather, they rely on data created by providers throughout their established workflows. This increases data accuracy by reducing human error while also decreasing provider abrasion. Validating millions of temporal data elements in real time requires Veda’s full automation system and could not be done manually.

Above all, we believe that AI and machine learning are the best ways to solve the provider data quality problem because:

  • These techniques do not require us to know ahead of time how accurate our sources of data are: the machine figures out how to tell good data from bad
  • They make the most of imperfect and changing data
  • They do not require provider participation: we use data providers already create in their day-to-day workflows, so there is no need to persuade them to take additional action
  • They work: we have scientifically tested attestation along with “source of truth” modeling, and Veda’s approach has the highest measured accuracy of any approach in the industry.

Read more about Veda’s approach to AI and data science with Dr. Bob Lindner’s blog post, Artificial Intelligence, ChatGPT, and the Relationship Between Humans and Machines.
