Detecting Healthcare Fraud using Machine Learning: Excluding Provider Labels for Improved Accuracy
Abstract
As the elderly population grows, so do the demand for medical services and overall healthcare expenses. In the United States, Medicare provides health insurance primarily to individuals aged 65 and older, easing some of the financial burden of medical care. Even so, healthcare costs remain high and continue to rise, and fraud is one significant contributor. This paper addresses the problem by using machine learning to identify fraudulent Medicare providers. We conducted a study using publicly available Medicare data, with fraud labels drawn from provider exclusions, and built and evaluated three learners. A central challenge was class imbalance: very few providers in the data carry a fraud label. To reduce the impact of this imbalance and improve model performance, we applied random undersampling to create four different class distributions. Of the three learners, the C4.5 decision tree and logistic regression performed best. With the 80:20 class distribution, they achieved average AUC scores of 0.883 and 0.882, respectively, along with low false negative rates. These results indicate that machine learning combined with random undersampling can be effective for detecting Medicare fraud. Identifying fraudulent providers more accurately can help reduce healthcare expenses and ensure that Medicare resources are used more efficiently, ultimately benefiting those who need medical care.
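To make the random undersampling step concrete, here is a minimal Python sketch, not the authors' implementation. It assumes that label 1 marks a fraudulent (excluded) provider and label 0 a non-fraudulent one, and that an 80:20 distribution means 80% non-fraud to 20% fraud, which the abstract does not spell out. The function name random_undersample, its parameters, and the synthetic data are hypothetical and used only for illustration.

```python
import numpy as np


def random_undersample(X, y, majority_fraction, seed=0):
    """Keep every minority row (label 1) and a random subset of majority
    rows (label 0) so the majority makes up `majority_fraction` of the result.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    minority_idx = np.flatnonzero(y == 1)
    majority_idx = np.flatnonzero(y == 0)

    # Solve n_maj / (n_maj + n_min) = f for n_maj.
    n_keep = int(round(len(minority_idx) * majority_fraction
                       / (1.0 - majority_fraction)))
    # Never ask for more majority rows than exist.
    n_keep = min(n_keep, len(majority_idx))

    kept_majority = rng.choice(majority_idx, size=n_keep, replace=False)
    keep = np.sort(np.concatenate([minority_idx, kept_majority]))
    return X[keep], y[keep]


# Synthetic stand-in for provider data: 1000 providers, 20 labeled as fraud.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = np.zeros(1000, dtype=int)
y[:20] = 1

# Assumed reading of 80:20: 80% non-fraud, 20% fraud.
X_bal, y_bal = random_undersample(X, y, majority_fraction=0.8)
print(len(y_bal), int(y_bal.sum()))
```

Because undersampling discards majority-class rows at random, results can vary between draws; repeating the sampling with different seeds and averaging the evaluation metric is one common way to stabilize the comparison.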
You are free to:
- Share — copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.