[1708.00253] Application of machine learning for hematological diagnosis

Technique: Random forest classifier
Dataset: Raw
Quick and accurate medical diagnosis is crucial for the successful treatment of a disease. Using machine learning algorithms, we built two models that predict hematologic diseases from laboratory blood test results. One model used all available blood test parameters; the other used a reduced set that is usually measured upon patient admission. Both models produced good results, with prediction accuracies of 0.88 and 0.86 when considering the list of the five most probable diseases, and 0.59 and 0.57 when considering only the most probable disease. The models did not differ significantly from each other, which indicates that the reduced set of parameters contains a relevant fingerprint of a disease. This expands the utility of the model for general practitioners and suggests that blood test results contain more information than physicians recognize. In a clinical test, we showed that the accuracy of our predictive models was on a par with that of hematology specialists. Our study is the first to show that a machine learning predictive model based on blood tests alone can be successfully applied to predict hematologic diseases, and it could open up unprecedented possibilities in medical diagnosis.
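The two kinds of accuracy quoted above (most probable disease versus the list of five most probable diseases) correspond to what is commonly called top-1 and top-5 accuracy. The sketch below is a minimal illustration of that metric, not the authors' evaluation code: the function name `top_k_accuracy`, the six disease classes, and the toy probabilities are all hypothetical and chosen only to show how the two numbers can differ for the same predictions.

```python
from typing import Sequence


def top_k_accuracy(probabilities: Sequence[Sequence[float]],
                   true_labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scored classes."""
    if len(probabilities) != len(true_labels):
        raise ValueError("probabilities and true_labels must have equal length")
    if not probabilities:
        raise ValueError("at least one sample is required")
    hits = 0
    for scores, label in zip(probabilities, true_labels):
        # Rank class indices by descending score; ties keep the lower index first.
        ranked = sorted(range(len(scores)), key=lambda c: -scores[c])
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Toy data (hypothetical, not from the study): 4 patients, 6 disease classes.
toy_probabilities = [
    [0.50, 0.20, 0.10, 0.10, 0.05, 0.05],  # true class 0: ranked first
    [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],  # true class 0: ranked second
    [0.05, 0.05, 0.10, 0.10, 0.20, 0.50],  # true class 5: ranked first
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # true class 5: ranked sixth
]
toy_labels = [0, 0, 5, 5]

top1 = top_k_accuracy(toy_probabilities, toy_labels, k=1)
top5 = top_k_accuracy(toy_probabilities, toy_labels, k=5)
print(f"top-1 accuracy: {top1:.2f}")  # prints "top-1 accuracy: 0.50"
print(f"top-5 accuracy: {top5:.2f}")  # prints "top-5 accuracy: 0.75"
```

In practice the per-class scores would come from a trained classifier's predicted class probabilities; here they are written by hand so that the gap between the two metrics is easy to verify.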