Routine blood biomarkers for the detection of multiple myeloma using machine learning

Abstract

Introduction

Primary laboratory tests performed in the diagnosis of multiple myeloma (MM) include bone marrow examination and the free light chain assay; however, these may be ordered only after disease is clinically suspected. In contrast, routine blood test results are readily available.

Methods

Machine learning (ML) algorithms were applied to routine blood test results to detect MM. Feature selection was performed to improve classification performance. The robustness of the classification models was assessed on an internal and an external validation data set. To minimize the divergence between data sets, the training and validation data sets were combined, and the combined data set was used to assess the performance of the ML algorithms.
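The pooling step described above can be illustrated with a minimal sketch. It assumes each record is an arbitrary Python object and that the hold-out test set is drawn by a seeded random shuffle. The function name `pool_and_split`, the 20% test fraction, and the toy records are illustrative assumptions, not details reported by the study.

```python
import random


def pool_and_split(training_rows, validation_rows, test_fraction=0.2, seed=0):
    """Pool two data sets and draw a random hold-out test set.

    test_fraction and seed are illustrative assumptions; the study does not
    report the split ratio it used.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    # Combine the two sources into one data set.
    pooled = list(training_rows) + list(validation_rows)
    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(pooled)
    n_test = max(1, round(len(pooled) * test_fraction))
    # Return (remaining rows for model building, hold-out test rows).
    return pooled[n_test:], pooled[:n_test]


# Toy records (hypothetical): (source label, record index).
training_rows = [("train", i) for i in range(8)]
validation_rows = [("valid", i) for i in range(2)]
train_part, test_part = pool_and_split(training_rows, validation_rows)
```

Because the test set is drawn from the pooled records, it contains samples from both sources, which is why such a split is less sensitive to divergence between the original data sets.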

Results

The AdaBoost-DecisionTable algorithm produced the best performance in the training data set using 10-fold cross-validation (accuracy = 94.75%, sensitivity = 87.70%, positive predictive value (PPV) = 92.50%, F-measure = 90.00%, and area under the receiver operating characteristic curve (AUC) = 97.50%). Performance in the validation data sets was affected by divergence between the data sets, although accuracy remained greater than 85% and AUC greater than 90%. In a test set randomly selected from the combined data set, the ML algorithm achieved an accuracy of 92.61%, an AUC of 96.80%, a sensitivity of 85.20%, a PPV of 88.50%, and an F-measure of 86.80%.
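For readers less familiar with these metrics, the sketch below shows how accuracy, sensitivity, PPV, and F-measure follow from confusion-matrix counts, using the standard definitions. The counts are hypothetical and the function names are illustrative; the study's actual confusion matrices are not given here. The final lines check only that the reported F-measures agree with the harmonic mean of the reported PPV and sensitivity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (at least one positive case and
    at least one positive prediction).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "ppv": ppv,
        "f_measure": f_measure,
    }


def f_measure_from(ppv, sensitivity):
    """F-measure as the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts, for illustration only (not the study's data).
metrics = classification_metrics(tp=80, fp=10, tn=100, fn=10)

# Consistency check against the values reported in the Results.
f_train = f_measure_from(ppv=0.9250, sensitivity=0.8770)  # reported 90.00%
f_test = f_measure_from(ppv=0.8850, sensitivity=0.8520)   # reported 86.80%
```

Both recomputed values round to the reported F-measures, so the reported PPV, sensitivity, and F-measure figures are mutually consistent.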

Conclusions

Combining ML with routine serum biomarkers holds potential benefit for MM diagnosis.