Original research
Machine-learning algorithm to non-invasively detect diabetes and pre-diabetes from electrocardiogram
  1. Anoop R Kulkarni (1,2)
  2. Ashwini A Patel (2)
  3. Kanchan V Pipal (2)
  4. Sujeet G Jaiswal (2)
  5. Manisha T Jaisinghani (2)
  6. Vidya Thulkar (2)
  7. Lumbini Gajbhiye (2)
  8. Preeti Gondane (2)
  9. Archana B Patel (2)
  10. Manju Mamtani (2,3)
  11. Hemant Kulkarni (2,3)

Affiliations:
  1. Innotomy Consulting, Bengaluru, India
  2. Lata Medical Research Foundation, Nagpur, India
  3. M&H Research LLC, San Antonio, Texas, USA

Correspondence to Dr Hemant Kulkarni, Lata Medical Research Foundation, Nagpur, India; hemant.kulkarni{at}mnhresearch.com

Abstract

Objectives Early detection is crucial for the prevention of type 2 diabetes and pre-diabetes. Diagnosis of these conditions relies on the oral glucose tolerance test and haemoglobin A1c estimation, both of which are invasive and challenging to use for large-scale screening. We aimed to combine the non-invasive nature of ECG with the power of machine learning to detect diabetes and pre-diabetes.

Methods Data for this study come from the Diabetes in Sindhi Families in Nagpur study of the ethnically endogamous Sindhi population from central India. The final dataset included clinical data from 1262 individuals and 10 461 time-aligned heartbeats recorded digitally. The dataset was split into a training set, a validation set and an independent test set (8892, 523 and 1046 beats, respectively). The ECG recordings were processed with median filtering, band-pass filtering and standard scaling. Minority oversampling was undertaken to balance the training dataset before training began. Extreme gradient boosting (XGBoost) was used to train a classifier that took the signal-processed ECG as input and predicted membership of the ‘no diabetes’, pre-diabetes or type 2 diabetes classes (defined according to American Diabetes Association criteria).
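The preprocessing and balancing steps named in the Methods can be sketched as follows. This is a minimal illustration and not the authors' code: the sampling rate, filter order, pass band, kernel size and per-beat scaling are all assumptions (the abstract gives none of them), and plain random oversampling stands in for whichever minority oversampling method was actually used. The beat and the class mix are synthetic.

```python
import numpy as np
from scipy.signal import butter, filtfilt, medfilt

FS = 500  # Hz; assumed sampling rate, not stated in the abstract


def preprocess_beat(beat, fs=FS, band=(0.5, 40.0), kernel_size=5):
    """Median filter, then Butterworth band-pass, then z-score one lead of one beat.

    The band edges and kernel size are illustrative defaults, not published values.
    """
    x = medfilt(np.asarray(beat, dtype=float), kernel_size=kernel_size)
    b, a = butter(2, band, btype="bandpass", fs=fs)
    x = filtfilt(b, a, x)  # zero-phase filtering keeps wave timing intact
    return (x - x.mean()) / (x.std() + 1e-12)


def oversample_minority(X, y, seed=0):
    """Resample with replacement until every class matches the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for label, count in zip(classes, counts):
        members = np.flatnonzero(y == label)
        extra = rng.choice(members, size=target - count, replace=True)
        keep.append(np.concatenate([members, extra]))
    idx = np.concatenate(keep)
    return X[idx], y[idx]


# Synthetic beat: an 8 Hz oscillation with a constant offset and mild noise.
rng = np.random.default_rng(0)
t = np.arange(400) / FS
raw_beat = 2.0 + np.sin(2 * np.pi * 8 * t) + 0.05 * rng.standard_normal(t.size)
clean_beat = preprocess_beat(raw_beat)

# Synthetic training labels: 0 = no diabetes, 1 = pre-diabetes, 2 = type 2 diabetes.
X = rng.standard_normal((100, 400))
y = np.array([0] * 56 + [2] * 30 + [1] * 14)
X_bal, y_bal = oversample_minority(X, y)
```

As in the Methods, balancing of this kind belongs on the training set only; the validation and test sets keep their natural class mix so that reported performance reflects real prevalence. The balanced arrays would then be passed to a gradient-boosted classifier.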

Results The prevalence of type 2 diabetes and pre-diabetes was ~30% and ~14%, respectively. Training was smooth and quick (convergence achieved within 40 epochs). In the independent test set, the DiaBeats algorithm predicted the classes with 97.1% precision, 96.2% recall, 96.8% accuracy and 96.6% F1 score. The calibrated model had a low calibration error (0.06). The feature importance maps indicated that leads III, augmented Vector Left (aVL), V4, V5 and V6 contributed most to the classification performance. The predictions matched the clinical expectations based on the biological mechanisms of cardiac involvement in diabetes.
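For readers less familiar with the metrics reported above, the sketch below shows one common way to compute them for a three-class problem. It rests on assumptions: the abstract does not say how precision, recall and F1 were averaged across classes (macro-averaging is used here), nor which calibration error was reported (expected calibration error with equal-width bins is used here). The labels and confidences are toy values, not study data.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def ratio(num, den):
        return num / den if den else 0.0

    precision = [ratio(tp[c], tp[c] + fp[c]) for c in labels]
    recall = [ratio(tp[c], tp[c] + fn[c]) for c in labels]
    f1 = [ratio(2 * p * r, p + r) for p, r in zip(precision, recall)]
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precision) / n,
        "recall": sum(recall) / n,
        "f1": sum(f1) / n,
    }


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(ok for _, ok in members) / len(members)
            ece += len(members) / total * abs(mean_conf - accuracy)
    return ece


# Toy labels: 0 = no diabetes, 1 = pre-diabetes, 2 = type 2 diabetes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
scores = macro_metrics(y_true, y_pred)

# Toy top-class confidences and whether each prediction was correct.
ece = expected_calibration_error(
    [0.95, 0.95, 0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0, 1, 0]
)
```

A calibration error near zero means that predicted probabilities can be read roughly as observed frequencies, which matters if the scores are to guide screening decisions rather than just class labels.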

Conclusions The machine-learning-based DiaBeats algorithm, using ECG signal data, accurately predicted diabetes-related classes. After robust validation in external datasets, this algorithm could help in the early detection of diabetes and pre-diabetes.

  • diabetes mellitus
  • early diagnosis
  • primary healthcare

Data availability statement

Data are available on reasonable request. The data are confidential and not publicly available. The codes and notebooks are available on reasonable request to the authors.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Footnotes

  • Twitter @DrAnoopKulkarni, @dr_apatel

  • Contributors HK, ARK and MM conceptualised the study. AAP, KVP, SGJ and MTJ conducted the field study. VT, LG and PG recorded all the ECGs. HK and ARK conducted all the analyses. HK and MM wrote the initial draft of the manuscript. All the authors critically reviewed the manuscript and approved the final version. HK is the guarantor.

  • Funding This study was funded by Lata Medical Research Foundation’s internal funding mechanism.

  • Competing interests None declared.

  • Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.