EVALUATION OF MULTIPLE MACHINE LEARNING AND DATA SAMPLER INSTRUMENTS APPLIED TO COMPLEX ENTAMOEBA HISTOLYTICA/DISPAR/MOSHKOVSKII INFECTION EPIDEMIOLOGY STUDY DATA IN THE SMALL AND OUTERMOST ISLAND COMMUNITIES OF INDONESIA

Authors

  • Junaidi Junaidi, Medical Laboratory Technology Study Program, Aisyiyah Polytechnic, Pontianak
  • Syibbran Mulaesyi, Department of Informatics Engineering, Faculty of Engineering, Malikussaleh University, Lhokseumawe

Keywords:

Amebiasis, Entamoeba histolytica, Machine Learning, Risk Factors, Weh Island

Abstract

Amebiasis, caused by the parasitic protozoan Entamoeba histolytica, remains a global health issue that requires comprehensive epidemiological data. This study aimed to estimate prevalence, analyze risk factors, and identify the best-performing machine learning (ML) model for predicting Entamoeba histolytica/dispar/moshkovskii complex infections in the Weh Island community. The epidemiological study applied four ML models and four data sampling methods, and model performance was evaluated using standard metrics (AUROC, AUPRC, F1 score, and accuracy). The results showed that the incidence of Entamoeba complex infection fell in the high category. The DecisionTreeClassifier model combined with the TomekLinks sampling method yielded the best predictive performance. In conclusion, amebiasis remains common in Indonesia. Hand-washing behavior and the source and adequacy of clean water were correlated with infection incidence, although two observed negative correlations warrant further investigation. Because morphologically identical non-pathogenic amoebae could not be differentiated in this study, molecular-based identification methods are urgently needed.
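
The sampling step named in the abstract can be illustrated with a small sketch. The name TomekLinks matches an undersampling class in the imbalanced-learn library, but whether the study used that library, and with which settings, is an assumption. The pure-Python code below is not the authors' implementation. It only shows the general idea of a Tomek link: two samples of different classes that are each other's nearest neighbor, where the majority-class member is removed to clean the class boundary. The function names, toy coordinates, and labels (0 = uninfected, 1 = infected) are hypothetical.

```python
import math


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest
    neighbours and carry different class labels."""
    def nearest(i):
        # Index of the closest other point by Euclidean distance.
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_majority_in_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair
            if y[k] == majority_label}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Hypothetical toy data: five majority samples (0), one minority sample (1).
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0), (5.0, 5.0)]
y = [0, 0, 0, 1, 0, 0]

links = tomek_links(X, y)
X_res, y_res = remove_majority_in_links(X, y, majority_label=0)
print(links)       # index pairs forming Tomek links
print(len(X_res))  # number of samples kept after cleaning
```

In this toy set only the pair at indices 2 and 3 forms a Tomek link, so the majority sample at index 2 is removed and the minority sample is kept. A classifier such as a decision tree would then be trained on the cleaned set and scored with the metrics listed in the abstract.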

Published

2025-12-10