Mapping comorbidity patterns in high needs paediatric patients: A machine learning approach

Date

2025-10-08

Abstract

Background: Among children requiring hospital visits, high-cost, high-need (HCHN) children make up a small proportion of the paediatric population but account for a disproportionately large share of its total healthcare costs. Understanding comorbidity patterns within this subpopulation is crucial for improving care coordination. Unsupervised learning is increasingly used to identify patterns in complex healthcare data, offering new possibilities for patient stratification.
Objective: This thesis applied unsupervised machine learning techniques to administrative data from Västra Götalandsregionen (VGR) to identify distinct subgroups among high-cost paediatric patients and to characterise their comorbidity patterns.
Methods: The analysis used hospital visit data for children aged 0-17 years in VGR. HCHN patients were defined as the costliest 5% of patients, who accounted for 50% of total costs. Three clustering methods were used to extract distinct patient groups based on diagnosis patterns: KMedoids, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), and agglomerative clustering.
Results: The analysis revealed several distinct clusters, including: (1) complex neonates, likely with prolonged healthcare dependency, who remain costly throughout childhood; (2) neonates who are initially high-cost due to early-term birth conditions but require minimal subsequent intervention; (3) teenage girls with mental health conditions that could lead to elevated self-harm rates; and (4) a complex-ailment subgroup that requires a multidisciplinary care team for optimal care.
Conclusions: This study demonstrates the utility of unsupervised machine learning in identifying clinically meaningful subgroups within HCHN paediatric populations. The findings support the need to dismantle siloed treatment strategies through multidisciplinary care teams tailored to specific clusters. Identifying distinct clusters also provides a foundation for targeted interventions and resource allocation strategies.
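As a rough illustration of the kind of pipeline the Methods describe, the sketch below selects the costliest 5% of patients and groups them by diagnosis overlap with a small k-medoids routine. It is not the thesis code: the function names, the Jaccard distance on diagnosis sets, the deterministic farthest-first initialisation, and the ICD-10-style codes in the usage note are all assumptions made for illustration, and the thesis may have used different features, distances, and library implementations.

```python
def top_cost_cohort(costs, fraction=0.05):
    """Return the ids of the costliest `fraction` of patients and their cost share.

    `costs` maps a patient id to total cost (hypothetical input format).
    """
    ranked = sorted(costs, key=costs.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    cohort = ranked[:n_top]
    share = sum(costs[p] for p in cohort) / sum(costs.values())
    return cohort, share


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of diagnosis codes (assumed metric)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def k_medoids(items, k, distance, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    A simplified variant (assign, then re-pick each cluster's medoid),
    not the full PAM swap procedure.
    """
    n = len(items)
    dist = [[distance(a, b) for b in items] for a in items]

    # Deterministic start: the most central item, then farthest-first.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(
            max(range(n), key=lambda i: min(dist[i][m] for m in medoids))
        )

    def assign(meds):
        # Each item joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    labels = assign(medoids)
    for _ in range(max_iter):
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                updated.append(
                    min(members, key=lambda m: sum(dist[m][j] for j in members))
                )
            else:
                updated.append(medoids[c])  # keep the medoid of an empty cluster
        if updated == medoids:
            break
        medoids = updated
        labels = assign(medoids)
    return medoids, labels
```

A possible usage, on synthetic data only: call `top_cost_cohort` on a dictionary of per-patient costs, build one set of diagnosis codes per selected patient (for example `{"P07", "P22"}` or `{"F32", "F41"}`), and pass the list of sets to `k_medoids(items, k, jaccard_distance)`. A medoid-based method is convenient here because each cluster is represented by an actual patient profile, which can be inspected directly when characterising comorbidity patterns.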

Keywords

Unsupervised machine learning, data science, healthcare, comorbidity, data analysis, KMedoids, HDBSCAN
