Prediction of refractive error in adolescents using a multimodal large language model

Source: Frontiers in Medicine

Original: https://www.frontiersin.org/articles/10.3389/fmed.2026.1770007...

Published: 2026-03-10T00:00:00Z

The study developed a multimodal large language model to predict refractive error, measured in diopters, in adolescents aged 2 to 18 years from fundus images combined with clinical and demographic data. The model, based on Qwen2.5-VL, was trained on 16,226 annotated records to predict spherical equivalent (SE). Training was supervised and the data were randomized, with performance evaluated using mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²) and Pearson's correlation coefficient. The model achieved an MAE of 0.647 diopters and showed strong predictive performance in most refractive subgroups. The multimodal model significantly outperformed unimodal models. Visualization analyses supported the clinical interpretability and reliability of the predictions. The authors suggest the model can serve as a non-invasive tool for early myopia screening in both clinical and remote settings.
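
For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they are conventionally computed for a regression target such as SE. It is not the study's evaluation code. The function names `spherical_equivalent` and `regression_metrics` and the sample values are hypothetical; the SE formula used (sphere plus half the cylinder) is the standard clinical definition, which the summary does not spell out.

```python
import numpy as np


def spherical_equivalent(sphere, cylinder):
    """Standard clinical definition: SE = sphere + cylinder / 2 (diopters)."""
    return sphere + cylinder / 2.0


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and Pearson's r for predicted vs. measured SE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mae = float(np.mean(np.abs(err)))            # mean absolute error
    rmse = float(np.sqrt(np.mean(err ** 2)))     # root mean square error

    ss_res = float(np.sum(err ** 2))                          # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))     # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                   # coefficient of determination

    pearson = float(np.corrcoef(y_true, y_pred)[0, 1])  # linear correlation

    return {"mae": mae, "rmse": rmse, "r2": r2, "pearson": pearson}


if __name__ == "__main__":
    # Made-up illustrative values in diopters, not data from the study.
    measured = [-1.0, -2.0, -3.0, 0.0]
    predicted = [-1.5, -2.0, -2.5, 0.5]
    print(regression_metrics(measured, predicted))
```

MAE and RMSE are both in diopters, so the reported MAE of 0.647 can be read directly as the average size of the prediction error. R² and Pearson's r are unitless and describe how well predictions track the measured values across the cohort.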