Integrative Machine Learning of Genetic and Lifestyle Factors for Personalized Skin Health
Document Type
Article
Source of Publication
IEEE Journal of Translational Engineering in Health and Medicine
Publication Date
1-1-2026
Abstract
Objective: To develop an AI framework that combines genetic, phenotypic, and lifestyle data to profile skin-health patterns and generate hypothesis-supporting summaries for potential decision support. Methods and procedures: A dataset of 5,254 individuals integrates six genes (FLG, AQP3, MMP-1, MMP-3, SOD2, GPX), six phenotype severities, and 20+ lifestyle factors. Mutation burden and interactions are tested by ANOVA. K-modes clustering identifies four interpretable dermatological profiles within the cohort and is embedded in leakage-free nested cross-validation (selection performed on training data only; test labels assigned from the training centroids). Subtypes are predicted from genetics plus lifestyle using an XGBoost (XGB) classifier; explainability uses gain, permutation importance, and SHAP contributions aggregated across outer folds. Results: Four subtypes are identified. Mutation burden differs across phenotypes (ANOVA, p < 0.05). Interactions are observed for AQP3×Winter→Dryness, GPX×Medication→Pigmentation, and MMP-3×City Living→Redness. Nested-CV prediction achieves an accuracy of 0.9789 ± 0.0083, a macro-F1 of 0.9711 ± 0.0126, and a macro-recall of 0.9697 ± 0.0091. The multimodal model outperforms unimodal baselines and generalizes better across all folds. Drivers are stable across folds and include scrub usage, stress, sleep, low water intake, menopause, and camouflage habits, alongside oxidative-stress and MMP genes. Conclusion: Integrating genomic susceptibility with modifiable exposures enables robust, interpretable skin-profile prediction and highlights actionable targets for stratified counseling beyond genetic predisposition.
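The leakage-free labelling step described in the abstract (clusters fitted on training data only, test labels taken from the training centroids) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified pure-Python k-modes with Hamming distance, a fixed number of clusters, and a single level of cross-validation, and every function name here (`hamming`, `column_modes`, `assign`, `fit_kmodes`, `leakage_free_folds`) is hypothetical. The inner selection loop and the XGBoost classifier are omitted.

```python
from collections import Counter
import random


def hamming(a, b):
    """Count the positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Return the most frequent category of each column as a tuple."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def assign(rows, centroids):
    """Label each record with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: hamming(r, centroids[j]))
        for r in rows
    ]


def fit_kmodes(train_rows, k, n_iter=20, seed=0):
    """Simplified k-modes fitted on training records only (assumed variant)."""
    rng = random.Random(seed)
    # Initialise centroids from distinct training records.
    centroids = rng.sample(sorted(set(train_rows)), k)
    for _ in range(n_iter):
        labels = assign(train_rows, centroids)
        updated = []
        for j in range(k):
            members = [r for r, lab in zip(train_rows, labels) if lab == j]
            # Keep the old centroid if a cluster becomes empty.
            updated.append(column_modes(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids


def leakage_free_folds(rows, k, n_folds=5, seed=0):
    """Yield (train_idx, y_train, test_idx, y_test) for each outer fold.

    Centroids are learned from the training fold alone; held-out records
    only receive labels from those centroids and never influence them.
    """
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    for f in range(n_folds):
        test_idx = idx[f::n_folds]
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_rows = [rows[i] for i in train_idx]
        centroids = fit_kmodes(train_rows, k, seed=seed)
        y_train = assign(train_rows, centroids)
        y_test = assign([rows[i] for i in test_idx], centroids)
        yield train_idx, y_train, test_idx, y_test
```

Records are expected as tuples of category strings, one tuple per individual. In a fuller pipeline, a classifier would be trained on `y_train` and scored against `y_test` within each fold, so the held-out records play no part in defining the subtypes they are evaluated on.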
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Volume
14
First Page
164
Last Page
178
Disciplines
Computer Sciences | Medicine and Health Sciences
Keywords
Dermatogenomics, gene-environment interactions, k-modes clustering, leakage-free evaluation, model interpretability, multimodal data integration, nested cross-validation, permutation importance, SHAP, XGBoost
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Benachour, Yassine; Maloukh, Lina; and Geusens, Barbara, "Integrative Machine Learning of Genetic and Lifestyle Factors for Personalized Skin Health" (2026). All Works. 7978.
https://zuscholars.zu.ac.ae/works/7978
Indexed in Scopus
yes
Open Access
yes
Open Access Type
Gold: This publication is openly available in an open access journal/series