Patients with psoriasis face a significantly elevated risk of cardiovascular disease (CVD), which calls for early and accurate risk prediction tools. This study developed and validated a machine learning model to predict CVD risk in psoriasis patients using clinical and biochemical data from 2685 individuals. After preprocessing and addressing class imbalance with SMOTE-NC, six machine learning models (Logistic Regression as the baseline, XGBoost, LightGBM, CatBoost, GradientBoosting, and AdaBoost) were evaluated in a nested cross-validation framework designed to prevent data leakage (outer k = 10, inner k = 3), with randomized hyperparameter search (n_iter = 50). Feature selection with the Boruta algorithm was performed separately within each training fold, so that no information from held-out data influenced the selected features. Boruta identified 21 key predictors, including age, systolic blood pressure (SBP), apolipoprotein B (apoB), fasting blood glucose (FBG), and complement C1q. CatBoost emerged as the top-performing model (out-of-fold [OOF] ROC-AUC = 0.908, 95% CI [0.892-0.924]; PR-AUC = 0.509, 95% CI [0.448-0.578]; F1 = 0.540; MCC = 0.498; Brier = 0.078). The Logistic Regression baseline achieved a comparable ROC-AUC of 0.909 but was excluded because of poor calibration (Brier = 0.114, above the 0.10 threshold). All metrics were evaluated with 95% bootstrap confidence intervals (
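
The calibration criterion and the bootstrap intervals mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library and toy data. The function names (`brier_score`, `bootstrap_ci`, `passes_calibration`), the percentile method, the number of resamples, and the seed are all assumptions made for illustration; the abstract does not state which bootstrap variant or settings the authors used. Only the 0.10 Brier threshold is taken from the text.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed variant): resample cases with
    replacement, recompute the metric, and take the empirical quantiles."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_prob[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def passes_calibration(brier, threshold=0.10):
    """Exclusion rule from the abstract: a model with Brier > 0.10 is eliminated."""
    return brier <= threshold


# Toy out-of-fold predictions (hypothetical values, not study data).
y_true = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.05, 0.10, 0.20, 0.80, 0.15, 0.60, 0.05, 0.30, 0.70, 0.10, 0.25, 0.05]

point = brier_score(y_true, y_prob)
ci_low, ci_high = bootstrap_ci(y_true, y_prob, brier_score)
print(f"Brier = {point:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
print("passes calibration threshold:", passes_calibration(point))
```

The same `bootstrap_ci` helper accepts any metric that takes true labels and predicted probabilities, so one resampling routine could in principle cover several of the reported metrics.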