Development of a Nomogram-Based Tool to Predict Neurocognitive Impairment Among HIV-positive Charter Participants

Among 1,307 participants, 21.6% were neurocognitively impaired. In the MB analysis, age provided the highest amount of mutual information (0.0333). Logistic regression also showed that older age (>50 vs. ≤50 years) had the strongest association with NCI (OR=2.77, 95% CI=1.99-3.85). The highest possible score on the nomogram was 626 points, corresponding to a nomogram-predicted probability of NCI of approximately 0.95. The receiver operating characteristic (ROC) curve's concordance index was 0.75, and the nomogram's calibration plot exhibited excellent agreement between observed and predicted probabilities.


INTRODUCTION
The development of highly active combination antiretroviral therapy (cART) has resulted in a remarkable decline in HIV-associated morbidity and mortality [1]. Nevertheless, neurocognitive impairment (NCI) persists among people living with HIV [4,5]. Even in milder forms, NCI is disabling and constitutes a public health issue due to its adverse effect on everyday functioning. NCI has been found to be associated with poor medication management [6], low self-efficacy for healthcare interactions [7], unemployment [8], poor health-related quality of life [9], and higher mortality [10].
The pathogenesis of NCI in people living with HIV (PLWH) is multifaceted, involving direct viral replication, chronic inflammation, treatment-related adverse effects, comorbidities, and aging, and is not well understood [11]. A number of risk factors for NCI have been studied in the HIV-infected population, but the studies have generated inconsistent results. While some studies reported older age, female gender, Hispanic ethnicity, substance use, comorbidities (depression, hepatitis C co-infection, metabolic disorders, and anemia), high viral load, and low CD4 T-cell counts to be positively associated with NCI in PLWH [12-21], others reported contradictory results [22-28]. These inconsistencies may be due to methodological differences, including differences in study population and eligibility criteria, small sample sizes, differences in testing modalities, and a variable degree of control for confounding factors. There is still a need for a comprehensive assessment of etiological factors to develop an optimal prediction model for NCI in PLWH.
Neuropsychological testing remains the "gold standard" for NCI diagnosis; however, it is time-consuming, costly, and requires interpretation by a neuropsychologist. Thus, a brief and readily available screening tool is desirable in clinical practice. The current screening tools, such as the CogState battery, the revised HIV dementia scale, and the Montreal Cognitive Assessment, are either robust only in detecting more severe forms of impairment (e.g., HIV-associated dementia) [29] or do not have good screening accuracy for HIV-associated neurocognitive disorders (HAND) [30]. Furthermore, these screening tools require neurocognitive testing for implementation. Given that demographic and clinical data can be readily collected, they may be valuable in developing user-friendly nomograms to predict NCI in PLWH. Nomograms are pictorial representations of complex statistical models that estimate individualized risk probabilities of particular events based on patient and disease characteristics. Nomograms have been widely studied and used in cancer research [31]. However, to our knowledge, they have not been utilized for neuropsychological assessments. Nomograms could potentially serve as adjunct procedures when determining whether a patient may require further neuropsychological assessment. Our study's first objective was to identify factors associated with NCI using a large and diverse sample of PLWH. The second objective was to build a user-friendly predictive tool (nomogram) based on demographic, clinical, and behavioral variables to identify PLWH at risk of NCI.

Data Source and Participants
The CNS HIV Antiretroviral Therapy Effects Research (CHARTER) study baseline database was used to examine risk factors associated with NCI in HIV patients and to develop the nomogram. The CHARTER study was a prospective observational study conducted with the primary aim to determine how central and

Variables
The primary outcome was neurocognitive impairment (NCI). NCI was ascertained at enrollment through a comprehensive neurocognitive test battery covering seven domains: abstraction/executive functioning, speed of information processing, attention and working memory, learning, memory, verbal fluency, and motor functioning. Individual raw test scores were converted to demographically corrected standard scores (T-scores), which were then averaged to generate the global T-score. The best available normative standards were used to correct for the effects of age, education, sex, and ethnicity, as appropriate. Details of the normative standards are given elsewhere [34]. An impaired neurocognitive status was assigned to those with a global T-score of <40 [35-37]. We did not further classify NCI into mild, moderate, or severe because this categorization is partially based on activities of daily living (ADLs), and ADLs were already included as predictors in the model.
Based on prior literature and biological plausibility, the independent variables included in the analyses were demographic factors (age, education, gender, race, ethnicity, and employment), HIV-related factors (disease severity, duration of HIV infection, antiretroviral (ARV) drug use, CD4 nadir, current CD4, and plasma viral loads), activities of daily living (eating, bathing, dressing, using the toilet, and taking medication), comorbidities (depressive symptoms assessed through the Beck Depression Inventory-II (BDI-II), anemia, syphilis, history of head injury, history of coma, history of seizures, diabetes, hypertension using ICD-9 codes, hyperlipidemia, hypercholesterolemia, viral hepatitis, and any AIDS-defining comorbidity), laboratory measures (urine proteins, hepatitis C viral loads, serum glucose, blood urea nitrogen, serum creatinine, serum sodium, serum chloride, serum potassium, serum calcium, serum total protein, serum bilirubin, serum aspartate aminotransferase (AST), serum alanine aminotransferase (ALT), serum hemoglobin, and blood monocyte percent), medication history (antidiabetics, lipid-lowering drugs, and psychotropic medication), and substance use (history of alcohol, opiate, hallucinogen, inhalant, sedative, methamphetamine, cannabis, and cocaine use).
Among the activities of daily living, the selected variables (eating, bathing, dressing, using the toilet, and taking medication) were included because they were universally applicable to all participants. The severity of HIV infection was measured using the 1993 Centers for Disease Control and Prevention (CDC) classification system. The variable "any AIDS-defining comorbidity" was categorized as "yes" if the participant had a diagnosis of any of the following: Cryptococcus (extrapulmonary), cytomegalovirus disease (other than liver, spleen, or nodes), cytomegalovirus retinitis, HIV-related encephalopathy, herpes simplex, disseminated histoplasmosis, Kaposi sarcoma, Burkitt's lymphoma, disseminated mycobacterium avium complex, mycobacterium tuberculosis at any site, pneumocystis carinii pneumonia, recurrent pneumonia, or progressive multifocal leukoencephalopathy. No exclusions were made based on any comorbidity, so that the nomogram could detect any neurocognitive impairment (HIV-related and others).

Statistical Analyses
Descriptive statistics were performed for categorical (frequencies and percentages) and numeric variables (means, medians, and standard deviations) to assess the sample's overall demographic and clinical characteristics. All continuous clinical variables (duration of HIV infection, CD4 nadir, current CD4, plasma viral loads, depressive symptoms assessed through BDI-II, urine proteins, hepatitis C viral loads, serum glucose, blood urea nitrogen, serum creatinine, serum sodium, serum chloride, serum potassium, serum calcium, serum total protein, serum bilirubin, serum AST, serum ALT, serum hemoglobin and blood monocyte percent) were converted to meaningful categorical variables using standard clinical cutoffs taken from MedlinePlus [38] (Table S1). A cutoff value of 50 (≤50 years or >50 years) was used to categorize age as a binary variable as nearly half of PLWH in the US are aged 50 years or older [39]. The duration of HIV infection was categorized as ≤15 years or >15 years based on the median. Chi-squared tests were performed to determine the outcome variable's association with individual predictors. A Bayesian network analysis using a supervised learning technique with the Markov Blanket (MB) algorithm was employed for predictive modeling. Bayesian networks are non-parametric probabilistic models that identify predictors qualitatively through a graphical diagram with nodes (representing variables) and edges (arrows representing relationships) [40]. Bayesian network analyses do not require conventional statistical assumptions, and they can handle a large number of predictors [40,41]. Furthermore, compared to standard regression models in which the correlation between the variables leads to multicollinearity and lack of robustness of model fitting, Bayesian networks leverage the mutual correlation between variables to define the conditional probability distributions [42]. 
The MB is the smallest subset of the Bayesian network characterized by the property that all variables outside the MB could be deleted without influence on the target node and thus will have no impact on the accuracy of classification [43]. Mutual information was generated between the nodes (independent variables) and the target node (outcome variable).
K-fold (10-fold) cross-validation was used to evaluate network performance. Given the broad age range, a sub-analysis using Bayesian Network was conducted and an MB model was generated separately for participants less than 50 years old and 50 years or older to investigate age differences.
With NCI as the (binary) outcome variable, multivariable logistic regression analysis was conducted to obtain the adjusted regression coefficients for the independent variables identified through the Bayesian network analysis. All nine variables identified by the Bayesian network analyses were included in the multivariable logistic model. To these, we added one demographic variable (race/ethnicity) that had dropped out in the network analysis but was associated with the outcome at 2-sided α=0.1 in univariable logistic regression. A nomogram was built using techniques described by Iasonos et al. and Brittain E. Briefly, the regression coefficients from the multivariable logistic regression were converted into scores ranging from 0 to 100 points. The variable with the largest absolute regression coefficient was assigned 100 points. The remaining variables were assigned a smaller number of points proportional to the value of their regression coefficient. The points from all the variables were added together, and the total score was then converted into the corresponding probability of having NCI. The nomogram's predictive accuracy (discrimination) was measured via the concordance index (c-index). The nomogram's calibration was assessed by reviewing the plot of predicted probabilities from the nomogram versus the observed probabilities. The analyses were performed using SAS 9.4 (SAS Institute, Inc., Cary, NC) and R version 3.5.0 (rms package).
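The scoring step described above can be illustrated with a short Python sketch. It is not the SAS/R code used in the study. Only the age coefficient (β=1.02) and the cocaine-use odds ratio (OR=0.46) are taken from this article; the employment coefficient, the intercept, and all function names are hypothetical placeholders. The convention that protective factors earn points when absent is an assumption made here so that all points stay non-negative.

```python
import math

# Adjusted log-odds coefficients for binary predictors (1 = present).
# Age (beta = 1.02) and cocaine use (OR = 0.46) come from the text; the
# employment coefficient and the intercept are HYPOTHETICAL placeholders.
COEFS = {
    "age_over_50": 1.02,
    "lifetime_cocaine_use": math.log(0.46),
    "employed": -0.50,  # hypothetical
}
INTERCEPT = -1.60  # hypothetical


def points_per_variable(coefs):
    """Give 100 points to the largest |beta|; scale the rest proportionally."""
    max_abs = max(abs(b) for b in coefs.values())
    return {name: 100.0 * abs(b) / max_abs for name, b in coefs.items()}


def total_points(profile, coefs):
    """Sum the points for one participant profile (dict of 0/1 values)."""
    pts = points_per_variable(coefs)
    total = 0.0
    for name, beta in coefs.items():
        # Risk factors score when present; protective factors score when
        # absent (assumed convention that keeps every row non-negative).
        scores = profile[name] == 1 if beta > 0 else profile[name] == 0
        if scores:
            total += pts[name]
    return total


def probability_from_points(total, coefs, intercept):
    """Map a total score back to a predicted probability of NCI."""
    max_abs = max(abs(b) for b in coefs.values())
    # Linear predictor of the lowest-risk profile, which scores zero points.
    baseline = intercept + sum(b for b in coefs.values() if b < 0)
    linear_predictor = baseline + total * max_abs / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


profile = {"age_over_50": 1, "lifetime_cocaine_use": 0, "employed": 0}
score = total_points(profile, COEFS)
print(round(score, 1), round(probability_from_points(score, COEFS, INTERCEPT), 3))
```

Because the points are a linear rescaling of the coefficients, the probability recovered from the total score equals the probability obtained directly from the logistic model for the same profile.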

RESULTS
A total of 1,610 participants were enrolled in the CHARTER study; of these, 1,595 participants (99.1%) had complete baseline data on the outcome variable. After excluding 288 participants (18.0%) with missing information on covariates of interest, the final analytic sample included 1,307 participants. Most participants were under 50 years of age (81.6%), male (77.4%), and non-Hispanic African American (47.4%) (Table S1). At baseline, 21.6% of participants had global T-scores under 40 (impaired), and 65.1% were using highly active antiretroviral therapy (HAART). Furthermore, 41.2% of those older than 50 years were neurocognitively impaired, compared to 17.2% of those at or below 50 years of age (Table S1).
The Bayesian network analysis identified that neurocognitive impairment has a direct probabilistic relationship, in descending order of mutual information, with age, lifetime cocaine use, current employment status, difficulty in bathing, dressing, eating, or using the toilet, impaired use of hands, diagnosis of abnormally high cholesterol, current psychotropic medication use, presence of any AIDS-defining illness and lifetime history of stroke (Fig. 1). The number at the top (within each box) in Fig. (1) is the mutual information (predictive importance) between the outcome (neurocognitive impairment shown in the center) and each covariate. The middle number is the relative mutual information with regard to the child node (i.e., the amount of uncertainty reduced regarding the parent node by knowing the child node). The bottom number is the relative mutual information with regard to the parent node (i.e., the amount of uncertainty reduced regarding the child node by knowing the parent node). The arrows' direction is dependent on computational complexity and does not reflect causality.
Among the variables included in the Markov blanket of the target variable (neurocognitive status), age provided the highest amount of mutual information (0.0333) and lifetime history of stroke the least (0.0088). Specifically, by knowing age, the uncertainty regarding neurocognitive status was reduced by 4.4% on average (Fig. 1). The K-fold cross-validation yielded an overall accuracy of 78%. The MB model generated for the younger participants (<50 years) showed that, except for history of stroke, all other predictors remained the same (Fig.  S1) as the overall model, whereas the MB model for older participants (≥50 years) included the history of stroke but not four other variables (Fig. S2).
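The relationship between the mutual information for age and the reported reduction in uncertainty can be checked approximately from the proportions given above. The sketch below is illustrative only: it assumes the relative mutual information is the mutual information divided by the entropy of the outcome, uses base-2 logarithms, and rebuilds the age-by-NCI table from the rounded percentages in the text (81.6% aged 50 or under, with 41.2% and 17.2% impaired in the older and younger groups). The function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """Mutual information in bits from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi


# Joint distribution rebuilt from the rounded percentages in the text.
p_young = 0.816
p_old = 1.0 - p_young
joint = [
    [p_old * 0.412, p_old * (1.0 - 0.412)],      # >50 years: impaired, unimpaired
    [p_young * 0.172, p_young * (1.0 - 0.172)],  # <=50 years: impaired, unimpaired
]

mi = mutual_information(joint)
p_impaired = joint[0][0] + joint[1][0]
relative_mi = mi / entropy([p_impaired, 1.0 - p_impaired])
print(f"MI = {mi:.4f} bits, relative MI = {relative_mi:.1%}")
```

Under these assumptions the result lands close to the reported 0.0333 and 4.4%; small differences are expected because the inputs are rounded percentages.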
Results of the multivariable logistic regression indicated that age had the strongest association with NCI. Specifically, the adjusted odds of NCI among those above 50 years of age were 2.77 times (95% CI=1.99-3.85) the odds among those at or below 50 years of age (Table 1). History of high cholesterol, current psychotropic drug use, history of stroke, history of any AIDS-defining illness, current difficulty eating, dressing, bathing, or using the toilet, and impaired use of hands were also positively associated with NCI (Table 1). Those employed and those who reported a history of cocaine use had lower odds of NCI (Table 1). Since age had the highest adjusted regression coefficient (β=1.02; equivalent to OR=2.77 after exponentiation), it was assigned 100 points in the nomogram (Fig. 2).
Table 1 footnotes: 2 Includes diagnosis of Cryptococcus (extrapulmonary), cytomegalovirus disease (other than liver, spleen, or nodes), cytomegalovirus retinitis, HIV-related encephalopathy, herpes simplex, disseminated histoplasmosis, Kaposi sarcoma, Burkitt's lymphoma, disseminated mycobacterium avium complex, mycobacterium tuberculosis at any site, Pneumocystis carinii pneumonia, recurrent pneumonia, and progressive multifocal leukoencephalopathy. 3 HAART = using highly active antiretroviral therapy at baseline visit; Non-HAART = using antiretroviral medication but not a HAART regimen at baseline. *Variables measured at baseline visit. The last three variables are listed in the table for completeness.
The points per variable are added together for each person and then converted into a probability of having NCI. For example, a non-Hispanic white participant aged below 50 years, with no employment, current use of psychotropic drugs, AIDS-defining illness, history of stroke, impaired use of hands, difficulty eating, dressing, bathing, or using the toilet, and positive histories of high cholesterol and lifetime cocaine use will have a total of 146 points, which converts to a probability of approximately 0.13. The highest possible score on the nomogram was 626 points, corresponding to a nomogram-predicted probability of NCI of approximately 0.95. The receiver operating characteristic (ROC) curve's concordance index was 0.75 (Fig. 3a), and there was excellent agreement between observed and predicted probabilities, as shown by the nomogram's calibration plot (Fig. 3b).
Fig. 2. Nomogram for predicting the probability of neurocognitive impairment among HIV-infected participants of the CNS HIV Antiretroviral Therapy Effects Research (CHARTER) study (n=1,307).
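For a binary outcome, the concordance index is the proportion of impaired/unimpaired participant pairs in which the impaired participant receives the higher predicted probability, with ties counted as one half. The sketch below shows that pairwise definition in plain Python. The outcome labels and predicted probabilities are made-up toy values, not CHARTER data, and the function name is hypothetical.

```python
def concordance_index(outcomes, predicted):
    """Pairwise c-index for a binary outcome (equals the area under the ROC curve).

    outcomes  -- list of 0/1 labels (1 = impaired)
    predicted -- list of predicted probabilities, same order as outcomes
    """
    cases = [p for y, p in zip(outcomes, predicted) if y == 1]
    controls = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one impaired and one unimpaired participant")

    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0   # impaired participant ranked higher
            elif p_case == p_control:
                concordant += 0.5   # tie counts as half
    return concordant / (len(cases) * len(controls))


# Toy values for illustration only.
toy_outcomes = [1, 0, 1, 0, 0]
toy_predicted = [0.8, 0.3, 0.4, 0.4, 0.1]
print(round(concordance_index(toy_outcomes, toy_predicted), 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a frame of reference for the reported c-index of 0.75.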

DISCUSSION
Despite the availability of potent antiretroviral medications, mild to moderate neurocognitive impairment persists in PLWH [33]. There is limited agreement in the literature about the risk factors for NCI in PLWH. Our first objective was to use a large sample of HIV-infected patients with NCI status assessed through a comprehensive battery of neuropsychological tests to examine the factors associated with NCI. The search for an optimal screening tool for NCI in PLWH is still ongoing. Traditional neuropsychological test batteries are lengthy, time-consuming, and require trained psychometrists and thus may not be feasible for use in primary care clinics. We hypothesized that the probability of having HIV-related NCI during a clinic visit may be assessed using a predictive tool based on demographic, behavioral, and clinical factors and thus created a nomogram.
The results of both the Bayesian network analysis and the multivariable logistic regression showed age to be the most important predictor of NCI (highest mutual information and highest adjusted odds ratio). As we used corrected (adjusted for age, education, and other variables) T-scores based on standardized normative data, the association specifically reflects changes beyond normal age-related decline in neurocognitive function. The association between age and NCI among PLWH is crucial, as HIV has been found to lead to premature and accelerated aging [47]. However, as we did not use an HIV-negative control group, we cannot comment on the interaction between HIV status and age in association with NCI. Our finding regarding age is in line with a recent US-based study that found older age (>50 years) to be associated with weaker overall cognitive performance among PLWH [48]. Another recent systematic review of HIV-infected adults reported that the odds ratio (OR) of having NCI among older participants compared to their younger counterparts varied between 1.18 and 4.8 [49]. The T-score correction might explain why education dropped out of the model and nomogram during analyses.
Interestingly, cocaine use was associated with lower odds of NCI (OR=0.46, 95% CI=0.33-0.63). As we had combined lifetime cocaine abuse and lifetime cocaine dependence to generate the cocaine use variable, we may have underestimated the findings in active drug users. A possible explanation is that some participants might have used cocaine only in the past, and this may account for the reversal of cocaine use effects. The literature exhibits mixed results regarding the association between NCI and cocaine use among PLWH. Meade et al. did not find any association between current (past three months) cocaine use and a global neurocognitive measure (Global Deficit Scores); however, they did find higher impairment in processing speed and executive functioning among cocaine users compared to non-users [24]. Similarly, Attonito et al. did not find any association between cocaine use and neurocognition among 370 HIV-positive participants living in Florida [50]. Interestingly, another study using CHARTER data combined lifetime and current use of cocaine to define cocaine use variable, and although it did not detect any association between global impairment and cocaine use, they did report a weak association between cocaine use and improved verbal fluency [51].
Although limited research has been done on the topic, the inverse association between employment and NCI found in our study is consistent with the literature. Blackstone et al. demonstrated higher unemployment among those with NCI, and Rabkin et al. found that impairment in executive functioning represented significant employment barriers in HIV-infected men [52,53]. Not only may NCI render PLWH unable to be efficiently employed, but unemployment itself may also reduce neurocognitive ability, as being employed provides cognitive stimulation, facilitating enhanced cognitive functioning [54].
We ascertained difficulty in daily living by combining difficulty in bathing, dressing, eating, or using the toilet and found a positive association with NCI. The finding is not surprising, as an ample body of literature indicates that HIV-associated neurocognitive disorders are a significant risk factor for decline in everyday functioning [6-8, 55]. Our result that Hispanics and non-Hispanic Whites have higher adjusted odds of NCI than non-Hispanic African Americans is generally in contrast with prior reports of higher impairment among African Americans [56-58]. However, prior studies differ from ours regarding the included samples (gender or ART use specific) and NCI measures. Our finding may be sample dependent, as this was a multicenter study with voluntary participation; however, stratification by individual sites showed similar findings across sites (results not shown). Heaton et al., in a longitudinal analysis using CHARTER data, also found higher neurocognitive decline (RR=2.35) among Hispanics compared to non-Hispanics (Whites and African Americans combined). Previous studies have demonstrated that Hispanic adults are diagnosed late for HIV and thus have delayed HIV care initiation [59,60], which may explain our finding.
Among the comorbidities, we found impaired use of hands, abnormally high cholesterol, current psychotropic drug use, presence of any AIDS-defining illness, and lifetime history of stroke to be positively associated with NCI. The specific reasons for hand impairment were not available. Hand impairment may result from HIV-induced brain injury, aging, external trauma, systemic disease, or other CNS infections and vascular diseases [61 -64]. Thus, in some participants with hand impairment, poor motor-domain performance may not truly represent NCI and may need further evaluation. Previous studies have consistently found hypercholesterolemia to be positively associated with NCI in PLWH. A recent longitudinal study using CHARTER data found that participants with declining cognition exhibited a higher baseline cholesterol/HDL ratio compared to patients with stably normal cognition [65]. Another longitudinal study established elevated cholesterol to be an independent risk factor for cognitive decline in ART-adherent HIV-infected men [66]. The most plausible explanation for this association may be HIV-induced endothelial dysfunction leading to cholesterol oxidation and thus sub-clinical cerebrovascular damage [67 -69]. Psychotropic drug use may commonly indicate a diagnosis of anxiety, depression, psychosis, or sleep disorders, and hence an association with poor cognition. Although the literature is limited for other conditions, depression has been amply studied and, according to a recent comprehensive review, was associated positively with NCI in PLWH [70]. Depression may act directly or indirectly through poor ARV adherence to affect neurocognition in PLWH [70]. The presence of AIDS-defining comorbidities indicates higher severity of HIV infection and is rather plausibly associated with NCI. 
In addition to the direct effect of HIV infection itself, certain AIDS-defining conditions such as HIV-related encephalopathy, disseminated histoplasmosis, disseminated mycobacterium avium complex, and progressive multifocal leukoencephalopathy may directly affect the brain, leading to impairment [71]. The positive association between AIDS-defining comorbidities and HIV disease severity may explain why, when this variable was included in the Bayesian or logistic regression models, other variables such as HIV duration and viral load dropped out.
Specialists have recommended routine and regular screening for NCI for early detection, treatment adjustment, and management in PLWH. However, most screening tools are based on neurocognitive testing that requires clinicians to be suitably trained in its administration and interpretation. Furthermore, currently available screening tools are generally unable to detect milder forms of NCI [72,73]. We created a nomogram based on easily recordable demographic, clinical, and behavioral factors to serve as a user-friendly screening tool and predict NCI in PLWH. One study by Cysique et al. attempted to develop a screening algorithm with demographic and clinical factors using support vector machine methodology [74] in a sample of 97 HIV-infected individuals on cART and with advanced HIV infection. The study included age, current CD4 cell count, past central nervous system HIV-related diseases, and current treatment duration in the algorithm. However, the authors did not convert their algorithm into a ready-to-use screening tool. Another study by Muñoz-Moreno et al. used a tree-structured approach and created four classification models to predict NCI in PLWH, identifying age, employment status, CD4 cell count, highest viral load, comorbidities, HIV duration, and duration of current treatment as potential predictors [75]. However, their statistical approach (classification and regression trees) differed from our Bayesian network analysis.
Our study had some limitations. Although exclusion criteria were kept to a minimum to maximize generalizability, this was a clinic-based and volunteer sample. The proposed nomogram needs to be validated in other PLWH samples. Some individuals with medical pathologies other than NCI could have ADL impairment. Thus, a disadvantage of including ADLs as a predictor of NCI is a potentially higher number of false positives when the nomogram is used for screening. However, an advantage is that more individuals with NCI would have been false negatives had ADLs not been included as a predictor. Despite these limitations, this study is one of a kind, conducted at six centers in the US with information available on a wide range of demographic, behavioral, clinical, and laboratory measures and with NCI status assessed through a comprehensive neuropsychological test battery.

CONCLUSION
In conclusion, we investigated factors associated with NCI in PLWH and developed a novel nomogram to predict NCI in the HIV-infected population. The nomogram uses variables that can be easily measured in clinical settings and is thus easy to implement within a clinic or on a web-interface platform. Our goal with such a tool is to help clinicians identify patients who have a high probability of NCI and who should be further evaluated by a comprehensive neuropsychological examination, resulting in timely diagnosis and appropriate management. Future research should focus on external validation of the nomogram in different populations to assess broader applicability. Distinct nomograms may also be developed for specific patient subgroups such as older and younger patients.

AUTHORS' CONTRIBUTIONS
The conceptualization and designing of the study were performed by ZN, HSF, LB, PM, and CMA. Data were acquired by HSF. Data analysis was performed by ZN, LB, CSW, and JM. The manuscript was drafted by ZN, LB, and PM. Review and revision of the article were performed by ZN, HSF, LB, CSW, PM, CMA, and JM. All authors have read the manuscript and approved its contents.

ETHICS APPROVAL AND CONSENT TO PARTICIPATE
This study was approved by the University of Nebraska Medical Center's Institutional Review Board (Protocol #282-13-EP). All the procedures were approved by the Human Subjects Protection Committees of each participating institution.

HUMAN AND ANIMAL RIGHTS
No animals were used in this research. All human research procedures were followed in accordance with the ethical standards of the committee responsible for human experimentation (institutional and national), and with the Helsinki Declaration of 1975, as revised in 2013.

CONSENT FOR PUBLICATION
Written and oral informed consent was obtained from the patient and his parents to publish his records.

AVAILABILITY OF DATA AND MATERIALS
The data supporting the findings of the article are available by submitting a request and signing a data use agreement at https://nntc.org/content/requests.

FUNDING
The study was conducted under the auspices of the NNTC, which is supported by the National Institutes of Health (NIH) under Award Numbers U24MH100925, U24MH100928, U24MH100929, U24MH100930, and U24MH100931.

STANDARDS OF REPORTING
We confirm that the STROBE guidelines from the EQUATOR network were followed for our study.

CONFLICT OF INTEREST
The authors declare no conflict of interest, financial or otherwise.