Research Article | DOI: https://doi.org/10.31579/2835-8147/074
Problems of rating scales in health measurements
- Satyendra Chakrabartty 1
1 Indian Statistical Institute, India.
*Corresponding Author: Satyendra Chakrabartty, Indian Statistical Institute, India.
Citation: Satyendra Chakrabartty, (2024), Problems of rating scales in health measurements, Clinics in Nursing; 3(4): DOI: 10.31579/2835-8147/074
Copyright: © 2024 Satyendra Chakrabartty, this is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Received: 05 July 2024 | Accepted: 15 July 2024 | Published: 24 July 2024
Keywords: patient-reported scale; linear transformation; normal distribution; ability to detect changes; elasticity
Abstract
Background: Patient-reported outcomes (PROs) using multi-item rating scales are not comparable due to different features of the scales, different factors under consideration, etc.
Objectives: To discuss methodological limitations of PROs and to provide a method for converting ordinal item-score to follow normal distribution.
Method: Converting raw item-score to equidistant score (E) followed by standardization to Z-scores ~N(0,1) and converting Z-scores to proposed scores (P_i) in the range 1 to 100. Scale scores (P_Scale) as sum of P_i's and battery scores (B-scores) as sum of scale scores follow normal distribution.
Results: The P_Scale-scores and B-scores satisfy desired properties and support parametric analysis, comparison of patient status, and finding equivalent scores of two PROs, with implications for classification. They also allow reliability and validity to be assessed on a sounder basis.
Conclusion: The suggested method is recommended because it improves the scoring of PRO instruments, with the additional benefits of identifying poorly performing scales and assessing progress across time.
Introduction:
Subjective self-reported measures of illness are often evaluated through rating scales to assess objective health [2]. Data resulting from such rating scales are categorical and at the ordinal level. A large number of clinical studies use patient-reported rating scales (PROs) to quantify clinical conditions such as intensity of disease, effects of disease or treatment, health status, quality of life (QoL), pain, sleep disorders, depression, anxiety, stress and more, as part of the patient decision-making process. The MAPI Trust, a nonprofit organization, provides information on 3000+ patient rating scales (http://www.mapi-trust.org/about-the-trust).
PROs consist of a number of scales that vary in features such as number of items (scale length), number of levels (scale width) and scoring methods, and are therefore not comparable. Scale length, scale width and frequencies of levels affect differential item functioning (DIF). Analysing ordinal data from PROs without satisfying the assumptions of the statistical techniques used may distort the results. [22] suggested prior checking of the measurement properties of PRO instruments.
Methods:
A self-reported rating scale consisting of multi-point items suffers from methodological limitations, chief among them that addition of item scores is not meaningful. If addition is not meaningful, computations such as standard deviation (SD), correlation and Cronbach's α are also not meaningful. Statistical analyses such as regression, principal component analysis (PCA), factor analysis (FA) and tests of equality of means by t-test or ANOVA assume normal distribution of the variables under study. Questionnaire scores violate this assumption, which may distort the results. Assigning equal importance to items and constituent scales in summative scoring of PROs is not justified, since the contributions of items or scales to the total battery score, the inter-item correlations, the scale-battery correlations and the factor loadings all differ [25]. Mean, SD and Cronbach's alpha tend to increase with the number of levels, and the number of levels may influence the mean more than the underlying variable does [18]. There is no consensus on the number of levels per item in rating scales [5].
Studies evaluating the effect of selenium supplementation on stroke have used different definitions of stroke, either as a categorical variable or as a variable on a ratio scale. While investigating the dose-response correlation between dietary selenium intake and stroke risk, [29,30] used the self-reported single question "Has a doctor ever told you that you had a stroke?" to define stroke. Thus, stroke was taken as a categorical variable and not on a ratio scale. [39] asked each participant whether a doctor had ever given a diagnosis of stroke (no, yes, unknown) and defined stroke as a self-reported physician diagnosis during follow-up. The follow-up time was the date of the first discovery of stroke. [28] included adults with ischemic stroke confirmed by neuroimaging during the last 72 hours, with a volume of at least one-third of the MCA territory. Different inclusion criteria for stroke and different analyses resulted in different relationships between selenium supplementation and stroke, and hence in different conclusions.
The paper suggests a method of transforming ordinal scores of i-th item to normally distributed proposed scores (P_i-scores) facilitating meaningful addition and deriving scale score (P_Scale) as arithmetic aggregation of P_i-scores satisfying desired properties, enabling assessment of progress and parametric analysis.
Problems of rating scales:
If the distance between two successive response categories or levels of K-point items (K = 2, 3, 4, 5, …) is denoted by $d_{j,(j+1)}$, then $d_{j,(j+1)} \neq d_{(j+1),(j+2)}$ for all j = 1, 2, 3, 4, …, i.e. scores are not equidistant [27]. Thus, addition of ordinal item scores is not meaningful [15] and even $\bar{X}$ > or …
Generic or disease-specific multidimensional rating scales for QoL may not consider all relevant constructs. For example, the disease-specific stroke-adapted 30-item SIP version (SA-SIP30) with 8 subscales excludes domains like recreation, energy, pain, general health perceptions, overall quality of life or stroke symptoms [11].
Multidimensional rating scales, such as the 36-Item Short Form Health Survey questionnaire (SF-36) (http://www.webcitation.org/6cfeefPkf), may even fail to give a global summary.
A multidimensional scale covers a number of sub-scales/dimensions, and scale formats differ across sub-scales. For example, SF-36 has 10 (3-point) items on Physical functioning, 3 (6-point) items on Energy/Fatigue, 2 (5-point) items on Social functioning, 6 (6-point) items on Emotional well-being, 5 (5-point) items on General health, two items on Pain (one 6-point and one 5-point), seven binary items and another item on reported health transition over the last year. This set-up implies (i) different distributions for binary, 3-point, 5-point and 6-point items, (ii) higher mean and SD for sub-classes containing 6-point items, and (iii) different reliability and validity for different sub-classes [26].
Two distinct concepts measured by the SF-36 are the Physical Component Summary (PCS) and the Mental Component Summary (MCS). [35] found a paradoxical inverse relationship between PCS and MCS, which implies that good physical condition presupposes poor mental health and vice versa.
SF-36 was negatively correlated with the Patient Health Questionnaire (PHQ) and the General Anxiety Disorder questionnaire (GAD-7), probably due to the different factors measured by them [16].
Scoring methods of PROs differ. The dimension score of the MacNew Heart Disease Health–Related Quality of Life Questionnaire (MacNew) is based on the mean of the responses to the items belonging to the dimension, whereas Cardiovascular Limitations and Symptoms Profile (CLASP) scores use weights to find the total for each subscale. Each dimension of the Myocardial Infarction Dimensional Assessment Scale (MIDAS) is scored separately.
There is no clear understanding of the factors being measured. Against the two factors proposed in the Hospital Anxiety and Depression Scale (HADS), the factor structure of the instrument was found to be three in a range of clinical populations [3], against the recommendation of HADS as a one-dimensional measure [8] and statistical evidence for a three-factor structure [33]. Similarly, for the Psychological General Well-Being Index (PGWBI), [21] found a single construct of psychological wellbeing against the six underlying factors of the scale, raising questions about factor-analytic interpretation in the presence of local dependency.
Use of zero as an anchor value does not allow computation of expected value (value of the variable × probability of that value), reduces the mean and SD of the scale and the item-total correlations, and affects regression or logistic regression. If each respondent of a sub-group selects the level marked "0" for an item, then mean = variance = 0 for the sub-group and correlation with that item is undefined. [34] found that more than 40% of the patients scored zero in 10 subscales of the Sickness Impact Profile (SIP) and in one subclass of SF-36. It is better to mark the anchor values as 1, 2, 3, … and so on, keeping the convention that higher score ⇔ higher value of the variable being measured.
A higher score in the Nottingham Health Profile (NHP) and in Minnesota Living with Heart Failure (MLHF) indicates greater health problems, unlike the Sickness Impact Profile (SIP). Thus, directions of scores differ across scales.
Rating data with floor and ceiling effects follow unknown distributions and do not satisfy the assumptions of PCA, such as bivariate normality for each pair of observed variables and normally distributed scores.
Test reliability by Cronbach's alpha assumes a one-dimensional scale and tau-equivalence (equality of all factor loadings). Multidimensional PROs like the Insomnia Severity Index (ISI), Pittsburgh Sleep Quality Index (PSQI) and Insomnia Symptom Questionnaire (ISQ) violate the assumption, and coefficient alpha is then underestimated [9]. Coefficient alpha is influenced by variance sources, sampling errors [27], sample size [7] and even test length and test width [20].
Validity of a multidimensional scale, taken as correlation with criterion scores, raises the question of which dimension/factor is reflected by the validity. It is desirable to find the validity of the main factor for which the scale was developed and also to derive the relationship between test reliability and test validity. Vaughan (1998) found lower validity where data contained predominantly high performers. To avoid such problems, structural validity of normally distributed transformed scores by PCA was preferred [4-6].
Different PROs have different cut-off scores. For example, the cut-off score of the Sickness Impact Profile (SIP136), with 136 "Yes–No" type items distributed over 12 domains, is ≥ 22, and that of the Stroke-Adapted Sickness Impact Profile (SA-SIP30), with 30 items covering 8 subscales, is ≥ 33. A natural question is whether a score of 33 in SA-SIP30 is equivalent to a score of 22 in SIP136. Similarly, a score of 14 in ISI indicating "no insomnia" is equivalent to which score in PSQI or ISQ?
Thus, finding equivalent scores of two scales allows better comparison of PROs for the purpose of classifying individuals. For QoL questionnaires, there may be no cut-off point to show better or worse QoL [31]. For the Cancer Core Questionnaire (EORTC QLQ-C30), four different cut-off scores were found depending on treatment status [17].
Intra- and inter-observer reliability of ordinal scales like the Kessler Psychological Distress Scales (K6 and K10) are evaluated by kappa and weighted kappa. Major limitations in this context are:
- A low kappa does not imply low agreement [1].
- A confidence interval for kappa ≤ 0.60 may indicate a large volume of incorrectly evaluated data [32].
- Methods of deciding weights for weighted kappa vary and may give different values of weighted kappa.
- Concepts of agreement in terms of κ or κ_Weighted differ from the concept of reliability of tests/scales.
Suggested method:
Let X_ij be the raw score of a respondent in the i-th item for choosing the j-th level, where the levels are marked as 1, 2, 3, 4, … avoiding zero, and a higher value of X_ij implies higher dysfunction or impairment. The suggested method transforms ordinal item scores (X_i) to equidistant scores (E_i) and then to proposed scores (P_i-scores) in the score range [1, 100] following N(μ_i, σ_i), facilitating meaningful addition to derive the scale score (P_Scale) as the sum of P_i-scores. The method is described below.
For the i-th item, find the maximum frequency $f_{i\,Max}$ and the minimum frequency $f_{i\,Min}$. For n respondents to a 5-point item (say), find the initial weight $\omega_{i1} = f_{i\,Min}/n$, the common difference $\alpha = (5 f_{i\,Max} - f_{i\,Min})/(4n)$, and the other initial weights $\omega_{i2} = (\omega_{i1} + \alpha)/2$, $\omega_{i3} = (\omega_{i1} + 2\alpha)/3$, $\omega_{i4} = (\omega_{i1} + 3\alpha)/4$ and $\omega_{i5} = (\omega_{i1} + 4\alpha)/5$.
Take the final weights $W_{ij} = \omega_{ij} / \sum_{j=1}^{5} \omega_{ij}$, so that $\sum_{j=1}^{5} W_{ij} = 1$. Here the products $j\,W_{ij}$ form an arithmetic progression, so the generated scores $E_{ij} = W_{ij} X_{ij}$ are continuous, monotonic and equidistant.
Standardize the equidistant scores (E) of each item as $Z_i = (E - \bar{E})/SD(E) \sim N(0, 1)$ and compute $P_i = \frac{99\,(Z_i - \min(Z_i))}{\max(Z_i) - \min(Z_i)} + 1 \sim N(\mu_i, \sigma_i)$, where $1 \le P_i \le 100$ irrespective of the length of the scale and the width of items.
Normality of item scores (P_i's) facilitates meaningful addition, and the resultant scale score $P_{Scale} = \sum_i P_i$ is distributed as the convolution of the P_i's. Normally distributed P_Scale-scores can be added to get battery scores (B-scores), which also follow a normal distribution.
Major properties of P_Scale-scores and B-scores are:
- Each avoids assigning equal importance to items and dimensions and represents continuous, monotonically increasing scores.
- The zero point for scoring K-point items to get E-scores is obtained when $f_{ij} = 0$.
- Other items on ratio scales can be standardized and transformed to follow a normal distribution in the range [1, 100] and added to the P_i's.
- The contribution of the j-th scale to the battery can be found by (P_(j-th Scale)-score)/(B-score).
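The arithmetic of the transformation for a single item can be sketched in Python as follows. This is an illustrative sketch, not the author's implementation: the function names, the generalisation of the 5-point formulas to k levels, the use of the population SD, and the treatment of an unused level as frequency zero are all assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def item_weights(raw, k=5):
    """Final weights W_1..W_k for one item (assumed k-level generalisation)."""
    n = len(raw)
    freq = Counter(raw)
    counts = [freq.get(level, 0) for level in range(1, k + 1)]
    f_max, f_min = max(counts), min(counts)
    w1 = f_min / n                                # initial weight of level 1
    alpha = (k * f_max - f_min) / ((k - 1) * n)   # common difference
    omega = [(w1 + (j - 1) * alpha) / j for j in range(1, k + 1)]
    total = sum(omega)
    return [w / total for w in omega]             # weights sum to 1


def proposed_scores(raw, k=5):
    """P-scores in [1, 100] for raw responses coded 1..k on one item."""
    weights = item_weights(raw, k)
    e = [weights[x - 1] * x for x in raw]         # equidistant scores E = W * X
    sd = pstdev(e)                                # population SD (an assumption)
    if sd == 0:
        raise ValueError("all respondents chose the same level")
    e_bar = mean(e)
    z = [(v - e_bar) / sd for v in e]             # standardised scores
    z_min, z_max = min(z), max(z)
    return [99 * (v - z_min) / (z_max - z_min) + 1 for v in z]


responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]       # made-up data for illustration
print([round(p, 2) for p in proposed_scores(responses)])
```

Summing the P-scores of a respondent over the items of a scale would then give that respondent's P_Scale score.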
Benefits:
Parameters of distributions of P_Scale-scores and B-scores can be estimated from data. Normality enables estimation of population mean (μ), population variance (σ^2), confidence interval of μ, testing statistical hypothesis like H_0: μ_1=μ_2 or H_0: σ_1^2=σ_2^2 etc.
Based on battery scores, progress of the i-th patient in the t-th period over the previous period is measured by $\frac{B_{i(t)} - B_{i(t-1)}}{B_{i(t-1)}} \times 100$. Decline is indicated when $B_{i(t)} - B_{i(t-1)} < 0$. For a group of patients, $\overline{B_{i(t)}} > \overline{B_{i(t-1)}}$ indicates progress. Similarly, progress with respect to P_Scale scores can be computed. Any decline may be probed to find the critical scale(s) where $P_{Scale(t)} - P_{Scale(t-1)} < 0$.
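A minimal Python sketch of the progress measure is given here for illustration. The function names, the dictionary layout of scale scores and the sample numbers are hypothetical and not part of the paper.

```python
def progress_percent(b_current, b_previous):
    """Percentage change of a battery score over the previous period."""
    return (b_current - b_previous) / b_previous * 100


def declining_scales(scales_current, scales_previous):
    """Names of scales whose P_Scale score fell between two periods."""
    return [name for name, score in scales_current.items()
            if score - scales_previous[name] < 0]


# Made-up scores of one patient in two successive periods.
previous = {"sleep": 120.0, "pain": 80.0}
current = {"sleep": 130.0, "pain": 70.0}
print(progress_percent(sum(current.values()), sum(previous.values())))
print(declining_scales(current, previous))
```

In this made-up case the battery score is unchanged, yet probing the scales shows that one of them declined, which is the situation the scale-level check is meant to reveal.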
The effect of a small change in the i-th scale (P_(i-th Scale)) on the battery score (B-score) can be quantified by elasticity, i.e. the percentage change of B-scores due to a small change in P_(i-th Scale). The scales can be ranked based on such elasticity. Elasticity studies in economics and reliability engineering consider models like $\log Q_{jt} = \alpha_j + \beta_j \log P_{jt}$, where Q_jt denotes the quantity demanded of the j-th industry at time t and P_jt is the industry price relative to the price index of the economy. However, for normally distributed P_Scale-scores and B-scores, logarithmic transformations are not required to fit a regression equation of the form P_Scale = α_i + β_i P_i + ε_i.
The coefficient β_i reflects the impact of a unit change in the independent variable (i-th dimension) on the dependent variable (P_Scale). Policy makers can decide appropriate actions in terms of continuation of efforts towards the scales with high values of elasticity and corrective actions for the dimensions with lower elasticity i.e. areas of concern.
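As an illustration, the slope β of such a simple linear regression can be estimated by ordinary least squares and used to rank the independent variables. The sketch below assumes one predictor per regression; the function names and the sample data are hypothetical.

```python
from statistics import mean


def ols_fit(x, y):
    """Intercept and slope of the least-squares line y = alpha + beta * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def rank_by_slope(predictors, dependent):
    """Predictor names with their slopes, sorted from highest to lowest slope."""
    slopes = {name: ols_fit(x, dependent)[1] for name, x in predictors.items()}
    return sorted(slopes.items(), key=lambda item: item[1], reverse=True)


# Made-up scores of four respondents on two scales and on the battery.
scales = {"sleep": [10, 20, 30, 40], "pain": [10, 12, 14, 16]}
battery = [20, 32, 44, 56]
print(rank_by_slope(scales, battery))
```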
Normality of B-scores facilitates testing $H_0: \mu_{B_t} = \mu_{B_{(t-1)}}$, reflecting effectiveness of the treatment plans, and $H_0: \text{Progress}_{(t+1)\ \text{over}\ t} = 0$, reflecting progression.
Graph depicting progress/decline of one patient or a group of patients with similar socio-demographic profile is analogous to hazard function and helps to identify high-risk groups and compare response to treatments from the start.
For two scales X and Y with normal pdfs f(x) and g(y) respectively, the equivalent score y_0 for a given value x_0 can be found by solving the equation $\int_{-\infty}^{x_0} f(x)\,dx = \int_{-\infty}^{y_0} g(y)\,dy$ using the standard normal table, even if the scales have different lengths and widths [4-6].
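When both scales are taken to be normal, the equation can be solved numerically instead of with a table. The sketch below uses `NormalDist` from the Python standard library; the function name and the means and SDs in the example are hypothetical.

```python
from statistics import NormalDist


def equivalent_score(x0, mean_x, sd_x, mean_y, sd_y):
    """Score y0 on scale Y with the same area to its left as x0 on scale X."""
    area = NormalDist(mean_x, sd_x).cdf(x0)        # area under f up to x0
    return NormalDist(mean_y, sd_y).inv_cdf(area)  # y0 with equal area under g


# Made-up parameters: X ~ N(50, 10), Y ~ N(100, 20).
print(round(equivalent_score(60, 50, 10, 100, 20), 4))
```

For normal distributions this reduces to matching standardized values, $y_0 = \mu_Y + \sigma_Y (x_0 - \mu_X)/\sigma_X$. The area must lie strictly between 0 and 1 for `inv_cdf` to be defined.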
P-scores and B-scores following normal distributions satisfy the assumptions of PCA and FA and enable finding the factorial validity $FV = \lambda_1 / \sum \lambda_i = \lambda_1 / \sum S_{X_i}^2$, where $\lambda_1$ is the highest eigenvalue, indicating validity for the main factor being measured [24]. The significance of $\lambda_1$ can be tested using the Tracy–Widom (TW) test statistic $U = \lambda_1 / \sum \lambda_i$, which follows the TW distribution [23]. Such FV avoids the problems of construct validity and of selecting a criterion scale, which requires matching constructs and two administrations (of the scale and of the criterion scale).
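FV can be computed from item scores without special software. The sketch below is illustrative only: the function names and the sample data are hypothetical, and the largest eigenvalue of the covariance matrix is obtained by power iteration. Power iteration is a choice made here to stay within the standard library. It can fail when the starting vector is orthogonal to the leading eigenvector.

```python
from statistics import mean


def covariance_matrix(items):
    """Sample covariance matrix; `items` holds m score lists of equal length n."""
    n = len(items[0])
    centred = []
    for scores in items:
        centre = mean(scores)
        centred.append([v - centre for v in scores])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centred]
            for ci in centred]


def largest_eigenvalue(matrix, iterations=1000, tol=1e-12):
    """Largest eigenvalue of a covariance matrix by power iteration."""
    m = len(matrix)
    vec = [1.0 / m ** 0.5] * m                     # unit-length starting vector
    value = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vec[j] for j in range(m)) for i in range(m)]
        new_value = sum(p * v for p, v in zip(product, vec))  # Rayleigh quotient
        norm = sum(p * p for p in product) ** 0.5
        if norm == 0.0:
            return 0.0
        vec = [p / norm for p in product]
        if abs(new_value - value) < tol:
            return new_value
        value = new_value
    return value


def factorial_validity(items):
    """FV = largest eigenvalue / trace, the trace being the sum of item variances."""
    cov = covariance_matrix(items)
    trace = sum(cov[i][i] for i in range(len(cov)))
    return largest_eigenvalue(cov) / trace


# Made-up scores of five respondents on three items.
print(round(factorial_validity([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [5, 3, 4, 1, 2]]), 3))
```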
For standardized item scores, $FV_{Z\text{-scores}}$ of a test with m items is $\lambda_1/m$, and the test variance $S_X^2$ can be written as
$$S_X^2 = \sum \lambda_i + 2\sum_{i \neq j = 1}^{m} Cov(X_i, X_j) = \frac{\lambda_1}{FV} + 2\sum_{i \neq j = 1}^{m} Cov(X_i, X_j) \quad (1)$$
Equation (1) can be used to find the theoretical reliability
$$r_{tt(theoretical)} = \frac{S_T^2}{S_X^2} = \frac{S_T^2}{\frac{\lambda_1}{FV} + 2\sum_{i \neq j = 1}^{m} Cov(X_i, X_j)} \quad (2)$$
Equation (2) gives the relationship between $r_{tt(theoretical)}$ and factorial validity, which is non-linear.
[30] suggested maximum reliability of a test by α_(PCA ) which can be derived from the correlation matrix of m-number of items by
$$\alpha_{PCA} = \frac{m}{m-1}\left(1 - \frac{1}{\lambda_1}\right) \quad (3)$$
Relationship between FV and α_PCA can be derived as:
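$$\alpha_{PCA} = \frac{m}{m-1}\left(1 - \frac{1}{\lambda_1}\right) = \frac{m}{m-1}\left(1 - \frac{1}{FV \cdot \sum \lambda_i}\right) = \frac{m}{m-1}\left(1 - \frac{1}{m \cdot FV_{Z\text{-scores}}}\right) \quad (4)$$
As per (4), a higher value of $FV_{Z\text{-scores}}$ increases $\alpha_{PCA}$.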
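Equations (3) and (4) are simple enough to check numerically. In this illustrative sketch the function names and the sample values of λ_1, m and FV are hypothetical.

```python
def alpha_pca(lambda_1, m):
    """Equation (3): alpha_PCA from the largest eigenvalue of the correlation matrix."""
    return (m / (m - 1)) * (1 - 1 / lambda_1)


def alpha_pca_from_fv(fv_z, m):
    """Equation (4): alpha_PCA from the factorial validity of standardized items."""
    return (m / (m - 1)) * (1 - 1 / (m * fv_z))


# With m = 5 items and lambda_1 = 3, FV of Z-scores is 3 / 5 = 0.6.
print(alpha_pca(3.0, 5), alpha_pca_from_fv(0.6, 5))
# A higher FV gives a higher alpha_PCA for the same number of items.
print(alpha_pca_from_fv(0.5, 5) < alpha_pca_from_fv(0.7, 5))
```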
Cronbach's alpha of a battery consisting of K scales can be obtained as a function of the scale reliabilities by
$$\hat{\alpha}_{Battery} = \frac{\sum_{i=1}^{K} r_{tt(i)} S_{X_i}^2 + \sum_{i=1, i \neq j}^{K} \sum_{j=1}^{K} 2\,Cov(X_i, X_j)}{\sum_{i=1}^{K} S_{X_i}^2 + \sum_{i=1, i \neq j}^{K} \sum_{j=1}^{K} 2\,Cov(X_i, X_j)} \quad (5)$$
where r_(tt(i)) and S_xi denote respectively reliability and SD of the i-th scale.
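Equation (5) can be evaluated from the scale reliabilities and the covariance matrix of the scale scores. The sketch below rests on two assumptions that go beyond the text: the reliabilities are weighted by scale variances (squared SDs), and the covariance term is twice the sum over distinct pairs i < j, so that the denominator equals the variance of the battery total. The function name and the numbers are hypothetical.

```python
def battery_alpha(reliabilities, cov):
    """Battery reliability from scale reliabilities and a K x K covariance matrix.

    cov[i][i] is the variance of the i-th scale and cov[i][j] the covariance
    between scales i and j.
    """
    k = len(reliabilities)
    weighted = sum(reliabilities[i] * cov[i][i] for i in range(k))
    variances = sum(cov[i][i] for i in range(k))
    # Each distinct pair is counted once and doubled (assumed reading of the sum).
    cross = 2 * sum(cov[i][j] for i in range(k) for j in range(i + 1, k))
    return (weighted + cross) / (variances + cross)


# Made-up example with two scales.
print(round(battery_alpha([0.8, 0.9], [[4.0, 1.0], [1.0, 9.0]]), 4))
```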
Discussion:
The suggested method defines meaningful scale scores and battery scores for each individual. The P_Scale-scores and B-scores satisfy desired properties and support parametric analysis, comparison of the status and progression of patients (including an indication of the effectiveness of treatment plans), and finding equivalent scores of two patient-reported scales (PROs), where the area under the normal curve corresponding to PRO-1 up to $P_{PRO\text{-}1}^{0}$ equals the area under the normal curve corresponding to PRO-2 up to $P_{PRO\text{-}2}^{0}$. For classification of individuals, equivalent cut-off scores of class boundaries may be found satisfying $\frac{\text{Var. of group}_{Score \ge P_{PRO\text{-}1}^{0}}}{\text{Variance of PRO-1}} = \frac{\text{Var. of group}_{Score \ge P_{PRO\text{-}2}^{0}}}{\text{Variance of PRO-2}}$, which may give similar efficiency of classification in terms of within-group and between-group variance.
Factorial validity (FV), reflecting the main factor being measured, helps to give a clear understanding of the most important factor being measured. However, establishing clinically meaningful content validity remains a vital step. The maximum value of test reliability α_PCA, the relationship between FV_(Z-scores) and α_PCA, and the relationship between r_(tt(theoretical)) and FV can be used effectively to compare scales. Scales with eigenvalues exceeding unity can be retained, keeping in view that results may be distorted by wrong selection of constituent scales.
Conclusions:
The suggested B-scores, reflecting disease severity with respect to the PRO measures, are recommended, with the scales chosen as per the selection criteria mentioned above. Future empirical investigations may be undertaken to evaluate the properties of the suggested method and its clinical validation, along with the effects of sociodemographic factors.
Declarations:
Acknowledgement: Nil
Conflicts of interest/Competing interests: The author has no conflicts of interest to declare
Funding: Did not receive any grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethical approval: Not applicable since the paper does not involve human participants.
Consent of the participants: Not applicable since the paper does not involve data from human participants
Data Availability statement: The paper did not use any datasets
Code availability: No application of software package or custom code
CRediT statement: Conceptualization; Methodology; Analysis; Writing and editing the paper by the Sole Author
References
- Bajpai, S., Bajpai, R. and Chaturvedi, HK. (2015). Evaluation of Inter-Rater Agreement and Inter-Rater Reliability for Observational Data: An Overview of Concepts and Methods, Journal of the Indian Academy of Applied Psychology, 41(3), 20-27
- Bourne PA. (2009). The validity of using self-reported illness to measure objective health. N Am J Med Sci.; 1(5):232–238.
- Caci LA, Bayle FJ, Mattel V, Dossios C, Robert P, Boyer P. (2003). How does the Hospital Anxiety and Depression Scale measure anxiety and depression in healthy subjects? Psychiatry Report; 118: 89–99. 10.1016/S0165-1781(03)00044-1
- Chakrabartty, Satyendra Nath (2021). Integration of various scales for Measurement of Insomnia. Research Methods in Medicine & Health Sciences, 2(3), 102-111, DOI: 10.1177/26320843211010044
- Chakrabartty SN (2020). Improved Quality of Pain Measurement, Health Science, Vol. 1, 1 -6, DOI: 10.15342/hs.2020.259
- Chakrabartty, SN and Gupta, R. (2016). Test Validity and Number of Response Categories: A Case of Bullying Scale, Journal of the Indian Academy of Applied Psychology; 42(2); 344-353
- Charter RA. (1999). Sample size requirements for precise estimates of reliability, generalizability, and validity coefficients. Journal of Clinical and Experimental Neuropsychology, 21(4), 559–566.https://doi.org/10.1076/jcen.21.4.559.889
- Costantini M, Musso M, Viterbori P.et al.(1999). Detecting psychological distress in cancer patients: validity of the Italian version of the Hospital Anxiety and Depression Scale. Support Care Cancer; 7: 121-127
- Daniel, Wayne W. (1990): Friedman two-way analysis of variance by ranks. Applied Nonparametric Statistics (2nd ed.). Boston: PWS-Kent. 262–274. ISBN 978-0-534-91976-4. 
- Ding J, Zhang Y. (2021). Relationship between the circulating selenium level and stroke: a Meta-analysis of observational studies. J Am Coll Nutr.;1–9. doi: 10.1080/07315724.2021.1902880.
- Golomb BA, Vickrey BG, Hays RD. (2001). A review of health-related quality-of-life measures in stroke. Pharmacoeconomics; 19(2):155-185
- Hadrup N, Ravn-Haren G. (2020). Acute human toxicity and mortality after selenium ingestion: a review. J Trace Elem Med Biol.; 58:126435
- Hand DJ. (1996). Statistics and the Theory of Measurement, J. R. Statist. Soc. A; 159, Part 3,445-492. 
- Hu XF, Stranges S, Chan L. (2019). Circulating selenium concentration is inversely associated with the prevalence of stroke: results from the Canadian Health Measures Survey and the National Health and Nutrition Examination Survey. J Am Heart Assoc.; 8(10): e012290.
- Jamieson, S. (2004). Likert scales: how to (ab)use them. Medical Education, 38; 1212-1213
- Johnson SU, Ulvenes PG, Øktedalen T, Hoffart A. (2019). Psychometric Properties of the General Anxiety Disorder 7-Item (GAD-7) Scale in a Heterogeneous Psychiatric Sample. Front Psychol;10: 1713.doi:10.3389/fpsyg.2019.01713
- Kyte DG, Calvert M, Vander Wees PJ, Ten Hove R, Tolan S, Hill JC. (2015). An introduction to patient-reported outcome measures (PROMs) in physiotherapy. Physiotherapy; 101(2); 119-125. Lidington E, Giesinger JM, Janssen SHM, Tang S, Beardsworth S, Darlington AS et al. (2022). Identifying health-related quality of life cut-off scores that indicate the need for supportive care in young adults with cancer. Qual Life Res. 31, 2717–2727. DOI: 10.1007/s11136-022-03139-6
- Lim HE. (2008). The use of different happiness rating scales: bias and comparison problem? Social Indicators Research, 87; 259–267. 10.1007/s11205-007-9171-x.
- Lo CF (2012).  The Sum and Difference of Two Lognormal Random Variables. Journal of Applied Mathematics, Article ID 838397, doi:10.1155/2012/838397
- Luh, Wei-Ming (2024). A General Framework for Planning the Number of Items/Subjects for Evaluating Cronbach’s Alpha: Integration of Hypothesis Testing and Confidence Intervals. Methodology; Vol. 20(1); 1–21, https://doi.org/10.5964/meth.10449
- Lundgren-Nilsson, Å., Jonsdottir, I.H., Ahlborg, G. et al. (2013). Construct validity of the psychological general wellbeing index (PGWBI) in a sample of patients undergoing treatment for stress-related exhaustion: a rasch analysis. Health Qual Life Outcomes 11; 2. https://doi.org/10.1186/1477-7525-11-2
- Mokkink LB, Terwee CB, Patrick DL, et al. (2010). The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study. Qual Life Res, 19;539–549. 
- Nadler, Boaz (2011): On the distribution of the ratio of the largest eigenvalue to the trace of a Wishart matrix. Journal of Multivariate Analysis, 102; 363-371
- Parkerson, H. A., Noel, M., Gabrielle M. P., Fuss, S., Katz, J., Gordon J. G. Asmundson (2013): Factorial Validity of the English-Language Version of the Pain Catastrophizing Scale–Child Version, The Journal of Pain,14(11);1383-1389. https://doi.org/10.1016/j.jpain.2013.06.004
- Parkin D, Rice N, Devlin N. (2010). Statistical analysis of EQ-5D profiles: does the use of value sets bias inference? Med Decis Making, 30(5):556–565 
- Preston, Carolyn C. and Colman, A. M. (2000). Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences, Acta Psychologica, 104;1-15
- Rutter L. A. and Brown T. A. (2017): Psychometric properties of the generalized anxiety disorder scale-7 (GAD-7) in outpatients with anxiety and mood disorders. J. Psychopathol. Behav. Assess. 39;140–146.
- Sharifi-Razavi A, Karimi N, Jafarpour H. (2022). Evaluation of Selenium Supplementation in Acute Ischemic Stroke Outcome: An Outcome Assessor Blind, Randomized, Placebo-Controlled, Feasibility Study. Neurol India;70(1):87-93. doi: 10.4103/0028-3886.336328. 
- Shi W, Su L, Wang J, Wang F, Liu X and Dou J. (2022). Correlation between dietary selenium intake and stroke in the National Health and Nutrition Examination Survey 2003–2018, Annals of Medicine, 54(1); 1395–1402. https://doi.org/10.1080/07853890.2022.2058079
- Shi L, Yuan Y, Xiao Y, et al. (2021). Associations of plasma metal concentrations with the risks of all-cause and cardiovascular disease mortality in Chinese adults. Environ Int.; 157:106808.
- Silva PAB, Soares SM, Santos JFG, et al. (2014). Cut-off point for WHOQOL-bref as a measure of quality of life of older adults. Rev Saude Publica; 48:390–397. 10.1590/s0034-8910.2014048004912
- Simundic AM. (2008). Confidence interval. Biochem Med, 18:154–161.
- Strong V, Waters R, Hibberd C, et al. (2007). Emotional distress in cancer patients: the Edinburgh Cancer Centre symptom study. Br J Cancer; 96: 868-874
- Stucki G, Liang MH, Phillips C, Katz JN. (1995): The Short Form-36 is preferable to the SIP as a generic health status measure in patients undergoing elective total hip arthroplasty. Arthritis Care Res.; 8(3):174-181.10.1002/art.1790080310
- Taft, Charles & Karlsson, Jan & Sullivan, Marianne. (2001). Do SF-36 Summary Component Scores Accurately Summarize Subscale Scores? Quality of life research, 10(5); 395-404. DOI 10.1023/A:1012552211996.
- Ten Berge JMF & Hofstee WK (1999): Coefficients alpha and reliabilities of unrotated and rotated components. Psychometrika, 64; 83–90. doi: 10.1007/BF02294321
- Terry L, & Kelley K. (2012). Sample size planning for composite reliability coefficients: Accuracy in parameter estimation via narrow confidence intervals. British Journal of Mathematical & Statistical Psychology, 65(3); 371–401. https://doi.org/10.1111/j.2044-8317.2011.02030.x
- Xiao Y, Yuan Y, Liu Y, et al. (2019). Circulating multiple metals and incident stroke in Chinese adults. Stroke, 50(7):1661–1668.
- Zhang H, Qiu H, Wang S and Zhang Y (2023). Association of habitually low intake of dietary selenium with new-onset stroke: A retrospective cohort study (2004–2015 China Health and Nutrition Survey). Front. Public Health 10:1115908. doi: 10.3389/fpubh.2022.1115908
 
    
 
                                    