GMS | GMS Psycho-Social-Medicine | A short form of the General Self-Efficacy Scale (GSE-6): Development, psychometric properties and validity in an intercultural non-clinical sample and a sample of patients at risk for heart failure

Abstract

Objective: General self-efficacy has been found to be an influential variable related to the adaptation to stress and chronic illness, with the General Self-Efficacy (GSE) Scale by Jerusalem and Schwarzer being a reliable and valid instrument to assess this disposition. The aim of this study was to construct and test a short form of this scale to allow for a more economical assessment of the construct.

Methods: The item characteristics of the original scale were assessed using an intercultural non-clinical sample (_n_=19,719). Six items with the highest coefficient of variation and good discrimination along the range of the trait were selected to build a short form of the instrument (GSE-6). Subsequently, the psychometric properties and the concurrent and predictive validity of the GSE-6 were tested in a longitudinal design with three measurements using a sample of patients with risk factors for heart failure (_n_=1,460).

Results: Cronbach’s alpha for the GSE-6 was between .79 and .88. We found negative associations with symptoms of depression (–.35 and –.45), anxiety (–.35), and vital exhaustion (–.38) and positive associations with social support (.30), and mental health (.36). In addition, the GSE-6 score was positively associated with active problem-focused coping (.26) and distraction/self-encouragement (.25) and negatively associated with depressive coping (–.34). The baseline GSE-6 score predicted mental health and physical health after 28 months, even after controlling for the respective baseline score. The relative stability over twelve and 28 months was _r_=.50 and _r_=.60, respectively, while the mean self-efficacy score did not change over time.

Conclusions: The six item short form of the GSE scale is a reliable and valid instrument that is useful for the economical assessment of general self-efficacy in large multivariate studies and for screening purposes.

Keywords: self efficacy, questionnaires, psychometrics, heart diseases

Zusammenfassung

Hintergrund: Die generalisierte Selbstwirksamkeit hat sich als einflussreiche Variable im Zusammenhang mit der Anpassung an stressreiche Situationen und chronische Erkrankungen gezeigt. Die Skala Generalisierte Selbstwirksamkeit (GSW) von Jerusalem und Schwarzer ist ein reliables und valides Instrument zur Erhebung dieser Disposition. Ziel der vorliegenden Studie war die Entwicklung und Prüfung einer Kurzform dieser Skala, um eine ökonomischere Erfassung des Konstrukts zu ermöglichen.

Methoden: Die Itemmerkmale der Original-Skala wurden anhand der Daten einer interkulturellen, nicht-klinischen Stichprobe (_n_=19.719) bestimmt. Sechs Items mit dem höchsten Variationskoeffizienten und guter Diskrimination über den Merkmalsbereich wurden ausgewählt, um aus ihnen eine Kurzform des Instruments zusammenzustellen (GSW-6). Anschließend wurden psychometrische Merkmale und die konkurrente und prädiktive Validität der GSW-6 in einem Längsschnittdesign mit drei Messzeitpunkten an einer Stichprobe von Patienten mit Risikofaktoren für eine Herzinsuffizienz geprüft (_n_=1.460).

Ergebnisse: Cronbachs alpha für die GSW-6 lag zwischen .79 und .88. Wir fanden negative Zusammenhänge mit depressiven Symptomen (–.35 und –.45), Angstsymptomen (–.35), und vitaler Erschöpfung (–.38) sowie positive Zusammenhänge mit sozialer Unterstützung (.30) und psychischer Gesundheit (.36). Weiterhin hing der GSW-6-Score positiv mit aktivem problemorientiertem Coping (.26) und Ablenkung/Selbstaufbau (.25) sowie negativ mit depressiver Krankheitsverarbeitung (–.34) zusammen. Der GSW-6-Ausgangswert konnte die psychische und körperliche Gesundheit nach 28 Monaten, auch nach Kontrolle des jeweiligen Ausgangswertes, vorhersagen. Die relative Stabilität über zwölf bzw. 28 Monate betrug _r_=.50 und _r_=.60, während sich der mittlere Selbstwirksamkeitsscore im Zeitverlauf nicht änderte.

Schlussfolgerungen: Die aus sechs Items bestehende Kurzform der GSW-Skala ist ein reliables und valides Instrument, das zur ökonomischen Erfassung der generalisierten Selbstwirksamkeit in großen multivariaten Studien und zum Einsatz als Screeninginstrument geeignet ist.

Schlüsselwörter: Selbstwirksamkeit, Fragebogen, Psychometrie, Herzerkrankungen

Introduction

Over the past years there has been increasing evidence for the role of personal dispositions like optimism, self-esteem, and self-efficacy as protective factors in adapting to stress and chronic illness [1], [2], [3], [4], [5], [6], [7]. Self-efficacy has been found to be positively associated with mobility, activities of daily living, and quality of life in stroke patients [3], [4], with family and social functioning in coronary heart disease patients [8], with self-care in young adults with type I diabetes [9], with physical well-being in coronary heart disease patients [8], osteoarthritis patients [1], and spinal cord injury patients [7], as well as with psychological well-being in cancer patients [2], [10], stroke patients [3], [4], osteoarthritis patients [1], and spinal cord injury patients [7].

In social-cognitive theory [11] the construct of self-efficacy refers to the belief that a person is able to control challenging environmental demands by taking adaptive action. Although self-efficacy is generally thought of as being domain-specific, the concept of generalized self-efficacy, representing a broad and stable confidence in one’s ability to deal with different demanding situations, has been suggested [12]. The General Self-Efficacy (GSE) Scale, originally developed by Jerusalem and Schwarzer in 1979, aims to assess this general attribute. In the most recent version it consists of 10 items and has been adapted into several languages [13]. In different studies the scale showed internal consistencies between alpha = .75 and .94 [13], [14]. The retest reliability was found to be between .47 and .75 for time periods ranging from 6 months to two years [13]. In a sample of high-school students, positive correlations have been found with optimism and the perception of challenge in stressful situations; in a sample of teachers, positive correlations with proactive coping and self-regulation and negative correlations with procrastination and symptoms of burnout were found [13]. For clinical samples, negative correlations with depressive symptoms and anxiety as well as positive correlations with the use of adaptive, active coping strategies and quality of life measures have been reported [14], [15], [16].

Although the scale in its current form is already very efficient, in some cases an even shorter form of the questionnaire may be desirable for inclusion in large multivariate studies or for screening purposes [17]. The space and time savings gained by shortening the instrument are directly proportional to the reduction in length and have to be weighed against the expected loss in reliability and validity. When constructing a short form of an assessment instrument and thus reducing the number of items, a central goal is to preserve the content coverage. Using the item-total correlation or the size of the factor loadings as a criterion for item selection preserves reliability as far as possible but tends to narrow the construct, thus compromising validity [17]. Two alternative criteria are deemed more useful: selecting items that capture the content of the construct as broadly as possible, and selecting items that are able to discriminate among participants at different levels of the trait [18].

The aim of the current study is to construct a short form of the General Self-Efficacy Scale, using the item selection criteria as discussed above, and subsequently to test the psychometric properties of the new instrument, to determine its reliability and stability and to analyze its convergent, discriminant, and predictive validity.

Methods

Sample 1

Sample 1 consists of _n_=19,719 participants from 26 different countries who responded to their respective language version of the GSE scale; participants with a missing response on any of the GSE items (_n_=177 of the total _n_=19,896) were excluded. To build the sample, samples from different countries and different studies were combined. The composition of the subsamples and the respective data collection procedures vary. The data (without the Swiss sample, _n_=776), available as an SPSS file for free download at the GSE website [19], have previously been used to analyze the psychometric properties and cross-cultural differences of the general self-efficacy scale [13]. The sample consists of _n_=7,415 males and _n_=9,624 females (with _n_=2,680 not stating their gender), ranging in age from 12 to 94 years with a mean age (standard deviation) of 25.1 (14.5) years (age data were missing for _n_=4,489).

Instrument

The 10 item version of the GSE scale [12] in different translations was used. According to the FAQ document available at the GSE website [19], the scale can be reproduced and used without explicit permission in research studies, as long as the source of the scale is given appropriately. The participants were asked to rate the degree to which each item applies to them on a scale ranging from “not at all true” (1) to “exactly true” (4). For the summary score the item scores are summed up.

Analysis

To select the items for the short version, the psychometric properties of the items were determined using the results of sample 1. As to the number of items to select, the Spearman-Brown formula [20] can be used to estimate the reliability to be expected with a reduced number of items in comparison with the full number of items. If reliabilities of .90 and .80, respectively, are assumed for the ten items of the original GSE scale, a reduction to six items will reduce the reliability to .84 and .71, respectively. This is deemed the maximum acceptable loss in reliability; thus, the number of items to be selected was set at six. For both the 10 item and the 6 item version, Cronbach’s alpha was used as a measure of the internal consistency of the scale.
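The Spearman-Brown projection quoted above can be checked with a few lines of Python. This is a minimal sketch; the function name `spearman_brown` and its parameter names are illustrative choices and do not come from the study.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Shortening the scale from 10 to 6 items corresponds to a length factor of 6/10.
for full_scale_reliability in (0.90, 0.80):
    predicted = spearman_brown(full_scale_reliability, 6 / 10)
    print(f"{full_scale_reliability:.2f} -> {predicted:.2f}")
# prints:
# 0.90 -> 0.84
# 0.80 -> 0.71
```

Both projected values match the figures given in the text.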

To account for the hierarchical structure of the sample, variance components were estimated in a multilevel model with countries at level 1 and participants at level 2, assuming random level effects and treating the items as metric variables. Using an item response theory approach, item parameters were estimated using a two-parameter logistic model based on the graded response model [21]. In this approach, the ordered polytomous response options are mapped to a latent trait variable, resulting in one discrimination parameter (slope) per item and one difficulty parameter (threshold) for each category bound. The higher the discrimination parameter, the closer the relationship of the item to the construct. The difficulty parameters indicate the values on the latent trait at which there is a 50% probability of choosing a response category above the respective bound. Descriptive statistics for sample 1 were calculated using IBM SPSS Statistics version 20. For the estimation of the multilevel model and the item response theory model we used Mplus version 6.1 [22].
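To make the roles of the slope and the thresholds concrete, the following sketch computes category probabilities under a graded response model in its common slope/threshold form. The function name and the item parameters are hypothetical and purely illustrative; they are not estimates from the study, and the parameterization used internally by Mplus may differ.

```python
import math


def grm_category_probabilities(theta, discrimination, thresholds):
    """Return P(X = category) for each ordered category at trait level theta."""
    # Cumulative probabilities of responding above each category bound,
    # logistic in discrimination * (theta - threshold).
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds
    ]
    # Responding in at least the lowest category has probability 1;
    # responding above the highest category has probability 0.
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


# Hypothetical four-category item: slope 1.5, thresholds at -2.0, -0.5, and 1.0.
probs = grm_category_probabilities(
    theta=-0.5, discrimination=1.5, thresholds=[-2.0, -0.5, 1.0]
)
print([round(p, 3) for p in probs])
```

At a trait level equal to the second threshold (here –0.5), the probability of choosing one of the two upper categories is exactly 50%, which is the interpretation of the difficulty parameter given above.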

Sample 2

Sample 2 consists of data from _n_=1,460 patients with risk factors for heart failure (diabetes, hypertension, sleep apnea, or coronary heart disease), including 207 patients with previously diagnosed heart failure, who completed the GSE scale as part of the larger longitudinal DIAST-CHF study, which has been described previously [23], [24]. Briefly, in 2004 and 2005 primary care physicians referred outpatients aged 50 to 85 with the above-mentioned risk factors for comprehensive cardiological and psychometric evaluation at one of the participating centers. There were no exclusion criteria, except for unwillingness or inability (e.g., for language reasons) to participate. The study was approved by the responsible ethics committees and all participants provided their written informed consent. The sample was almost balanced by sex with 51.5% men and 48.5% women. The mean age was 66.7 years (_s_=8.0 years; range: 38 to 87 years); data on gender and age were missing for _n_=5 patients. Several self-report questionnaires were administered at baseline and at two follow-ups, twelve (_n_=973) and 28 months (_n_=859) thereafter. For both follow-up assessments, patients were invited to come to their respective study centers for cardiological and psychometric assessments.

Instruments

The short form of the General Self Efficacy scale (GSE) that was constructed with the results of sample 1 was used with sample 2 to determine psychometric properties and to examine the construct and the prognostic validity in an independent clinical sample. The same response format as described above was used.

The depression scale of the Patient Health Questionnaire (PHQ-9) [25], [26] consists of nine items, assessing different depressive symptoms based on the diagnostic criteria for major depressive disorder in the Diagnostic and Statistical Manual Fourth Edition (DSM-IV). On a 4-point Likert-type scale ranging from “not at all” to “nearly every day”, respondents are asked to rate the degree to which each symptom applied to them over the last two weeks. Items are scored from 0 to 3 and summed up to build a summary score (range 0 to 27) with higher values signifying more severe depressive symptoms.

The SF-36 questionnaire [27], [28] assesses functional health and well-being with 36 items on eight scales, from which physical and mental health summary scores are derived. The summary scores are transformed to T-scores with mean = 50 and standard deviation = 10, with higher values implying better health.

The Hospital Anxiety and Depression Scale (HADS) [29], [30] consists of 14 items, with 7 items each assessing the intensity of depressive symptoms and of anxiety symptoms. Items are scored on a 4-point scale ranging from 0 to 3 and summed up to a depression and an anxiety scale score (range: 0 to 21 each).

The Maastricht Questionnaire (MQ) [31] consists of 21 items that describe different aspects of vital exhaustion with a possible score range from 0 to 42.

The ENRICHD Social Support Inventory (ESSI) [32], [33] is a 5-item questionnaire assessing social support. The items, scored 1 to 5, are summed up for a total score, with higher scores indicating greater social support (total score range: 5 to 25).

The short form of the Freiburg Questionnaire for Coping with Illness (Freiburger Fragebogen zur Krankheitsverarbeitung, FKV-LIS) [34] consists of 35 items that assess different coping styles. The items are rated on a 5-point Likert-type scale ranging from 1 ("not relevant at all") to 5 ("very relevant"). Items are summed up to build five scales representing different coping dimensions: 1) depressive coping, 2) active problem-focused coping, 3) distraction and self-encouragement, 4) religious faith and search for meaning (5 items each), as well as 5) minimisation and wishful thinking (3 items). The possible scale score range is 5 to 25 for the first four scales and 3 to 15 for the last scale.

Analysis

After imputation of missing baseline scores using multiple imputation with the fully conditional specification approach with five imputation datasets, correlations between the GSE-6 score and the baseline scale scores of all instruments described above were calculated. In a second step, the SF-36 physical and mental health scores of the second and third measurement (after 12 and 28 months) were predicted using a linear regression model with the baseline GSE-6 score and the respective baseline health score as predictors. Finally, the relative and absolute stability of the GSE-6 score over the three measurement points was determined using Pearson correlations and a repeated measures analysis of variance. All statistical analyses for sample 2 were done using IBM SPSS Statistics version 20.
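The idea of predicting a follow-up health score from the baseline GSE-6 score while controlling for the baseline health score can be sketched in plain Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study: it omits multiple imputation, uses made-up data, and obtains the adjusted regression weight by residualizing both the predictor and the outcome on the covariate, which yields the same weight as a two-predictor least-squares regression. All function and variable names are hypothetical.

```python
from statistics import mean


def simple_fit(x, y):
    """Least-squares intercept and slope for y regressed on x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return my - slope * mx, slope


def residuals(x, y):
    """Part of y not linearly explained by x."""
    intercept, slope = simple_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def adjusted_weight(predictor, covariate, outcome):
    """Regression weight of predictor on outcome, controlling for covariate."""
    return simple_fit(residuals(covariate, predictor), residuals(covariate, outcome))[1]


# Made-up data: the follow-up score is built with a known GSE-6 weight of 0.5.
gse6_baseline = [10, 14, 18, 12, 20, 16]
health_baseline = [40, 55, 45, 60, 50, 35]
health_followup = [
    2.0 + 0.5 * g + 0.8 * h for g, h in zip(gse6_baseline, health_baseline)
]

b_gse = adjusted_weight(gse6_baseline, health_baseline, health_followup)
print(round(b_gse, 3))
```

Because the follow-up scores are constructed without noise, the sketch recovers the weight of 0.5 that was built into the data; with real data the estimate would come with a standard error and confidence interval, as reported in the Results.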

Results

Sample 1

The items of the General Self-Efficacy Scale were found to be very similar in content, thus limiting the usefulness of the first item selection criterion (selecting items that capture the content of the construct to be assessed as broadly as possible). Instead it appears more reasonable to select those items that show the highest variability in item scores, thus allowing for good discrimination among participants at different levels of the trait.

Table 1 [Tab. 1] shows the means, variances within and between clusters, the intraclass correlation coefficients, the coefficients of variation, and the corrected item-total correlation for the items as well as the discrimination and difficulty parameters estimated from the item response theory model. The intraclass correlation coefficients ranged between .09 and .26, indicating that between 9% and 26% of the total item variance exists between the countries. Item-total correlation coefficients were between .44 and .60. Item discrimination parameters were between 1.11 and 2.16, corresponding to standardized factor loadings between .52 and .77. The item difficulty parameters ranged between –3.50 and +1.05, indicating that the items are better at differentiating at lower levels of the latent trait.
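Two of the quantities in this paragraph follow from simple formulas, sketched here in Python. The intraclass correlation is the between-cluster share of the total variance. For the loadings, the sketch assumes a logistic link with the latent trait variance fixed at 1, in which case a slope _a_ corresponds to a standardized loading of _a_ divided by the square root of (_a_² + π²/3). This assumption reproduces the reported pairs of values, but the source does not state which conversion was used. Function names are illustrative.

```python
import math


def intraclass_correlation(between_variance: float, within_variance: float) -> float:
    """Share of the total variance that lies between clusters (here: countries)."""
    return between_variance / (between_variance + within_variance)


def standardized_loading(discrimination: float) -> float:
    """Convert a logistic-metric slope to a standardized factor loading.

    Assumes a latent trait variance of 1 and a logistic residual variance of pi^2 / 3.
    """
    return discrimination / math.sqrt(discrimination ** 2 + math.pi ** 2 / 3)


for slope in (1.11, 2.16):
    print(slope, round(standardized_loading(slope), 2))
# prints:
# 1.11 0.52
# 2.16 0.77
```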

The six items selected for the short form were those with the highest coefficients of variation. Among the selected items were four items with high discrimination parameters (items 5, 6, 7, and 10) and two items that are able to discriminate at the lower and especially at the upper end of the scale due to low or very high difficulty parameters (items 2 and 3).
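The selection rule described above, ranking items by their coefficient of variation and keeping the top ones, can be sketched as follows. The response data and item names are invented for illustration, and the sketch uses the sample standard deviation of the pooled scores; the source does not specify which variance estimate entered the coefficient of variation.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Standard deviation relative to the mean of the item scores."""
    return stdev(scores) / mean(scores)


def select_items(item_scores, n_items):
    """Return the names of the n_items items with the highest coefficient of variation."""
    ranked = sorted(
        item_scores,
        key=lambda name: coefficient_of_variation(item_scores[name]),
        reverse=True,
    )
    return sorted(ranked[:n_items])


# Invented responses on the 1-to-4 response scale, four respondents per item.
responses = {
    "item_a": [1, 2, 3, 4],
    "item_b": [3, 3, 4, 4],
    "item_c": [2, 4, 2, 4],
    "item_d": [3, 4, 3, 4],
}
print(select_items(responses, 2))  # prints ['item_a', 'item_c']
```

Items with a low mean and a wide spread of responses rank highest under this criterion, which is why it favours items that separate respondents across the range of the trait.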

Cronbach’s alpha for the total sample was .85 for the 10 item scale and .79 for the 6 item scale. While for the 10 item scale the alphas ranged from .75 (India) to .91 (Japan), for the 6 item scale they ranged between .64 (Syria) and .85 (Japan). The correlation between the 10 item sum score and the 6 item sum score for the total sample was .96, ranging between .93 (Portugal) and .97 (Japan) for the different subsamples.
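For readers who want to reproduce this kind of comparison on their own data, the sketch below computes Cronbach's alpha and the correlation between a short-form sum score and the full-scale sum score. The data are invented (four items, five respondents) and the function names are illustrative. Note that a part-whole correlation of this kind is inflated because the short form's items are contained in the full score.

```python
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of per-item score lists of equal length."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5


# Invented responses on the 1-to-4 scale: one inner list per item.
items = [
    [1, 2, 3, 4, 4],
    [2, 2, 3, 3, 4],
    [1, 3, 3, 4, 3],
    [2, 1, 4, 3, 4],
]
full_scores = [sum(person) for person in zip(*items)]
short_scores = [sum(person) for person in zip(*items[:2])]  # first two items only

alpha_full = cronbach_alpha(items)
part_whole_r = pearson_r(short_scores, full_scores)
print(round(alpha_full, 2), round(part_whole_r, 2))
```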

Sample 2

In sample 2, Cronbach’s alpha for the GSE-6 scale was .86, .88, and .88 for the first, second, and third measurement, respectively. Associations of the GSE-6 scores with age were small (_r_=.03, _r_=.07, and _r_=.05, respectively) with slightly higher scores for males (Cohen’s _d_=0.05, _d_=0.11, and _d_=0.05, respectively). The correlations between the baseline GSE-6 score and the baseline scores of the other self-report instruments are shown in Table 2 [Tab. 2]. As expected, the GSE-6 shows negative associations with symptoms of depression and anxiety as well as with vital exhaustion, and positive associations with social support, mental health, and, to a lesser degree, with physical health. With regard to the coping dimensions, the GSE-6 score is positively associated with active problem-focused coping and distraction/self-encouragement and negatively associated with depressive coping and minimization/wishful thinking.
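The sex differences above are expressed as Cohen's _d_. A minimal sketch of the usual pooled-standard-deviation version is given below; the source does not state which variant of _d_ was computed, and the scores are made up.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_variance ** 0.5


# Made-up scores: means of 4 and 3, both groups with a standard deviation of 2.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # prints 0.5
```

By the common rule of thumb, values around 0.2 count as small, so the reported differences of 0.05 to 0.11 are negligible.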

After controlling for the respective SF-36 baseline score, baseline general self-efficacy predicted the mental health summary score and the physical health summary score after 28 months. The regression weight for the mental health summary score was _b_=.49, which is significant at _p_<.001 (_t_=4.76; _n_=733) with a 95% confidence interval from .29 to .70. The regression weight for the physical health summary score was _b_=.22, which is significant at _p_=.029 (_t_=2.18; _n_=733) with a 95% confidence interval from .02 to .42. The prediction of the health summary scores after 12 months did not reach significance. The respective regression weights were _b_=.21 (_t_=1.81; _n_=618; _p_=.07; 95% confidence interval –.02 to .44) for mental health and _b_=.18 (_t_=1.62; _n_=618; _p_=.11; 95% confidence interval –.04 to .41) for physical health.

The relative stability of the self-efficacy score, calculated as a Pearson correlation between the baseline score and the respective follow-up score, over twelve months (_n_=708) and 28 months (_n_=833) was _r_=.50 and _r_=.60, respectively. The mean self-efficacy score did not change over time (_F_=0.03; _p_=.97; _n_=558).

Discussion

We constructed a six item short form of the General Self-Efficacy scale that showed acceptable psychometric properties in different cultures and in non-clinical and clinical samples and good concurrent and predictive validity in a clinical sample of cardiac patients. The internal consistency for the 6 item scale (range alpha=.79 to .88) was only slightly smaller than the value generally observed for the original scale (range alpha=.75 to .94). The retest reliability (.50 to .60) was in the range found for the original scale (.47 to .75). The strength of the associations with measures of well-being (range .11 to .45) and with coping-related measures (range .04 to .34) is comparable to the effect sizes of _r_=.28 and _r_=.28 found in a meta-analysis of six different clinical and non-clinical samples [14].

In comparison with the original ten item scale, the short form can lead to time and resource savings of 40%. On the other hand a reduction in length generally leads to a loss of reliability and validity. These losses are to be traded against the benefits in the context of the intended use of the instrument [17]. Especially in large multivariate studies or for screening purposes the short form may be useful.

There are some limitations to our study. Ideally, the overlap of the short and full form should have been shown using independent administrations, because scoring the long and short form from one administration leads to overestimation of the correlation [17]. In future research, the reliability and validity of the GSE-6 should be tested in other languages and countries as well as using other clinical and non-clinical samples.

Conclusions

The six item short form of the GSE scale is a reliable and valid instrument that may be useful for the economical assessment of general self-efficacy in large multivariate studies and for screening purposes.

Notes

Competing interests

The authors declare that they have no competing interests.

Authorship

The authors Matthias Romppel and Christoph Herrmann-Lingen contributed equally to this work.
