Psychometric Assessment and evaluation Research Papers

- Employee engagement, Reliability, Validity, Psychometric Assessment and evaluation

The study conducted an empirical test of the hypothesis of cognitive mediation of self-awareness through mental imagery and of the relations between iconic mediation and visualization skills, and examined the psychometric qualities of the Situational Self-Awareness Scale and the Mental Imagery Visualization Skills Test, with its two independent series, Self and Non-Self, with a view to their use in future research. The complete ex post facto study was conducted with 958 university students, who completed the instruments individually or in groups; the data were analyzed using factor analysis, Pearson's correlation coefficient, linear regression and SSA-type multidimensional analyses. The analyses corroborated the hypothesis of cognitive mediation of self-awareness by mental imagery, and the hypothesis that iconic mediation has a consistent relationship with the level of development of imaginative skills. Keywords: situational self-awareness, mental imagery, cognitive mediation, visualization skills, psychometric assessment.

Psychometric personality assessments are tools that organisations use to discover the personality of potential hires and to determine whether they will fit into the organisation and the job. Despite the pedigree of psychometric assessments such as the Big Five Inventory, questions are being asked about the reliability of their results. The reasons for the scepticism are the reproducibility of results and social desirability bias. Machine learning technologies can improve the reliability of the results by learning from historical datasets to predict more accurately how well a candidate will fit into an organisation based on personality. This paper explores the current implementation of psychometric tests and suggests integrating machine learning into the workflow. Finally, the author performs a SWOT analysis of the potential implementation.

- by Sreenidhi S K and +1
- Psychology, Personality Psychology, Positive Psychology, Psychometrics

Apathy is a neuropsychiatric symptom observed in different neurological and psychiatric
disorders. Although apathy is considered a symptom, it has been recently reconsidered as a syndrome
characterised by three dimensions: cognitive symptoms, affective symptoms and behavioural
symptoms. Recent studies have shown that apathy can be considered as a prodromal symptom of
Alzheimer’s disease (AD), but also an indicator of the transition from mild cognitive impairment
to AD. According to this scenario, an early detection of apathy in subjects with Mild Cognitive
Impairment (MCI) and Mild AD can be a valid psychometric strategy to improve an early diagnosis
and promote a prompt intervention. The Apathy Evaluation Scale is a validated tool composed
of 18 items that assess and quantify emotional, behavioural and cognitive aspects of apathy. The
aim of this study is to assess the specific reliability and validity of the Italian version of the Apathy
Evaluation Scale—Clinician Version (AES-C) to detect apathy both in amnestic MCI and mild AD
patients. In the present paper, we therefore examined the psychometric properties and the invariance
of the Italian Version of the AES-C conducted on a sample composed of an experimental group of
amnestic MCI and AD patients (N = 107) and a control group (N = 107) constituted by Age- and
Sex-matched healthy controls. Results confirm the goodness of the scale. Confirmatory factor analysis
confirmed that the AES-C Italian Version presents the same stability of one second-order factor
and three first-order factors identified in the original version, and all items are predicted by a single
general factor. Moreover, the scale was found to be invariant across both populations, and
reliability and discriminant analysis showed good values. We found in the experimental group a
negative correlation between the AES-C and Frontal Assessment Battery (FAB) (rs = −0.21, p < 0.001)
and Mini Mental State Examination (MMSE) (rs = −0.04, p < 0.001), while a positive correlation was
found between the AES-C and Hamilton psychiatric Rating scale for Depression (HAM-D) scores
(rs = 0.58, p < 0.001). Overall, our data demonstrated the validity of the Italian version of the AES-C
for the assessment of apathy both in MCI and in AD patients.

Purpose – The purpose of this paper is to measure the reliability, validity and accuracy of the 12-item General Health Questionnaire (GHQ-12) as a measure of emotional wellbeing in pregnant women, its utility and threshold in particular.
Design/methodology/approach – The authors measured self-reported emotional wellbeing responses of 164 low-risk pregnant Dutch women with the GHQ-12 and a dichotomous case-finding item (Gold standard). The authors established internal consistency of the 12 GHQ-items (Cronbach’s coefficient α); construct validity: factor analysis using Oblimin rotation; convergent validity (Pearson’s correlation) and discriminatory ability (area under the receiver operating characteristics curve and index of union); and external validity of the dichotomous criterion standard against the GHQ-12 responses (sensitivity, specificity, likelihood ratios and predictive values), applying a cut-off value of ⩾ 12 and ⩾ 17, respectively.
Findings – A coefficient of 0.85 showed construct reliability. The GHQ-12 items in the pattern matrix showed a three-dimensional factorial model: factor 1, anxiety and depression; factor 2, coping; and factor 3, significance/effect on life, with a total variance of 59 per cent. The GHQ-12 showed good accuracy (0.84; p <0.001) and external validity (r=0.57; p <0.001) when the cut-off value was set at the ⩾ 17 value. Using a cut-off value of ⩾ 17 demonstrated higher sensitivity (72.32 vs 41.07 per cent) but lower specificity (32.69 vs 55.77 per cent) compared to the commonly used cut-off value of ⩾ 12.
Research limitations/implications – Findings generally support the reliability, validity and accuracy of the Dutch version of the GHQ-12. Further evaluation of the measure, at more than one timepoint during pregnancy, is recommended.
Practical implications – The GHQ-12 holds the potential to measure antenatal emotional wellbeing and women’s emotional responses and coping mechanisms with reduced antenatal emotional wellbeing.
Social implications – Adapting the GHQ-12 cut-off value enables effective identification of reduced emotional wellbeing to provide adequate care and allows potential reduction of anxiety among healthy pregnant women who are incorrectly screened as positive.
Originality/value – A novel aspect is adapting the threshold of the GHQ-12 to ⩾ 17 in antenatal care.
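
The accuracy measures named in the abstract above (sensitivity, specificity, likelihood ratios and predictive values at a chosen cut-off) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `screening_accuracy`, the `ScreeningAccuracy` container, and the scores and labels are all hypothetical, and a respondent is assumed to screen positive when the score is at or above the cut-off.

```python
from dataclasses import dataclass


@dataclass
class ScreeningAccuracy:
    sensitivity: float
    specificity: float
    positive_lr: float
    negative_lr: float
    ppv: float
    npv: float


def screening_accuracy(scores, is_case, cutoff):
    """Compare the rule 'score >= cutoff' against a dichotomous criterion standard.

    Assumes every cell of the 2x2 table is non-empty; otherwise a ratio
    below divides by zero.
    """
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, c in pairs if s >= cutoff and c)      # true positives
    fn = sum(1 for s, c in pairs if s < cutoff and c)       # false negatives
    fp = sum(1 for s, c in pairs if s >= cutoff and not c)  # false positives
    tn = sum(1 for s, c in pairs if s < cutoff and not c)   # true negatives
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return ScreeningAccuracy(
        sensitivity=sens,
        specificity=spec,
        positive_lr=sens / (1 - spec),
        negative_lr=(1 - sens) / spec,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


# Hypothetical questionnaire scores and criterion labels, for illustration only.
scores = [20, 18, 17, 11, 9, 19, 13, 8, 6, 5]
is_case = [True, True, True, True, False, False, False, False, False, False]
result = screening_accuracy(scores, is_case, cutoff=17)
print(result)
```

Running the same function with two different `cutoff` values on the same data gives the kind of side-by-side comparison the abstract reports for its two thresholds.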

- by Yvonne Kuipers
- Psychometric Assessment and evaluation
- by Margaret Price
- Marketing, Psychology, Teaching and Learning, Assessment

SUMMARY Problem solving is a necessary strategy for an individual's functional adaptation to different life circumstances, whether in the face of acute or chronic stressful events. The objective of this work was to determine the psychometric properties of the short version of the Social Problem-Solving Inventory-Revised in a convenience sample of Mexican university students. Participants were 330 students from one public and one private university, of whom 283 were women and 92 men, with a mean age of 22 years. In addition to the SPSI-R, participants completed the Problem Solving Inventory of Heppner and Petersen. Factor analysis identified four factors forming a 25-item instrument, with a structure similar to the original version. The internal consistency of the global scale showed a satisfactory index. The Cronbach's alpha coefficients of the subscales had a value that explains 50.22% of the variance. Concurrent validity showed significant results. The authors conclude that the SPSI-R showed adequate psychometric characteristics in a Mexican sample, yielding four factors. The relevance of the results lies in the fact that this is an instrument that identifies different strategies for resolving stressful events. Keywords: Anxious and impulsive/careless and avoidant-depressive; Rational problem solving; Mexican population; Psychometric properties. ABSTRACT Problem solving is a necessary strategy for the functional adaptation of the individual in widely diverse circumstances of life, for either acute or chronic stressful events. The objective of the present study was to determine the psychometric properties of the short version of the Social Problem Solving Inventory-Revised (SPSI-R) in a convenience sample of Mexican university students. 
A total of 330 students from a public and a private university participated, including 283 females and 92 males with an average age of 22.12 years. Participants answered both the SPSI-R and the Problem Solving Inventory of Heppner and Petersen. Factor analysis identified four factors and an additional indicator. Items ended up composing an instrument of 26 items, similar to those of

The psychometric rigor of Emotional Regulation (ER) scales used in the Spanish-speaking population was reviewed. Electronic databases were searched to locate articles from specialized journals in Latin America and Spain that reported the construction or adaptation of ER instruments in Spanish. The purpose, definition of ER and main characteristics of each instrument were then described; finally, their psychometric quality was evaluated. Of 24 scales, seven were constructed in Spanish and 17 were adapted from 10 originals in English. The scales were aimed at children, adolescents and adults, with an average of 33 items and five dimensions, and were applied to an average of 320 participants. The main emotional regulation strategies were cognitive ones. Construct validity was the most frequently reported psychometric criterion. Quality scores were higher for original scales in English. The best scores were for the adaptations of the DERS, ERQ and CERQ scales, and the ERQ in English.

This study assesses the psychometric properties of the Malaysian National Philosophy of Education (NPE) Scale. Based on the literature reviewed, the study argues that no empirical validation of the scale has been performed since the emergence of the NPE. The study therefore sampled 230 participants in secondary schools in Kuching, Malaysia to develop and validate the scale. The results demonstrate that the NPE comprises eight distinct factors. The fit statistics of the NPE eight-factor model demonstrate that the model fitted the data. This study is deemed the first of its kind to devise the NPE instrument quantitatively, assess its psychometric properties, and establish evidence for the composite reliability and the construct, convergent and discriminant validity of the instrument across selected secondary schools. Theoretically, the NPE quantitative measures contribute a new body of knowledge to the NPE in the context of Malaysia, beyond the conceptual dimensions proposed by the Ministry of Education. Practically, the results provide school principals and the Ministry of Education with appropriate tools to assess the extent to which school teachers translate and infuse the ideas of the NPE into day-to-day lesson delivery.

- by Shafeeq Hussain and +1
- Psychometric Assessment and evaluation, National Philosophy of Education, Malaysia
- by Maheswar Satpathy
- India, Psychological Testing, Children, Psychometric Assessment and evaluation

The purpose of this study is to develop the original Enneagram Turkey Personality Inventory and to conduct its validity and reliability studies in a Turkish sample. The inventory was administered in 3 stages to 1110 people, 723 women (65.13%) and 387 men (34.86%), aged 17-62 (23.23 ± 6.51). The Enneagram Turkey Inventory is a profile inventory, like the MMPI, that yields an index for each subscale; it was subjected to principal components analysis with varimax rotation. As a result of the factor analysis, the final form of the inventory, with 54 questions in 9 sub-factors, was developed. In the development process of the scale, Lawshe analysis, exploratory factor analysis, reliability analysis, test-retest and convergent validity studies, and confirmatory factor analysis (CFA) were conducted and reported. In the reliability analysis, Cronbach's alpha was 0.89 for the total scale and between 0.60 and 0.78 for the subscales; item-total correlations were above 0.30 for each subscale. Correlation coefficients in the test-retest analysis were between 0.20 and 0.57. In addition, in the CFA conducted in the last study of the scale, the fit indices of the tested model were found to be appropriate [χ2 = 2545.804, df = 1027, p = .000; χ2/df = 2.47; RMSEA = 0.06; SRM = 0.07; GFI = 0.97; AGFI = 0.94; CFI = 0.92; IFI = 0.92; NFI = 0.91; RFI = 0.87]. In accordance with the validity and reliability findings obtained, the Enneagram Turkey Personality Inventory (ETPI) is understood to be a valid and reliable scale. Structured Abstract: Introduction The Enneagram, on which much work has been done around the world, is a personality typology system that is also gradually gaining awareness in Turkey. It is a personality theory consisting of nine different patterns of thinking, feeling and acting, whose types basically diverge from each other but have a dynamic relationship (Daniels and Price, 2004, p. 13). 
The word Enneagram is a combination of the Greek words "ennea", meaning "nine", and "grammos", in the sense of "points". The "Enneagram Scheme", symbolized by a nine-pointed star, shows the development of an event from its initial moment through all its stages in the world (Palmer, 2014: 25). Gurdjieff says that the Enneagram is built on three centers: thought, emotion and movement (Ouspensky, 2010, p. 101).
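
The internal-consistency coefficient reported in the abstract above, Cronbach's alpha, follows directly from item variances and the variance of the total score. The sketch below is a minimal illustration on made-up data; the function name `cronbach_alpha` and the response matrix are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)).
    Population variances are used throughout; sample variances give the same
    ratio because the n / (n - 1) factor cancels.
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items, for illustration only.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Applying the same function to the items of one subscale at a time gives subscale coefficients, which is how a total-scale value and a range of subscale values can be reported side by side.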

- by Turgay ŞİRİN
- Psychology, Personality Psychology, Psychometrics, Personality

Psychology has many applications, including psychological testing. In this paper, an attempt is made to identify the types of tests used in Ghana and to discuss the problems associated with the current state of psychological testing in Ghana. It was concluded that the current state of psychological testing has been too Eurocentric and Westernized, which limits the applicability and usefulness of the tests in the Ghanaian setting. After this critical evaluation, suggestions were made for improving psychological testing through the construction of Ghana-centric tests and the validation of imported tests. Though this paper focuses on Ghana, the discussions and recommendations are expected to be equally relevant for other non-European and non-American populations of the world.

- by Seth Oppong, PhD
- Psychology, Applied Psychology, Clinical Psychology, Psychological Assessment

Abstract
In this paper we present a new instrument called Social Skills Questionnaire for Argentinean College Students (SSQ-U). Based on the adapted version of the Social Skills Inventory - Del Prette (SSI-Del Prette) (Olaz, Medrano, Greco, & Del Prette, 2009), we wrote new items for the scale, and carried out psychometric analysis to assess the validity and reliability of the instrument. In the first study, we collected evidence based on test content through expert judges who evaluated the quality and the relevance of the items. In the second and third studies, we provided validity evidence based on the internal structure of the instrument using exploratory (n = 1067) and confirmatory (n = 661) factor analysis. Results suggested a five-factor structure consistent with the dimensions of social skills, as proposed by Kelly (2002). The fit indexes corresponding to the obtained model were adequate, and composite reliability coefficients of each factor were excellent (above .75). Finally, in the fourth study, we provided evidence of convergent and discriminant validity. The obtained results allow us to conclude that the SSQ-U is the first valid and reliable instrument for measuring social skills in Argentinean college students.
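
The composite reliability coefficients mentioned in the abstract above are usually derived from the standardized factor loadings of a confirmatory model. The following is a minimal sketch under two stated assumptions: loadings are standardized and errors are uncorrelated, so each error variance is 1 minus the squared loading. The function name `composite_reliability` and the loadings are hypothetical.

```python
def composite_reliability(loadings):
    """Composite reliability of one factor from standardized loadings.

    CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    with error variance assumed to be 1 - loading^2 for each item.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)


# Hypothetical standardized loadings for a four-item factor.
factor_loadings = [0.70, 0.80, 0.60, 0.75]
cr = composite_reliability(factor_loadings)
print(round(cr, 3))
```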

- by Zilda Del Prette and +1
- Psychology, Psychometry, Assessment, Psychometrics

Twenty-first-century students need twenty-first-century methods of teaching inside or around twenty-first-century classrooms. In view of this fact, pre-service and in-service teachers need to be kept abreast of certain facts around...

This document is not a scientific publication. It simply provides an introduction to psychometrics, in particular to the process of validating the metrological properties of a measurement scale. It is a...

- by Una Daly
- Psychology, Social Psychology, HRM & Organisational Behaviour, Interviewing

ABSTRACT The objective of the study is to obtain information for the Cuban population about the predictive validity of the EADG for detecting people with psychopathological disorders, as well as for differentiating anxiety and depression. The study used a non-probabilistic sample of 548 subjects, of whom 31.2% had a psychopathological disorder diagnosed by a psychiatrist or psychologist. The data were analyzed using ROC curve analysis. The EADG showed an adequate predictive value for identifying people with psychopathological disorders, but little capacity to distinguish anxiety disorders from depressive disorders. These results support the evaluative strategy recommended by the authors of applying the screening items first.
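
The ROC analysis mentioned in the abstract above is often summarized by the area under the curve (AUC). A minimal sketch follows, using the equivalent rank interpretation: the AUC is the probability that a randomly chosen diagnosed person scores higher than a randomly chosen undiagnosed person, with ties counted as one half. The function name `roc_auc` and the data are hypothetical and are not the study's.

```python
def roc_auc(scores, has_disorder):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [s for s, d in zip(scores, has_disorder) if d]
    non_cases = [s for s, d in zip(scores, has_disorder) if not d]
    wins = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                wins += 1.0   # case ranked above non-case
            elif case_score == other_score:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(non_cases))


# Hypothetical screening scores and diagnoses, for illustration only.
scale_scores = [8, 6, 7, 3, 5, 2, 4, 1]
diagnosed = [True, True, True, True, False, False, False, False]
auc = roc_auc(scale_scores, diagnosed)
print(auc)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups. The pairwise loop is quadratic in sample size, which is acceptable for a sketch but not for large datasets.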

In recent years, the number of published works and specialized journals concerned with spiritual and religious phenomena has grown considerably, forming an area known as the psychology of religion. However, one of the main obstacles in the field has been the psychometric assessment of numinous constructs. The purpose of the present work is to review the literature on scales adapted and validated in Argentina, using the PsycInfo, ERIC, Pubmed, CAIRN, CLASE, Scielo, Dialnet, Lilacs and Redalyc databases. According to the literature, six scales adapted to the local context have been identified. Although the instruments have shown acceptable psychometric properties, numerous difficulties have been identified in the adaptation process, which have involved the elimination of items. It is suggested that more studies be conducted to help strengthen these techniques and to adapt and validate other psychological assessment scales.

- by Hugo Simkin
- Religion, Psychology, Psychological Assessment, Psychometrics

Every examiner or individual using test scores should be confident that the scores obtained by a test taker are a true indication of that person's level of knowledge or ability on the construct of interest, with a proper guard against factors that can lead to score invalidity. This study determined the effect of test item compromise and test item practice on Economics Achievement Test scores among secondary school students in Cross River State. It also examined whether test item compromise and test item practice affected the validity of test scores obtained in the Economics Achievement Test among secondary school students in the state. A quasi-experimental research design was adopted for the study. The population of the study consisted of all secondary schools in the 18 Local Government Areas (LGA) of Cross River State. The sample consisted of 90 SS2 Economics students randomly selected from the three secondary schools used for the study, which was carried out in intact classrooms. There were three groups, and 30 respondents were randomly assigned to each group. The three groups were the compromise group (E1), the practice group (E2) and the control group (C1). The instrument used for the study was an Economics Achievement Test (EAT) developed by Shogbesan (2017), which consisted of 25 items of various formats with a reliability index of 0.68. The EAT was administered to the three groups. As treatments, 13 items were exposed to E1 so that the students could become familiar with some of the test items a few minutes before the test, while the practice group was given the 13 test items to practice with the help of the researchers and research assistants, who are Economics teachers in the school. The results indicated, among others, that students' scores were inflated on compromised and practiced test items, which contributed to score invalidity. It was recommended, among others, that the security of test items should be considered vital before and during the test administration process.

This article describes the steps and phases involved in constructing a questionnaire on the motivation to learn a second or foreign language (or L2 motivation). It evaluates the psychometric properties of the instrument by performing exploratory and confirmatory factor analyses. Participants in this study were 194 students. The results of the exploratory factor analysis indicated that four dimensions formed the construct of L2 motivation, namely Instrumental orientation, Integrative orientation, Commitment and Effort. The findings from a subsequent confirmatory factor analysis validated the four-dimensional structure. The conclusions reached by this study, as well as the steps and phases involved in the development of a research instrument on L2 motivation, could be informative for educators interested in issues related to language learning motivation and useful for future scholarly investigations of L2 motivation.

- by Larisa Nikitina and +1
- Teaching of Foreign Languages, Russian Language, Psychometric Assessment and evaluation, Instrument Development
- by Masha Tkatchouk and +1
- Test Validity, Psychometric Assessment and evaluation, Rating Scales, Questionnaires

14 Psychological Tests Developed for Children in India: A Review of Recent Trends in Research, Practice and Application Maheshwar Satapathy The ... Academic anxiety scale for children is another test developed for the same function by AK...

- by Maheswar Satpathy
- India, Clinical Child Psychology, Psychological Testing, Children

Abstract This paper examined the construct validity of the Social Skills Rating System-Brazilian version (SSRS-BR) for children with intellectual disabilities, based on the relationships between the constructs of social skills, behavior problems and academic competence. Participants were 84 children with intellectual disabilities and their teachers, who completed the SSRS-BR. Positive and significant correlations were found between the global scales and all subscales of the self-assessment and teacher-assessment instruments. Furthermore, negative and significant correlations were found between the social skills and behavior problems scales, and positive correlations between the academic competence and social skills scales. The results corroborate the relationships found in the literature between the three constructs and attest to the feasibility of applying the SSRS-BR to children with intellectual disabilities.

- by Zilda Del Prette
- Psychometry, Education, Psychometrics, Emotional/Behavioral Disorders

Computerized adaptive tests (CATs) have received wider attention throughout the world in recent years, mainly because the sound psychometric principles on which they are based allow them to estimate examinee ability with higher precision using fewer items. This attention creates a need for platforms on which CATs can be developed, piloted and administered reliably. In this article, FastTest, a platform specifically designed to facilitate the development and delivery of CATs, will be reviewed. This review article begins by describing the functionality available in FastTest, with a focus on adaptive testing functionality and related psychometrics. Then, two examples of FastTest being used to develop and deliver CATs, which demonstrate that CAT is feasible for many organizations, will be discussed.
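
To illustrate why adaptive tests can reach a given precision with fewer items, the sketch below shows one common item-selection rule: administer the unused item with the greatest Fisher information at the current ability estimate. It assumes a two-parameter logistic (2PL) model and a made-up item bank; it is a generic illustration and is not a description of how FastTest itself selects items. All function names and parameters are hypothetical.

```python
import math


def prob_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, item_bank, administered):
    """Return the id of the unused item that is most informative at theta."""
    candidates = [item for item in item_bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, *item_bank[item]))


# Hypothetical bank: item id -> (discrimination a, difficulty b).
item_bank = {
    "q1": (1.0, -1.5),
    "q2": (1.2, 0.0),
    "q3": (0.8, 0.2),
    "q4": (1.5, 2.5),
}
first_item = select_next_item(0.1, item_bank, administered=set())
second_item = select_next_item(0.1, item_bank, administered={first_item})
print(first_item, second_item)
```

In a full CAT loop the ability estimate would be updated after each response before the next selection; that step is omitted here to keep the sketch short.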

- by Hector Hurtado Grooscors and +2
- Psychometrics, Psychometric Assessment and evaluation, Psychometrics and Test Development

In this paper, we discuss the benefits of Bayesian statistics in studies of moral education and how to use them. To give concrete examples of their application, we reanalyzed two previously collected datasets: one small dataset from a moral educational intervention experiment, and one large dataset from a large-scale Defining Issues Test-2 survey. Results suggest that Bayesian analysis of datasets collected from moral educational studies can provide additional useful statistical information, particularly about the strength of evidence supporting alternative hypotheses, which is not provided by the classical frequentist approach focusing on P-values. Finally, we introduce several practical guidelines for scholars in the field of moral education on how to use Bayesian statistics, including the use of newly developed free statistical software, Jeffreys's Amazing Statistics Program (JASP), and thresholding based on Bayes Factors.
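
As a small worked example of a Bayes factor and of thresholding on it, the sketch below compares two hypotheses about a success rate: H0 fixes the rate at 0.5, while H1 places a uniform prior on it. Under the uniform prior, the marginal likelihood of any count out of n trials is 1 / (n + 1), so the Bayes factor has a closed form. The function names, the data, and the cut-points of 3 and 10 (common rules of thumb, not taken from the paper) are assumptions for illustration.

```python
from math import comb


def bayes_factor_10(successes, trials, null_rate=0.5):
    """BF10 for a binomial rate: uniform prior under H1 vs. a point null under H0."""
    marginal_h1 = 1.0 / (trials + 1)  # uniform prior integrates the binomial to 1 / (n + 1)
    likelihood_h0 = (
        comb(trials, successes)
        * null_rate ** successes
        * (1 - null_rate) ** (trials - successes)
    )
    return marginal_h1 / likelihood_h0


def evidence_label(bf10):
    """Label a Bayes factor using rule-of-thumb thresholds (assumed: 3 and 10)."""
    if bf10 >= 10:
        return "strong evidence for H1"
    if bf10 >= 3:
        return "moderate evidence for H1"
    if bf10 > 1 / 3:
        return "inconclusive"
    return "moderate or stronger evidence for H0"


# Hypothetical data: 17 of 20 participants improved after an intervention.
bf_improved = bayes_factor_10(17, 20)
# Hypothetical data: exactly half of 20 participants improved.
bf_half = bayes_factor_10(10, 20)
print(evidence_label(bf_improved), "|", evidence_label(bf_half))
```

Unlike a P-value, the second result quantifies support for the null hypothesis itself, which is the kind of additional information the abstract attributes to the Bayesian approach.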

- by Joonsuk Park and +1
- Psychology, Psychometry, Quantitative Psychology, Social Psychology
- by Susanmarie Harrington
- Marketing, Psychology, Grading, World Wide Web

The goal of the present study is to estimate the psychometric properties of the Sensitivity to Punishment and Sensitivity to Reward Questionnaire (SPSRQ; Torrubia, Ávila, ) in a sample of Chilean college students. The main hypothesis is that the instrument would show appropriate levels of reliability and validity, in light of previous validation studies. A pilot study was conducted to generate the adapted version of the questionnaire, which was then administered to a sample of students from different undergraduate programs (n = 434). The results show the expected levels of reliability (test-retest and internal consistency). The factorial validity does not comply with the expected model, suggesting that the structure of the questionnaire needs further consideration. External validity is appropriate, as the questionnaire shows the expected correlations with other personality measures. It is concluded that the SPSRQ is adequate for the context of validation, and this study contributes to the generalization of the questionnaire, since the results are consistent with the expected psychometric properties that have been reported in the literature.

This study sought to establish the construct validity of an instrument for measuring anxiety. The researchers used a four-point questionnaire and a seven-point Osgood semantic differential scale on depression to ascertain convergent validity, while two instruments measuring aggression were employed to establish the divergent trait with anxiety using a multitrait-multimethod matrix. Face validity was assessed by experts in Educational Measurement and Evaluation. Cronbach's alpha reliability estimates for internal consistency of the items yielded 0.76, 0.98 for anxiety measures; 0.74, 0.85 for depression measures and 0.63, 0.79 for aggression measures respectively. The Pearson product-moment correlation (PPMC) coefficient was used to test the hypotheses. A sample of thirty Senior Secondary III students of University of Nigeria Demonstration Secondary School was purposively selected for the study. The results demonstrated moderate convergence (r = 0.20, 0.49, 0.39 for measures of anxiety, depression and aggression respectively) between two different methods of the same trait. Measures assessing anxiety and depression could be distinguished from measures assessing aggression. In conclusion, the rejection of the first hypothesis and the retention of the second and third hypotheses based on the correlation confirm the convergent and divergent validities of the instruments; therefore, the instruments for measuring anxiety were deemed valid and reliable.

The aim of this study was to describe the process of psychometric analysis of the Teacher Observation of Classroom Adaptation-Revised Scale (TOCA-R) for its use in Brazilian schools and to evaluate its validity and reliability. The TOCA-R was used to evaluate the "Elos Program", which is the Brazilian culturally adapted version of the North American program "Good Behavior Game". The researchers adapted the instrument in 2014; it consists of 33 items on a three-point ordinal response scale. A longitudinal quasi-experimental design with a single group was used. Participants were children aged 6 to 10 years evaluated by their teachers before (n = 1448) and after (n = 673) the implementation of the Elos Program in 2014. The study initially involved four schools, 68 classes and their respective teachers. The analytical procedures were exploratory factor analysis, confirmatory factor analysis, longitudinal invariance analysis and reliability analysis by precision coefficients. The results of the exploratory factor analysis showed an acceptable fit for five factors with 25 items, with a total explained variance of 60% and a mean residual error of 0.02. The confirmatory factor analysis showed a satisfactory fit of the model (χ2 = 961, df = 265, RMSEA = .078, 95% CI [.07, .08], and CFI = 0.9). Configural, metric and scalar invariance of the latent structure was identified, which, together with the range of the precision coefficients across the instrument dimensions (α = .78, .92; ω = .76, .92), provides evidence of validity and reliability for using the TOCA-R in evaluating the Elos Program in Brazilian schools.

- by Daniela Ribeiro Schneider and +1
- Psychometry, Psychometrics, Child and adolescent mental health, Mental Health

The main aim of this study is to present a set of IRT-based operational solutions aimed at dealing with specific measurement issues emerging from a national-level large-scale program for the assessment of student achievement. In particular, we focused on specific problems related to measurement invariance and test dimensionality. Analyses were performed on data from the Italian 8th-grade math examinations implemented by INVALSI. Results indicated only negligible differential item functioning in the examined tests, while minor indications of multidimensionality emerged. The chosen analytical approach represents a practical and economical solution for the validation of problematic large-scale response data.

- by Davide Marengo
- •
- Psychometric Assessment and evaluation

In everyday conversation, terms such as "intelligence", "level of intelligence" and "intelligent" are used directly or indirectly. Intelligence level is taken as a reference when describing people: we classify those we like and admire as "intelligent" or "clever", and those we dislike as "not in their right mind" or "dim-witted". What criterion do we use when making this classification? All of these descriptions assume that intelligence is variable, that it can be quantified and that it can be measured without error. From Francis Galton to Binet and Terman, many measures have been developed to make it possible to measure intelligence. Standardized versions of these scales, which have made the concept of IQ (intelligence quotient) an inseparable part of daily life, continue to be widely used today. However, whether IQ tests actually measure intelligence is still debated.
This oral presentation aims to make a contribution, however small, to the ongoing debates on intelligence, the concept of IQ and what intelligence tests measure.

- by Masha Tkatchouk and +1
- •
- Test Validity, Psychometric Assessment and evaluation, Rating Scales, Questionnaires

Training programmes are evaluated to verify their effectiveness, assess their ability to achieve their goals and identify the areas that require improvement. Therefore, the target of evaluators is to develop an appropriate framework for evaluating training programmes. This study adapted Kirkpatrick’s four-level model of training criteria published in 1959 to evaluate training programmes for head teachers according to their own perceptions and those of their supervisors. The adapted model may help evaluators to conceptualise the assessment of learning outcomes of training programmes with metrics and instruments. The model also helps to determine the strengths and weaknesses of the training process. The adaptation includes concrete metrics and instruments for each of the four levels in the model: reaction criteria, learning criteria, behaviour criteria and results criteria. The adapted model was applied to evaluate 12 training programmes for female head teachers in Saudi Arabia. The s...

Instruments for assessing anxiety and depression are useful for diagnosis and for guiding the clinical management of the emotional changes brought about by the experience of cancer. The present study compared the psychometric advantages and disadvantages of instruments commonly used in specialized oncology services: the Hospital Anxiety and Depression Scale (HADS), the Generalized Anxiety Disorder scale (GAD-7) and the Patient Health Questionnaire (PHQ-9). Participants were 200 patients diagnosed with cancer and undergoing chemotherapy, 30.5% men and 69.5% women, aged 18 to 89 years (M = 56.8; SD = 15). The instruments showed reliability coefficients ranging from 0.74 to 0.84. The psychometric characteristics studied indicated better values for the HADS-D and the GAD-7. However, the HADS-A and the PHQ-9 also proved adequate for assessing anxiety and depression. The adoption of these instruments is suggested for screening, diagnosis and monitoring of patients with cancer, especially in the psychological and social domains. Keywords: Cancer; Anxiety; Depression; Psychometrics; Statistical measures.

- by Jacob A. Laros
- •
- Psychometrics, Depression, Cancer, Anxiety

Using data for 426 instructors at the University of Maine, we examined the relationship between RateMyProfessors.com (RMP) indices and formal in-class student evaluations of teaching (SET). The two primary RMP indices correlate substantively and significantly with their respective SET items: RMP overall quality correlates r = .68 with SET item, Overall, how would you rate the instructor?; and RMP ease correlates r = .44 with SET item, How did the work load for this course compare to that of others of equal credit? Further, RMP overall quality and RMP ease each correlates with its corresponding SET factor derived from a principal components analysis of all 29 SET items: r = .57 and .51, respectively. While these RMP/SET correlations should give pause to those who are inclined to dismiss RMP indices as meaningless, the amount of variance left unexplained in SET criteria limits the utility of RMP. The ultimate implication of our results, we believe, is that higher education institutions should make their SET data publicly available online.

- by Theodore Coladarci
- •
- Psychometric Assessment and evaluation
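The RMP/SET abstract above notes that the variance left unexplained limits the utility of RMP. As a rough illustration of that point, the sketch below squares each reported correlation to get the proportion of shared variance. The correlations are taken from the abstract; the function name `shared_variance` and the dictionary labels are illustrative choices, not part of the study.

```python
# Correlations reported in the abstract (RMP index vs. the matching SET measure).
reported_r = {
    "RMP overall quality vs. SET item": 0.68,
    "RMP ease vs. SET item": 0.44,
    "RMP overall quality vs. SET factor": 0.57,
    "RMP ease vs. SET factor": 0.51,
}


def shared_variance(r: float) -> float:
    """Proportion of variance two variables share, i.e. r squared."""
    return r ** 2


for label, r in reported_r.items():
    shared = shared_variance(r)
    print(f"{label}: r = {r:.2f}, shared = {shared:.1%}, unexplained = {1 - shared:.1%}")
```

Even the strongest correlation (r = .68) corresponds to less than half of the variance being shared, which is consistent with the abstract's caution.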

ABSTRACT. This study aimed to verify the factor structure of the Brazilian version of the Social Skills Rating System (SSRS-BR) with an extended sample of participants, and then to test that structure through confirmatory factor analyses in another subset of the data. The analyses were based on a total sample of 942 evaluations of children between six and thirteen years of age, 817 evaluations of teachers and 562 evaluations of parents, residents of four Brazilian states. The exploratory factor analysis, performed on half of the data, indicated a five-factor structure for the parents' social skills scale and a four-factor structure for the teachers' and students' scales. For the behavior problems scales, three factors were found in the instrument for parents and two factors in the instrument for teachers. Confirmatory factor analysis showed satisfactory indices for the three instruments after some items were removed and some re-specifications were made. Keywords: social skills, behavior problems, academic competence, evaluation scale, factor analysis.

- by Zilda Del Prette
- •
- Psychology, Psychological Assessment, Psychometry, Education

It is well known that software development projects tend to be based on over-optimistic cost estimates. Better knowledge about software cost estimation is necessary to improve realism in software development project bids and budgets. In my master's thesis, I carried out a literature review which indicates that many research papers address software cost and effort estimation, but none of the 150 papers I reviewed addressed software test effort and/or cost estimation. I therefore prepared a set of five research questions on software test effort estimation, conducted a case study and collected empirical evidence from software development companies in Nepal. The minimum company size was 30 while the maximum company size was 200. I performed the case study by conducting interviews with a set of structured questionnaires. I compared the results obtained from the case study with the literature review and found that companies do base their verification, validation and testing cost/effort estimates on empirical evidence. I also noted that test effort estimation follows the same pattern as software development project estimates. My results show that 1) all the companies prepare separate estimates for test effort, 2) empirical data is commonly used to estimate test effort, and 3) test effort estimation error seems to be closely correlated with development effort estimation error. A company that had estimated a total of 3500 man-months actually spent 4200 man-months, implying an effort/cost overrun of 700 man-months to complete the project. Another company that projected a testing effort of 100 man-hours ended up at 120 man-hours by the end of the project, an effort/cost overrun of 20 man-hours. My study therefore indicates that test effort closely follows the development patterns. However, more studies in this area are clearly needed.

- by Lava Kafle
- •
- Psychometric Assessment and evaluation
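The two overrun figures quoted in the software test effort abstract above can be put on a common footing by expressing each overrun as a fraction of its estimate. The sketch below does only that arithmetic. The helper name `overrun` and the case labels are hypothetical, and only the four effort figures come from the abstract.

```python
def overrun(estimated: float, actual: float) -> tuple[float, float]:
    """Return (absolute overrun, overrun as a fraction of the estimate)."""
    if estimated <= 0:
        raise ValueError("estimated effort must be positive")
    absolute = actual - estimated
    return absolute, absolute / estimated


# Figures quoted in the abstract; the labels are illustrative.
cases = {
    "total project effort (man-months)": (3500, 4200),
    "testing effort (man-hours)": (100, 120),
}

for label, (estimated, actual) in cases.items():
    absolute, relative = overrun(estimated, actual)
    print(f"{label}: overrun = {absolute:g} ({relative:.0%} of the estimate)")
```

Although the two examples come from different companies and use different units, both overruns amount to 20% of the original estimate, in line with the abstract's observation that test effort error tracks development effort error.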
- by Luca P Ardigò
- •
- Psychology, Sport Psychology, Psychometrics, Medicine

- by Tiziana Maci
- •
- Mild Cognitive Impairment, Multidisciplinary, Environmental public health, Apathy

Background: Although studies in developed countries have repeatedly reported the positive influence of resilience on the ability of family caregivers to withstand the burden of caring for their relatives, no literature is currently available on the construct, or on the factors associated with resilience, among the family caregivers of Nigerian psychiatric patients.

- by olutayo aloba and +1
- •
- Psychometric Assessment and evaluation

Background & Aims: Sexual satisfaction is considered one of the basic physiological needs, with a significant impact on the health of individuals and society. To understand this concept better and to deal with the crises and issues arising from it, a specific questionnaire for measuring sexual satisfaction among Iranian couples is required. The present study aimed to assess the validity and reliability of the Persian version of the Index of Sexual Satisfaction in couples in 2013.
Methods: In this methodological study, 150 Iranian couples living in Qazvin completed the 25-item Larson Sexual Satisfaction Questionnaire. Reliability was determined by calculating Cronbach's alpha coefficient and intra-class correlation coefficients. Exploratory and confirmatory factor analyses were carried out using SPSS-AMOS 22.
Results: Cronbach's alpha values for all positive and negative items were above 0.70. Exploratory principal components analysis with Varimax orthogonal rotation and an eigenvalue cut-off of 1.0 produced three factors that explained more than 42.73% of the variance. Confirmatory factor analysis confirmed the final factor structure of the Larson Sexual Satisfaction Questionnaire.
Conclusion: The Persian version of the Larson Sexual Satisfaction Questionnaire has suitable validity and reliability for use among Iranian couples. The factor analysis demonstrated that the questionnaire has a multi-dimensional structure. Given its sound psychometric characteristics, this questionnaire can be used to measure sexual satisfaction in this population.
Keywords: Larson Sexual Satisfaction questionnaire, Validity, Reliability
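Several abstracts on this page, including the one above, report Cronbach's alpha as their reliability coefficient. For readers unfamiliar with it, here is a minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). The response matrix is invented toy data, not data from any study listed here, and the function name `cronbach_alpha` is an illustrative choice.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 5 respondents answering 3 items.
toy_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}, above 0.70: {alpha > 0.70}")
```

The 0.70 comparison mirrors the threshold mentioned in the Results above; it is a common rule of thumb rather than a fixed standard.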

The psychometric properties of Davis's (1980) Interpersonal Reactivity Index (IRI) in Chile were assessed. The IRI was administered to a sample of 435 college students. Internal consistency and test-retest stability were adequate. The instrument's validity was supported by the interrelations among its scales and by correlations in the predicted direction with other related psychological constructs; sex differences emerged in three of its dimensions. A confirmatory factor analysis corroborated the theoretical structure of the IRI in Chile and the suitability of both the four-factor model and a second-order factor that integrates three of the dimensions. The implications of the results and their comparison with other adaptations of the IRI are discussed.