Rasch Analysis Research Papers - Academia.edu

2025, Journal of Science Education, ESUT

The poor state of secondary school students' achievement in Physics relative to policy expectations triggered the study. At the level of instrument quality, the study examined the scoring pattern in the West African Senior School Certificate Examination (WASSCE) Physics essay marking scheme, the difficulty-parameter level of the items, and differential item functioning. An instrumentation research design was adopted. One hundred and sixty senior secondary III Physics students, 63 female and 97 male, formed the sample. The 2018 WASSCE Physics essay with its marking scheme was the instrument used to collect data. The internal consistency reliability of the instrument, estimated using Kuder-Richardson formula 21, was 0.81. Three research questions and one null hypothesis guided the study. Research questions were answered using frequencies, mean infit/outfit statistics, mean item difficulty and mean standard error. The null hypothesis was tested at the .05 level of significance using the Mantel-Haenszel chi-square probability value. The results indicated that fractional-polarization scoring was adopted for two items. Twenty-five of the 28 items fit the partial credit model. The items were moderately difficult and did not exhibit gender-based differential item functioning. It is recommended that the West African Examinations Council remove fractional-dichotomization scoring and break up the marks allocated to an item into more than two groups for all items in future Physics essay marking schemes.
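As a rough illustration of the reliability coefficient named in this abstract, the following sketch computes Kuder-Richardson formula 21 from summary statistics. The function name and the input values are hypothetical and are not taken from the study; KR-21 in this textbook form assumes dichotomously scored items of roughly equal difficulty, and the abstract does not say how the formula was applied to essay marks.

```python
def kr21(n_items: int, mean_score: float, score_variance: float) -> float:
    """Kuder-Richardson formula 21 from the number of items (k),
    the mean total score (M) and the variance of total scores.

    KR-21 = k / (k - 1) * (1 - M * (k - M) / (k * variance))
    """
    k = n_items
    return (k / (k - 1)) * (1.0 - (mean_score * (k - mean_score)) / (k * score_variance))


# Hypothetical summary statistics, for illustration only.
reliability = kr21(n_items=10, mean_score=5.0, score_variance=10.0)
print(round(reliability, 3))
```

Because the formula needs only the item count, mean and variance, it can be checked by hand against any reported summary table.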

2025

A pilot study was conducted to validate and examine the reliability of an instrument for TVET instructor competency. The instrument consists of 60 items and was distributed to 30 TVET instructors involved in technical teaching at Institut Latihan Perindustrian Pasir Gudang, Johor. The instrument was developed to measure three (3) constructs: i) technical aspects, ii) learning and methodology aspects, and iii) human and social aspects. Through this approach, person reliability and item reliability are measured, which is more robust than looking only at the Cronbach's alpha value. The Rasch Measurement Model approach (using Winsteps version 3.69.1.11) examines item and person reliability and separation, item polarity and item fit to the construct, and standardized residual correlation values. This approach allows the removal of items that do not meet the diagnostic criteria applied. In the final analysis, of the 60 items, 7...

2025, Journal of Rehabilitation Medicine

Variables present in an individual, for example, independence, pain, balance, fatigue, depression and knowledge, cannot be measured directly (hence the term "latent" variables). They are usually assessed by measuring related behaviours, defined by sets of standardized items. The homogeneity of the different items, and proportionality of raw counts to measure, can only be postulated. In 1960 Georg Rasch proposed a statistical model that complied with the fundamental assumptions made in measurements in physical sciences. It allowed for the transformation of the cumulative raw scores (achieved by a subject across items, or by an item across subjects) into linear continuous measures of ability (for subjects) and difficulty (for items). These 2 parameters, only, govern the probability that "pass" rather than "fail" occurs. The discrepancies between model-expected scores (continuous between 0 and 1) and observed scores (discrete, either 0 or 1) provide indexes of inconsistency of individual subjects, items and classes of subjects. In subsequent years the same principles were extended to rating scales, with items graded on more than 2 levels, and to "many-facet" contexts where, beyond items and subjects, multiple raters, times of administration, etc. converge in determining the observed scores. Rasch modelling has increasing application in rehabilitation medicine. New scales with unprecedented metric validity (including internal consistency and reliability) can be built. Existing scales can be improved or rejected on a sound theoretical basis. In clinical trials the consistency and the linearity of measures of either subjects or raters can be validly matched with those of physical and chemical measures. The stability of the item difficulties across time, cultures, diagnostic groups and time of administration can be estimated, thus making it possible to compare homogeneous measures or foster diagnostic procedures on the reasons for differential item functioning.
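The relationship described in this abstract can be sketched in a few lines: under the dichotomous Rasch model the probability of a "pass" depends only on the difference between subject ability and item difficulty (both in logits), and the gap between the observed 0/1 score and that model-expected score is the raw material for fit indexes. The function names and the example values below are illustrative assumptions, not part of the article.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Model-expected score: probability of 'pass' (1) rather than 'fail' (0)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def standardized_residual(observed: int, ability: float, difficulty: float) -> float:
    """Observed minus expected score, scaled by the model standard deviation."""
    expected = rasch_probability(ability, difficulty)
    variance = expected * (1.0 - expected)  # Bernoulli variance of a 0/1 response
    return (observed - expected) / math.sqrt(variance)


# Hypothetical subject 1 logit above a hypothetical item.
p_pass = rasch_probability(ability=1.0, difficulty=0.0)
# An unexpected failure by that subject gives a negative residual.
surprise = standardized_residual(observed=0, ability=1.0, difficulty=0.0)
print(round(p_pass, 3), round(surprise, 3))
```

Summaries of squared standardized residuals across items or subjects are what infit and outfit statistics are built from; this sketch stops at the single-response level.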

2025, International Education Journal

This study explored the factors that influence the perceived complexity of vocational rehabilitation tasks and the abilities of workplace supervisors and rehabilitating employees to carry out rehabilitation in the workplace. The research project was designed to explore ...

2025, International Journal of Evaluation and Research in Education (IJERE)

In the context of significant changes in higher education and the job market, there has been extensive discussion on what qualifies as graduate competency and what shapes graduates' labor market outcomes. Each university's vision is to produce highly competitive, educated and competent graduates who contribute to the country's development. Graduate employability is a key issue for higher education. Ensuring graduates' competency is vital in forming the educated graduate the industry is looking for. Their competency is honed through the activities and curriculum of the program as embedded in the circular memorandum order (CMO) of each program. A descriptive research design was used, and a questionnaire based on a structured institutionalized tracer instrument and CMO 17 s.2017 was adopted. Statistical treatments such as mean, frequency, percentages and t-tests were used. This study aims to assess and evaluate the competency of the graduates in response to the needs of the industry and for curriculum enhancement. The results reveal that the competency of bachelor of science in business administration (BSBA) graduates on all the identified parameters was deemed "very effective" and useful in their respective workplaces. Though the results highlighted research and extension as "very effective", their importance to employment shows that they are highly significant, corresponding to the present trend. Despite all the training and exposure the college provides, graduates still need improvement, and the study recommends enhancing the curriculum, improving instruction delivery, and upgrading graduate competencies.

2025, International Journal of Research in English Education

Performance testing including the use of rating scales has become widespread in second/foreign language oral assessment. However, no study has used Multifaceted Rasch Measurement (MFRM) including the facets of test takers' ability, raters' severity, group expertise, and scale category in one study. Twenty EFL teachers scored the speaking performance of 200 test-takers before and after a rater training program using an analytic rating scale consisting of fluency, grammar, vocabulary, intelligibility, cohesion, and comprehension categories. The outcome demonstrated that the categories were at different levels of difficulty even after the training program. However, this outcome by no means indicated that the training program was useless, since data analysis reflected the constructive influence of training in providing enough consistency in raters' rating of each category of the rating scale at the post-training phase. Such an outcome indicated that raters could discriminate the various categories of the rating scale. The outcomes also indicated that MFRM can enhance rater training and the functional validation of the rating scale descriptors. The training helped raters use the rating scale's band descriptors more efficiently, resulting in a reduced halo effect. The findings suggest that stakeholders should establish training programs to help raters make appropriate use of rating scale categories of various levels of difficulty. Further research could compare the outcome of this study with that of a study using a holistic rating scale in oral assessment.

2025, Journal of Investigative Dermatology

The Dermatology Life Quality Index (DLQI) is a widely used health-related quality of life measure. However, little research has been conducted on its dimensionality. The objectives of the current study were to apply Rasch analysis to DLQI data to determine whether the scale is unidimensional, to assess its measurement properties, test the response format, and determine whether the measure exhibits differential item functioning (DIF) by disease (atopic dermatitis versus psoriasis), gender, or age group. The results show that there were several problems with the scale, including misfitting items, DIF by disease, age, and gender, disordered response thresholds, and inadequate measurement of patients with mild illness. As the DLQI did not benefit from the application of Rasch analysis in its development, it is argued that a new measure of disability related to dermatological disease is required. Such a measure should use a coherent measurement model and ensure that items are relevant to all potential respondents. The current use of the DLQI as a guide to treatment selection is of concern, given its inadequate measurement properties.

2025, Measurement and Evaluation in Counseling and Development

Objective: This study investigated whether items on the simplified version of the Beck Depression Inventory (BDI-S) exhibited differential functioning across age, gender, and clinical status, possibly compromising the fairness and diagnostic accuracy of its scores.
Method: To identify differential item functioning (DIF) of the scale, a partial credit tree (PCtree) model was used. The PCtree uses respondents' covariates to detect DIF in a data-driven manner. The analysis was conducted on responses from 4,521 German respondents (both clinical and non-clinical).
Results: The PCtree analysis yielded 17 non-predefined nodes, with different item difficulty patterns. DIF was observed as a function of the interaction between gender and age, with clinical status showing no significant impact on item performance. The results indicated that four items (Items 1 (sadness), 10 (crying), 14 (look unattractive), and 18 (changes in appetite)) exhibited large DIF.
Conclusion: The findings suggest the need for age- and gender-specific scoring norms to improve diagnostic accuracy.

2025, ARELE: Annual Review of English Language Education in Japan

There has been much research conducted to compare open-ended and multiple-choice tests from the viewpoints of construct and difficulty. However, almost no studies have examined the effects of question types in relation to test formats. Through two experiments, this study investigated how question types influence the difficulty of these two test formats. The results of Experiment 1 showed that question types affected item difficulty in open-ended tests; more specifically, thematic questions were the most difficult, followed by inference questions, and paraphrase questions were the easiest. In contrast, the result of Experiment 2, in which the same tests were conducted in the multiple-choice test format, revealed that item difficulty did not differ significantly by question type. In addition, we found that predictability of the results of the multiple-choice test was low compared to the open-ended test. Comparison of these two ... wider range of textual information. This difference is known to change the difficulty of reading comprehension. Wilson (1979) and Yoshida (1998), for instance, elucidated that questions which require inference ability were more difficult than those which only need the skill to grasp literal meaning. This tendency was demonstrated by Kobayashi (2002) and Wang ( ), who suggested that questions which required the readers to understand textually-explicit information were easier than those which required readers to integrate the information or to understand the paraphrased ideas of the text. Another study (Shimizu, 2006), which used MC tests, also showed that questions requiring higher-level skills were more difficult than items which demanded only lower-level skills. In her classification, higher-level skills included inference, understanding

2025, Tattva Journal of Philosophy, volume 17, no. 1

The debate between Jacques Derrida and Paul Ricoeur on the philosophical status of metaphor has been seen as between two positions, one which privileges the destabilizing power of the metaphoric over the conceptual (Derrida) and the other which domesticates the metaphoric in the service of the conceptual (Ricoeur). Commentators on this debate, no matter where their sympathies lie, seem to predominantly be in agreement on this issue. In this paper I attempt to invert the frame within which this debate has been viewed. I argue that the debate can more fruitfully be read not as one on the status of metaphor in philosophy, but rather on the task of concept-construction in philosophy. I also argue that in reading this debate from this perspective, we come across a rather surprising conclusion: that it is Derrida, rather than Ricoeur, who provides us with a more robust and profitable mode of concept-construction that can accommodate scientific revolutions, epistemological breaks, and paradigm shifts. Ricoeur's model of concept construction, I argue, only functions within what Thomas Kuhn has called 'normal science'.

2025

scales, and nourished in Geoff Masters' doctoral dissertation. The analysis of partial credit data was original with Geoff. The kind of work we discuss leans heavily on computing. We are deeply indebted to Larry Ludlow for his many valuable contributions to our main computer program, CREDIT. The companionship, constructive criticism and creative participation of able colleagues has played an especially important part in our work. We are particularly grateful to our MESA colleagues Richard Smith, Tony Kalinowski, Kathy Sloane and Nick Bezruczko. Bruce Choppin and Graham Douglas helped us to make the writing clearer and the algebra more correct.

2025

This research compares the stability of descriptive statistical parameter estimation using raw scores and Rasch model scores. The empirical data were the responses of 12th-grade senior high school science students to a Science Literary Test based on integrated mathematics and natural sciences, conducted at SMAN 2 and SMAN 3 Tegal, Central Java. This research employed a bootstrapping method run in SPSS version 21, while the Rasch model was fitted using R version 3.6.3 with the eRm package version 1.0.1. Estimation stability was judged from standard error and bias. The Rasch model scores gave more stable descriptive parameter estimates than the raw scores in terms of both standard error and bias. In terms of standard error, estimation of the mean and the standard deviation was around 8 times more stable with Rasch model scores than with raw scores, and estimation of the median was 16-18 times more stable. In terms of bias, estimation of the mean was 6-10 times more stable with Rasch model scores than with raw scores, while estimation of the median and the standard deviation was 43-282 times and 7-10 times more stable, respectively.
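The two stability criteria named in this abstract, bootstrap standard error and bootstrap bias, can be illustrated with a minimal sketch. It is not the authors' SPSS or eRm procedure; the function name, the resample count, the seed and the score list are all hypothetical choices made for illustration.

```python
import random
import statistics


def bootstrap_se_and_bias(scores, statistic, n_resamples=2000, seed=0):
    """Resample the scores with replacement and summarize the statistic.

    Returns (standard error, bias), where the standard error is the standard
    deviation of the bootstrap replicates and the bias is their mean minus
    the statistic computed on the original sample.
    """
    rng = random.Random(seed)
    n = len(scores)
    original = statistic(scores)
    replicates = [
        statistic([rng.choice(scores) for _ in range(n)])
        for _ in range(n_resamples)
    ]
    return statistics.stdev(replicates), statistics.mean(replicates) - original


# Hypothetical raw test scores, for illustration only.
raw_scores = [12, 15, 9, 20, 17, 14, 11, 18, 16, 13]
se_mean, bias_mean = bootstrap_se_and_bias(raw_scores, statistics.mean)
se_median, bias_median = bootstrap_se_and_bias(raw_scores, statistics.median)
print(se_mean, bias_mean, se_median, bias_median)
```

Running the same function on raw scores and on Rasch person measures for the same respondents would give the kind of side-by-side comparison the abstract reports, although the two scales differ in units, so the comparison method the authors used to obtain their ratios is not specified here.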

2025, Health and Quality of Life Outcomes

Background: Disability is an increasingly important health-related outcome to consider as more individuals are now aging with Human Immunodeficiency Virus (HIV) and multimorbidity. The HIV Disability Questionnaire (HDQ) is a patient-reported outcome measure (PROM), developed to measure the presence, severity and episodic nature of disability among adults living with HIV. The 69-item HDQ includes six domains: physical, cognitive, mental-emotional symptoms and impairments, uncertainty and worrying about the future, difficulties with day-to-day activities, and challenges to social inclusion. Our aim was to develop a short-form version of the HIV Disability Questionnaire (SF-HDQ) to facilitate use in clinical and community-based practice among adults living with HIV. Methods: We used Rasch analysis to inform item reduction using an existing dataset of adults living with HIV in Canada (n = 941) and Ireland (n = 96) who completed the HDQ (n = 1037). We evaluated overall model fit with Cronbach...

2025, Quality of life research : an international journal of quality of life aspects of treatment, care and rehabilitation

The Patient and Observer Scar Assessment Scale (POSAS) is a questionnaire that was developed to assess scar quality. It consists of two separate six-item scales (Observer Scale and Patient Scale), both of which are scored on a 10-point rating scale. After many years of experience with this scale in burn scar assessment, it is appropriate to examine its psychometric properties using Rasch analysis. Cross-sectional data collection from seven clinical trials resulted in a data set of 1,629 observer scores and 1,427 patient scores of burn scars. We examined the person-item map, item fit statistics, reliability, response category ordering, and dimensionality of the POSAS. The POSAS showed an adequate fit to the Rasch model, except for the item surface area. Person reliability of the Observer Scale and Patient Scale was 0.82 and 0.77, respectively. Dimensionality analysis revealed that the unexplained variance by the first contrast of both scales was 1.7 units. Spearman correlation betwee...

2025, International Journal of Evaluation and Research in Education (IJERE)

Career identity is one of the most important psychosocial developmental tasks for adolescents. The development of career identity in adolescence will prevent humans from experiencing identity confusion which will have an impact on further developmental tasks. This research answers the need for a measurement tool by developing and validating a career identity measurement tool called the Indonesian youth career identity scale (IYCIS). This instrument consists of 55 items in two aspects: career exploration and commitment. This study uses Rasch analysis to test the construct validity of the IYCIS. The construct validity test involved 200 high school students in Yogyakarta, Indonesia. Data analysis using Winsteps software provides information about the quality of respondents and instruments, items that are easy and difficult for respondents to agree on, items that are made to order, and unidimensionality. The results of applying the Rasch analysis show that IYCIS is good, precise, and has an item fit with the model. IYCIS is a reliable and valid measurement tool for accurately measuring the level of student career identity. This study discusses the implications and recommendations for further research on the implementation of guidance and counseling that contains career identity as a follow-up to IYCIS performance.

2025, Journal of Education and Learning

This research aims to construct and validate progress maps of digital technology for diagnosing the multidimensional mathematical proficiency (MP) in Number and Algebra for Grade 7 students utilizing the Construct Modeling Approach. Researchers employed four building blocks as follows. Firstly, researchers developed the progress maps as an assessment framework of multidimensional MP. This was followed by creating the test for diagnosing MP. Next, researchers assigned scoring criteria and created the transition points of students' MP levels. Finally, researchers validated the quality of the progress maps through empirical evidence. A total sample of 1,500 Grade 7 students was used to support the validity and reliability evidence of the progress maps through the Wright Map using the Multidimensional Random Coefficients Multinomial Logit Model. Results revealed that there were two dimensions of progress maps, namely mathematical procedures (MAP) and structure of learning outcome (SLO), and the...

2025, Research Square (Research Square)

Globally, the leading cause of years lived with disability is low back pain (LBP). Chronic low back pain (CLBP) is responsible for most of the cost and disability associated with LBP. This is more devastating in low income countries, particularly in rural Nigeria with one of the greatest global burdens of LBP. No Igbo back pain specific measure captures remunerative or non-remunerative work outcomes. Disability measurement using these tools may not fully explain work-related disability and community participation, a limitation not evident in the World Health Organisation Disability Assessment Schedule (WHODAS 2.0). This study aimed to cross-culturally adapt the WHODAS 2.0 and validate it in rural and urban Nigerian populations with CLBP. Translation, cultural adaptation, test-retest, and cross-sectional psychometric testing were performed. WHODAS 2.0 was forward and back translated by clinical/non-clinical translators. An expert review committee evaluated the translations. Twelve people with CLBP in a rural Nigerian community piloted/pre-tested the questionnaire. Cronbach's alpha assessing internal consistency; intraclass correlation coefficient and Bland-Altman plots assessing test-retest reliability; and minimal detectable change were investigated in a convenience sample of 50 adults with CLBP in rural and urban Nigeria. Construct validity was examined using Spearman's correlation analyses with the back-performance scale, Igbo Roland Morris Disability Questionnaire and eleven-point box scale; and exploratory factor analysis in a random sample of 200 adults with CLBP in rural Nigeria. Ceiling and floor effects were investigated in both samples. Patient instructions were also translated. 'Waist pain/lower back pain' was added to 'illness(es)' to make the measure relevant for this study whilst allowing for future studies involving other conditions. The Igbo phrase for 'family and friends' was used to better represent 'people close to you' in item D4.3.
The Igbo-WHODAS had good internal consistency (α = 0.75-0.97); intraclass correlation coefficients (ICC = 0.81-0.93); standard error of measurements (5.05-11.10) and minimal detectable change (13.99-30.77). Igbo-WHODAS correlated moderately with performance-based disability, self-reported back pain-specific disability and pain intensity, with a seven-factor structure and no floor and ceiling effects. Igbo-WHODAS appears psychometrically sound. Its research and clinical utility require further testing.
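The abstract does not state how minimal detectable change was derived from the standard error of measurement. A commonly used convention is MDC95 = 1.96 × √2 × SEM, and the sketch below applies it, as an assumption, to the two reported SEM bounds. The function name is hypothetical.

```python
import math


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC at the confidence level implied by z (1.96 for 95%).

    The sqrt(2) term reflects measurement error in both of the two
    scores being compared (test and retest).
    """
    return z * math.sqrt(2.0) * sem


# Reported SEM bounds from the abstract.
mdc_low = minimal_detectable_change(5.05)
mdc_high = minimal_detectable_change(11.10)
print(mdc_low, mdc_high)
```

Under this assumed formula the results fall within about 0.01 of the reported range of 13.99-30.77, with the small difference at the lower bound plausibly due to rounding of the published SEM.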

2025, The Journal of rheumatology

Discussion and endorsement of the OMERACT total joint replacement (TJR) core domain set for total hip replacement (THR) and total knee replacement (TKR) for endstage arthritis; and next steps for selection of instruments. The OMERACT TJR working group met at the 2016 meeting at Whistler, British Columbia, Canada. We summarized the previous systematic reviews, the preliminary OMERACT TJR core domain set and results from previous surveys. We discussed preliminary core domains for TJR clinical trials, made modifications, and identified challenges with domain measurement. Working group participants (n = 26) reviewed, clarified, and endorsed each of the inner and middle circle domains and added a range of motion domain to the research agenda. TJR were limited to THR and TKR but included all endstage hip and knee arthritis refractory to medical treatment. Participants overwhelmingly endorsed identification and evaluation of top instruments mapping to the core domains (100%) and use of sub...

2025, US neurology

The publication of this article was supported by Grifols. The views and opinions expressed in the article are those of the authors and not necessarily those of Grifols. US/ GX/1016/0386 Chronic inflammatory demyelinating polyneuropathy (CIDP) is an acquired immune-mediated disease that evolves in a progressive or relapsing pattern over months to years. Although "typical" CIDP is characterized by symmetric proximal and distal motor and sensory deficits, it is now recognized that multifocal (asymmetric), distally predominant, pure sensory, and pure motor variants also fall within the CIDP spectrum. First-line treatment options for CIDP include corticosteroids, intravenous immunoglobulin (IVIG), and plasmapheresis (plasma exchange). For patients refractory to first-line options or those chronically dependent on high-dose first-line therapy, no evidence-based treatment recommendations exist. Cytotoxic

2025, Health and Quality of Life Outcomes

Background: To develop and validate an item bank to measure mobility in older people in primary care and to analyse differential item functioning (DIF) and differential bundle functioning (DBF) by sex. Methods: A pool of 48 mobility items was administered by interview to 593 older people attending primary health care practices. The pool contained four domains based on the International Classification of Functioning: changing and maintaining body position, carrying, lifting and pushing, walking and going up and down stairs. The Late Life Mobility item bank consisted of 35 items, and measured with a reliability of 0.90 or more across the full spectrum of mobility, except at the higher end of better functioning. No evidence was found of non-uniform DIF but uniform DIF was observed, mainly for items in the changing and maintaining body position and carrying, lifting and pushing domains. The walking domain did not display DBF, but the other three domains did, principally the carrying, lifting and pushing items. Conclusions: During the design and validation of an item bank to measure mobility in older people, we found that strength (carrying, lifting and pushing) items formed a secondary dimension that produced DBF. More research is needed to determine how best to include strength items in a mobility measure, or whether it would be more appropriate to design separate measures for each construct.

2025, Assessment & Evaluation in Higher Education

Over recent years UK medical schools have moved to more integrated summative examinations. This paper analyses data from the written assessment of undergraduate medical students to investigate two key psychometric aspects of this type of high stakes assessment. Firstly, the strength of the relationship between examiner predictions of item performance (as required under the Ebel standard setting method employed) and actual item performance ('facility') in the examination is explored. It is found that there is a systematic pattern of difference between these two measures, with examiners tending to under-estimate the difficulty of items classified as relatively easy, and over-estimating that of items classified as harder. The implications of these differences for standard setting are considered. Secondly, the integration of the assessment raises the question as to whether the student total score in the exam can provide a single meaningful measure of student performance across a broad range of medical specialties. Therefore Rasch

2025, Journal of Technology and Science Education

This study explores the evolving interaction between Generative Artificial Intelligence (AI) and education, focusing on how technologies such as Natural Language Processing and specific models like OpenAI's ChatGPT can be used on high-stakes examinations. The main objective is to evaluate the ability of ChatGPT version 4.0 to generate written language assessment items and compare them to those created by human experts. The pilot items were developed for the Higher Education Entrance Examination (ExIES, according to its Spanish initials) administered at the Autonomous University of Baja California. Item Response Theory (IRT) analyses were performed on responses from 2,263 test-takers. Results show that although ChatGPT-generated items tend to be more challenging, both sets exhibit a comparable Rasch model fit and discriminatory power across varying levels of student ability. This finding suggests that Generative AI can effectively complement exam developers in creating large-scale assessments. Furthermore, ChatGPT 4.0 demonstrates a slightly higher capacity to differentiate among students of varying skill levels. In conclusion, the study underscores the importance of continually exploring AI-driven item generation as a potential means to enhance educational assessment practices and improve pedagogical outcomes.

2025

This step-by-step guide to conducting a Rasch analysis using the jMetrik software package describes how to format and import data, conduct a simple classical test theory item analysis, and then conduct a simple Rasch analysis. Some basic graphical outputs are also described. After completing the steps described in this guide, novice users should be able to conduct a basic analysis unaided and be able to explore the more advanced features of jMetrik by referring to the manual.
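
As a rough illustration of the classical test theory item analysis that such a guide walks through, the sketch below computes item facility (proportion correct) and a corrected item-total correlation for dichotomous data. It is not jMetrik code or output; the function name `item_analysis` and the sample data are hypothetical, and it assumes responses are already scored 0/1 with no missing values.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Classical item statistics for 0/1 scored data.

    responses: list of rows, one row per person, one 0/1 score per item.
    Returns a list of (facility, discrimination) tuples, one per item.
    Discrimination is the correlation between the item score and the
    rest score (total score minus that item); it is NaN when either
    has zero variance.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        rest = [t - x for t, x in zip(totals, item)]
        facility = mean(item)
        sd_item, sd_rest = pstdev(item), pstdev(rest)
        if sd_item == 0 or sd_rest == 0:
            discrimination = float("nan")
        else:
            m_item, m_rest = mean(item), mean(rest)
            cov = mean([(x - m_item) * (r - m_rest)
                        for x, r in zip(item, rest)])
            discrimination = cov / (sd_item * sd_rest)
        results.append((facility, discrimination))
    return results


# Hypothetical data: 4 persons by 3 items, items ordered easy to hard.
sample = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_analysis(sample)
```

Items with very high or very low facility, or with low or negative discrimination, are the usual candidates for closer inspection before a Rasch analysis is run.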

2025, Journal of Advanced Nursing

A study examining the appropriateness of a self-rated alcohol-related clinical confidence tool as a method of measurement among registered hospital nurses using Rasch analysis.

2025, Journal of Rehabilitation Medicine

Objective: To examine the impact of home modifications on self-rated ability in everyday life from various aspects for people ageing with disabilities. Methods: The study sample was recruited from an agency providing home modification services in Sweden and comprised 73 subjects whose referrals had been approved and who were scheduled to receive home modifications (intervention group) and 41 subjects waiting for their applications to be assessed for approval (comparison group). The subjects rated their ability in everyday life using the Client-Clinician Assessment Protocol Part I on 2 occasions: at baseline and follow-up. The Client-Clinician Assessment Protocol Part I provides data on the clients' self-rated independence, difficulty and safety in everyday life. The data were first subjected to Rasch analysis in order to convert the raw scores into interval measures. Further analyses to investigate changes in self-rated ability were conducted with parametric statistics. Results: Subjects who had received home modifications reported a statistically significant improvement in their self-rated ability in everyday life compared with those in the comparison group. Subjects who had received home modifications reported less difficulty and increased safety, especially in tasks related to self-care in the bathroom and transfers, such as getting in and out of the home. Conclusion: Home modifications have a positive impact on self-rated ability in everyday life, especially on decreasing the level of difficulty and increasing safety.

2025

Introduction: Problematic internet use (PIU) can present itself in a variety of online activities. Given the increasing prevalence of PIU among young adults, there is a dearth of comprehensive assessment tools to characterize various PIU in Malaysia. The 11-item Assessment of Criteria for Specific Internet-use Disorders (ACSID-11) assesses specific PIU including online gaming, online buying-shopping, online pornography use, social networking use, and online gambling. The present study investigated the psychometric properties of the Malay ACSID-11. Methods: A cross-sectional study using an online survey was used for the data collection. The sample comprised 610 young adults aged 22.55 years (standard deviation ± 3.49). Participants were recruited from July 2023 to September 2023 using convenience sampling. Results: The confirmatory factor analysis findings supported the four-factor structure of the Malay ACSID-11 across gender, ethnicity, and academic achievement with good fit statistics: comparative fit index (CFI) ≥ 0.968, Tucker-Lewis index (TLI) ≥ 0.949, root mean square error of approximation (RMSEA) ≤ 0.057, standardized root mean square residual (SRMR) ≤ 0.028 (frequency response); CFI ≥ 0.968, TLI ≥ 0.958, RMSEA ≤ 0.079, SRMR ≤ 0.033 (intensity response). The different online subscales (except for some of the ACSID-11 online gambling subscales) showed good internal consistency (Cronbach's α and McDonald's ω between 0.58 and 0.90 for frequency responses; Cronbach's α and McDonald's ω between 0.61 and 0.93 for intensity responses). Conclusion: The Malay ACSID-11 is a valid and reliable instrument for assessing various specific PIU among Malaysian young adults. However, caution is required when using the ACSID-11 to assess online gambling because some of its subscales had low internal consistency.

2025, International Education Studies

This study explored the psychometric properties of a locally developed information skills test for youth students in Malaysia using Rasch analysis. The test was a combination of 24 structured and multiple choice items with a 4-point grading scale. The test was administered to 72 technical college students and 139 secondary school students. The data from the test were fitted to the Rasch partial credit model using the Winsteps program in which the unidimensionality, reliability and person-item distribution map of the test were examined. The analysis showed all 24 items meet the Rasch model expectation and thus have a potential in assessing information skills of youth students in Malaysia. The findings showed that Rasch analysis could help researchers to refine the developed test in a systematic and informed manner.

2025, Health and quality of life outcomes

Background: There is no widely accepted framework to guide the development of condition-specific preference-based instruments (CSPBIs) that includes both de novo and from existing non-preference-based instruments. The purpose of this study was to address this gap by reviewing the published literature on CSPBIs, with particular attention to the application of item response theory (IRT) and Rasch analysis in their development. Methods: A scoping review of the literature covering the concepts of all phases of CSPBI development and evaluation was performed from MEDLINE, Embase, PsychInfo, CINAHL, and the Cochrane Library, from inception to December 30, 2022. The titles and abstracts of 1,967 unique references were reviewed. After retrieving and reviewing 154 full-text articles, data were extracted from 109 articles, representing 41 CSPBIs covering 21 diseases or conditions. The development of CSPBIs was conceptualized as a 15-step framework, covering four phases: 1) develop initial questionnaire items (when no suitable non-preference-based instrument exists), 2) establish the dimensional structure, 3) reduce items per dimension, 4) value and model health state utilities. Thirty-nine instruments used a type of Rasch model and two instruments used IRT models in phase 3. We present an expanded framework that outlines the development of CSPBIs, both from existing non-preference-based instruments and de novo when no suitable non-preference-based instrument exists, using IRT

2025, AnV Publication

The nursing profession is frequently described as both a calling and a career. The modern workplace, however, is increasingly impacted by the mentality of "hustle culture", a mindset that prioritises unrelenting production, overwork, and a never-ending chase of success. This culture presents unique challenges in nursing, which is a naturally difficult profession owing to the high risks involved. While hustle culture can foster professional growth and resiliency, it can also exacerbate fatigue, mental health issues, and subpar patient care. This article investigates the consequences of hustle culture in nursing, assesses its benefits and drawbacks, and suggests solutions for mitigating its negative effects.

2025, International Journal of Evaluation and Research in Education (IJERE)

Every society dreams of true peace. To achieve true peace, humans need to start with inner peace. The importance of peace becomes one of the bases for developing a measure of peace for designing peace-building programs. This research answered the need for these measuring tools by developing and validating a peace measuring instrument called the Indonesian peace of mind scale (IPoMS). This instrument consists of seven items in two aspects: the internal state of peacefulness and harmony. This study used Rasch analysis to test the construct validity of IPoMS. The construct validity test involved 202 vocational high school students in Yogyakarta, Indonesia. Data analysis using Winsteps software provides information about the quality of respondents and instruments, items that are easy and difficult for respondents to agree on, fit order items, and unidimensionality. The results of the application of Rasch analysis showed that IPoMS is good, precise, and has item conformity with the mode...

2025, Journal of College Teaching & Learning (TLC)

The purpose of this article is to provide insight into an elementary school whose climate issues appear to plague and impact its performance as measured by its Adequate Yearly Progress (AYP). The Northwest Georgia elementary school is located in a rural school system approximately 50 miles northwest of Atlanta, Georgia. A review of the literature suggests school climate can affect many areas and people within schools. It further suggests that positive interpersonal relationships and optimal learning opportunities in all demographic environments can increase school achievement levels and reduce maladaptive behaviors (McEvoy & Welker, 2000). Providing a positive and supportive work environment and climate for faculty and staff, more often than not, improves faculty, staff and student performance (Freiberg, 1998). An in-depth analysis of the environment of the school in question suggests a lack of faculty and staff respect for administration, a hostile work environment, and o...

2025, School Science and Mathematics

Developing an understanding procedures observation rubric for mathematics intervention teachers. School Science and Mathematics, 120(3), 153-164.

2025, Value in Health

This paper discusses recent advances that have been made in the field of psychometrics, specifically, the application of Rasch analysis to the instrument development process. It emphasizes the importance of assessing the fundamental scaling properties of an instrument prior to consideration of traditional psychometric indicators. The paper introduces Rasch analysis and shows how it has been applied in the development of needs-based measures in order to ensure that they provide unidimensional measurement. By ensuring that scales are based on the same measurement model and that they fit the Rasch model it is possible for QoL scores to be compared across diseases by means of cocalibration and item banking.

2025

This study investigated the impact of graphic design skills on the employability of business education students in Nigerian colleges of education. Employing a true experimental pre-test and post-test within-subjects design, the study involved 40 randomly selected students from the Federal College of Education (Technical), Bichi. Baseline assessments were conducted to measure the participants' graphic design competencies and perceptions of employability. The intervention consisted of a six-week Canva-based training program focused on designing flyers and invitation cards. Internal validity was ensured through randomization, control of confounding variables, and consistent implementation procedures, while external validity was strengthened through representative sampling, real-world application, and potential for replication. Data were collected through structured surveys and analyzed using descriptive statistics and paired t-tests. The findings revealed statistically significant improvements in both graphic design skills and employability perceptions following the intervention. Consequently, the null hypotheses were rejected, affirming that graphic design training positively influences employability. Notable challenges included limited access to Canva tools, difficulties in balancing academic workload, and potential threats to validity arising from the study's unique context. The study recommends the integration of graphic design competencies into the business education curriculum to enhance graduate employability. Despite its limitations, the study underscores the value of graphic design training in equipping business education students for the competitive Nigerian job market.

2025, Health and Quality of Life Outcomes

Background: Existing instruments for measuring mobility are inadequate for accurately assessing older people across the broad spectrum of abilities. Like other indices that monitor critical aspects of health such as blood pressure tests, a mobility test for all older acute medical patients provides essential health data. We have developed and validated an instrument that captures essential information about the mobility status of older acute medical patients. Methods: Items suitable for a new mobility instrument were generated from existing scales, patient interviews and focus groups with experts. 51 items were pilot tested on older acute medical inpatients. An interval-level unidimensional mobility measure was constructed using Rasch analysis. The final item set required minimal equipment and was quick and simple to administer. The de Morton Mobility Index (DEMMI) was validated on an independent sample of older acute medical inpatients and its clinimetric properties confirmed. The DEMMI is a 15-item unidimensional measure of mobility. Reliability (MDC90), validity and the minimally clinically important difference (MCID) of the DEMMI were consistent across independent samples. The MDC90 and MCID were 9 and 10 points respectively (on the 100-point Rasch converted interval DEMMI scale). The DEMMI provides clinicians and researchers with a valid interval-level method for accurately measuring and monitoring mobility levels of older acute medical patients. DEMMI validation studies are underway in other clinical settings and in the community. Given the ageing population and the importance of mobility for health and community participation, there has never been a greater need for this instrument.

2025, Archives of Physical Medicine and Rehabilitation

Objective: To investigate the validity of item score summation for the original and modified versions of the Barthel Index. Design: Rasch analysis of Barthel Index data. Setting: General medical wards at 2 acute care hospitals in Australia. Participants: Consecutive older medical patients (N=396). Interventions: Not applicable. Main Outcome Measures: Activity limitation was assessed by using the Barthel Index at hospital admission and discharge. At 1 hospital site, the original Barthel Index was used, and at the other hospital site the Modified Barthel Index (MBI) was used. Results: More than half of the items showed misfit to the Rasch model for both versions of the Barthel Index. The continence items appear to measure a different construct to the other items. After the removal of the continence items, data for the remaining items still did not fit the Rasch model. Neither the original nor the MBI are unidimensional scales. An exception to this occurred when the original Barthel Index was rescored and only then for discharge and not for admission Barthel Index data. Conclusions: Because clinicians do not typically rescore outcomes obtained by using the Barthel Index, these findings, combined with unacceptable ceiling effects, render the Barthel Index an assessment tool with limited validity for measuring and monitoring the health of older medical patients.

2025

This paper briefly looks into the role and extent of mathematical modelling in the design and analysis of measurement systems, especially measurement subsystems in the form of instruments and instrument elements. It also examines the role and use of mathematical modelling in the area of soft measurement (non-physical measurement). Based on a number of examples it demonstrates the use of modern modelling techniques in the design and analysis of sub-systems in measurement technology. In doing so, it will focus on the scope and importance of physical modelling at a sub-system level which ultimately contributes to modelling activities at a global systems level.

2025

Much of the data presented by politicians and the media is multivariate in its nature. However, in the UK at least, the general public has little training to deal with such information. It is reasonable to explore the school curriculum to determine the nature and extent of students' preparation for dealing with multivariate data. In the UK,

2025, Osteoarthritis and Cartilage

Objectives: Use Rasch analysis to examine the psychometric properties of the Oxford Knee Score (OKS), particularly in respect to unidimensionality, and consistency of item functioning before and after total knee replacement and across age and gender groups. The 12-item OKS was administered to 1,712 patients before the surgery, and 1,322 and 855 patients were administered the instrument repeatedly at the 6-month and 2-year postoperative assessments, respectively. Data were fitted to the Rasch partial credit model with the Winsteps program. Differential item functioning (DIF) analysis was performed, and fit statistics in combination with principal components analysis of the residuals were used to test the unidimensionality assumption. The fit criteria were set at 1.5 and 2.0 for infit mean-square (MNSQ) and outfit MNSQ, respectively. Results: At baseline, item difficulty ranged from −1.86 to 1.78 logits, and person measures had a mean ± SD of −0.01 ± 0.89. Misfit items were "limping" and "night pain" in preoperative data and "limping" and "kneeling" in postoperative data. After removing items limping and kneeling and recoding item night pain, none of the items misfit at each of the time points and there was stability of item difficulty ordering across time. In the modified OKS set, five items displayed DIF by age and three by gender. The original OKS had adequate targeting and good coverage of knee severity levels in preoperative patients. The modified 10-item OKS data fit the Rasch model and had stable item difficulty ordering over time.
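
To make the fit criteria mentioned in this abstract concrete, here is a minimal sketch of how infit and outfit mean-squares are conventionally computed for a single item. It assumes the simpler dichotomous Rasch model rather than the partial credit model the study fitted, and it takes person measures and the item difficulty as already estimated. The function names and example values are hypothetical; only the 1.5 and 2.0 cut-offs come from the abstract.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct (1) response under the dichotomous
    Rasch model, for person measure theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit_mnsq, outfit_mnsq) for one item.

    responses: 0/1 scores, one per person.
    thetas: person measures in logits, same order as responses.
    b: item difficulty in logits.
    """
    sum_sq_resid = 0.0   # sum of squared raw residuals
    sum_variance = 0.0   # sum of model variances p * (1 - p)
    sum_z_sq = 0.0       # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        variance = p * (1.0 - p)
        sq_resid = (x - p) ** 2
        sum_sq_resid += sq_resid
        sum_variance += variance
        sum_z_sq += sq_resid / variance
    infit = sum_sq_resid / sum_variance   # information-weighted
    outfit = sum_z_sq / len(responses)    # unweighted, outlier-sensitive
    return infit, outfit


def is_misfit(infit, outfit, infit_max=1.5, outfit_max=2.0):
    """Flag an item using the cut-offs quoted in the abstract."""
    return infit > infit_max or outfit > outfit_max


# Hypothetical example: a low-ability person unexpectedly succeeds.
infit, outfit = item_fit([1, 1, 1, 1], [3.0, 3.0, 3.0, -3.0], b=0.0)
flagged = is_misfit(infit, outfit)
```

Both statistics have an expected value near 1 when data fit the model; outfit reacts more strongly to unexpected responses from persons far from the item's difficulty, which is one reason a looser cut-off is sometimes applied to it.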

2025, International Journal of Evaluation and Research in Education (IJERE)

This study aimed to test the validity, reliability, and difficulty level of items developed based on the Frayer model and detect conceptual understanding of high school students in biology evolution. The test method evaluated 35 multiple-choice questions on evolution for 55 high school students. Rasch analysis was performed to assess the validity, reliability, difficulty level of items, and students’ ability level. Two experts empirically tested and analyzed the validity of the items. The assessment developed was discovered to be valid based on expert and empirical analyses. Furthermore, the construct validity test indicated that only two of the 35 questions were deemed invalid. The assessment exhibited reliability with an item reliability score of 0.92. The item difficulty levels were equally spread across the normal curve, encompassing questions ranging from very difficult to very easy categories, as depicted in the variable map. After analyzing the map, it was observed that variations in students’ proficiency levels at answering questions were evident, indicating diverse levels of ability. Students performed well in handling formal and superordinate-subordinate level questions. However, their performance differed when dealing with identity and principle-level concepts.

2025

In rater-mediated assessments, the ratings awarded to language learners' written, or spoken, performances do not necessarily reflect their language abilities because a number of other construct-irrelevant factors may affect the knowledge they demonstrate. Rater subjectivity and rating scales are among the variables possibly influencing the final results. The purpose of the present study was to examine the extent to which university students' ratings on their essays mirrored the effect of these two factors. To that end, 150 Iranian EFL teachers rated ten five-paragraph essays BA students had written as their course requirements at Imam Khomeini International University. The raters used two rating scales to rate the essays on a number of assessment criteria. The study rested on a partial rating design, and the Rasch-based computer program, FACETS, was used to analyze the data. Results of Facets analyses showed raters differed considerably in the amounts of severity they exercised when rating the essays. The results also showed rater bias interactions with holistic rating scales. The implications of the findings for proposing procedures for reducing the effects of such extraneous variables are discussed.

2025, MTISD 2008. Methods, Models and Information Technologies for Decision Support Systems

Abstract: Multivariate Additive PLS Splines, in short MAPLSS, are Partial Least-Squares models that study the dependence of a set of responses on spline transformations of the predictor variables which permit to capture additively non linear main effects and interactions. The aim of this paper is to present a way of selecting MAPLSS models through an adaptive incremental selection of training samples by a bootstrap procedure. This approach is attractive in the case of expensive data thus implying to construct efficient ...

2025

This document reviews the research related to students' and teachers' anxiety related to science and the teaching of science in order to better understand the relationships between the variables that can predict this phenomenon. The research reports reviewed used either the Science State Trait Anxiety Inventory or the Science Teaching State Trait Anxiety Inventory in gathering their data. These inventories allow the researcher to change title headings within the inventory to allow the researcher to examine particular situations. Findings for the report are presented according to titles on the state anxiety scale that measure for anxiety about science and teaching science; anxiety about specific tasks; and anxiety about different science courses. The summary of the findings discusses the following variables that emerged from the analysis: (1) attitude toward science; (2) anxiety about teaching; (3) achievement; (4) examination format; (5) content courses; (6) achievement on a specific task in a content course; (7) gender; (8) confidence; (9) self efficacy; (10) demographic variables; (11) long term effects; (12) impact on teacher classroom performance; and (13) children's anxiety. Conclusions discuss the need for a model to explain how anxiety and related variables affect learning in science and for continued research in this area. A list of 28 references is included. (MDH)

2025, Evaluation Review

In assessing criminality, researchers have used counts of crimes, arrests etc. because interval measures were not available. Additionally, crime seriousness varies depending on demographic factors. This study examined the Crime and Violence Scale (CVS) regarding: psychometric quality using item response theory (IRT); and invariance of the crime seriousness hierarchy for gender, age, and racial/ethnic groups on 7435 respondents. The CVS is a useful measure of criminality, though some items could be improved or dropped. Differential item functioning analysis revealed that crime seriousness varies by age and gender. IRT shows promise in assessing and adjusting for demographic variations in crime seriousness.

2025, International Journal of Evaluation and Research in Education (IJERE)

Self-regulated learning (SLR) is a condition in which students actively participate in the process of acquiring knowledge, and it closely relates to students' metacognitive, motivational, and behavioral aspects. In order to measure this variable, an instrument was developed by referring to the Zimmerman cycle in the form of a questionnaire. Therefore, this study aims to analyze the construct validity of SLR questionnaires designed for high school students through Rasch model analysis. The method employed was descriptive quantitative research. The analyzed questionnaire consists of 50 positive statements, rated on a 4-point Likert scale, and organized into forethought, performance, and self-reflection phases. Furthermore, the construct validity test was conducted on 235 third grade (XII) high school students in Gunungsitoli City (Indonesia), with a gender distribution of 58.29% female and 41.70% male. The results showed that the questionnaire with a 4-point rating scale satisfied the criteria for validity, gender inclusiveness, and unidimensionality based on Rasch model analysis for 25 statements. The implication of this research is that the SLR questionnaire developed is valid and can be used in wider field research, especially in mathematics learning.

2025, International Journal of Evaluation and Research in Education (IJERE)

The rising rate of youth unemployment and its attendant consequences on the general populace in Nigeria has assumed a frightening dimension. Academia and other relevant stakeholders have gradually come to realize that the possession of academic qualifications alone cannot guarantee a good quality job. Nigerian higher institutions are now introducing entrepreneurship studies in their school curriculum without a clear framework. Research has shown that this does not guarantee total graduates unless we have an entrepreneurship skills framework that is functional and discipline-based. Noting that entrepreneurship is classified into two types, entrepreneurship specific and entrepreneurship mindset, this study intends to develop an entrepreneurship skills framework that will promote the employability of the students of Electrical Technology in Colleges of Education in Nigeria. Survey research employing sequential exploratory mixed methods was used for the study. The population for this phase consists of entrepreneurs and academics in Nigeria. The use of the partial credit model in the Rasch analysis guaranteed the consensus of the experts on each of the items being measured. The outcome of the study will contribute to socio-economic peace and sustainability in Nigeria and to the body of knowledge in entrepreneurship regarding electrical technology.