Intra- and inter-observer agreement with regard to describing adnexal masses using International Ovarian Tumor Analysis (IOTA) terminology: a reproducibility study involving seven observers
2013, Ultrasound in Obstetrics & Gynecology
Objectives: To estimate intraobserver repeatability and interobserver agreement in assessing the presence of papillary projections in adnexal masses and in classifying adnexal masses using the International Ovarian Tumor Analysis (IOTA) terminology for ultrasound examiners with different levels of experience; to identify ultrasound findings that cause confusion and might be interpreted differently by different observers; and to determine whether repeatability and agreement change after consensus has been reached on how to interpret "problematic" ultrasound images.
Methods: Digital clips (two to eight clips per adnexal mass) with gray scale and color/power Doppler information of 83 adnexal masses in 80 patients were evaluated independently four times, twice before and twice after a consensus meeting, by four experienced and three less experienced ultrasound observers. The variables analyzed were tumor type (unilocular, unilocular solid, multilocular, multilocular solid, solid) and presence of papillary projections. Intraobserver repeatability was evaluated for each observer (percentage agreement, Cohen's kappa). Interobserver agreement was estimated for all seven observers (percentage agreement, Fleiss' kappa, Cohen's kappa).
Results: There was uncertainty and disagreement about how to define a solid component and a papillary projection, but consensus was reached at the consensus meeting. Interobserver agreement for tumor type was good both before and after the consensus meeting, with no clear improvement afterwards: mean percentage agreement was 76.0% (Fleiss' kappa 0.695) before the consensus meeting and 75.4% (Fleiss' kappa 0.682) after it.
Interobserver agreement with regard to papillary projections was moderate both before and after the consensus meeting, with no clear improvement afterwards: mean percentage agreement was 86.6% (Fleiss' kappa 0.536) before the consensus meeting and 82.7% (Fleiss' kappa 0.487) after it. There was substantial variability in pairwise agreement for papillary projections (Cohen's kappa 0.112-0.824). Intraobserver repeatability with regard to tumor type was very good and similar before and after the consensus meeting (agreement 87-95%, kappa 0.83-0.94); intraobserver repeatability with regard to papillary projections was good or very good both before and after the consensus meeting (agreement 88-100%, kappa 0.64-1.0).
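The agreement statistics named in the Methods (percentage agreement, Cohen's kappa for a pair of ratings, Fleiss' kappa for several observers) can be illustrated with a minimal Python sketch using the standard textbook definitions. This is not the authors' analysis code. The function names and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of cases on which two equally long rating lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two rating lists: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1.
    """
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal category frequencies.
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one list of labels per case,
    with the same number of observers for every case."""
    n_cases = len(ratings)
    n_obs = len(ratings[0])
    totals = Counter()
    p_bar = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing observer pairs for this case.
        p_bar += (sum(c * c for c in counts.values()) - n_obs) / (n_obs * (n_obs - 1))
    p_bar /= n_cases
    # Chance agreement from the overall category proportions.
    p_e = sum((t / (n_cases * n_obs)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: "papillary projection present?" for 10 masses.
obs_a = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
obs_b = ["yes", "no", "no", "no", "yes", "no", "yes", "no", "yes", "no"]

print(percent_agreement(obs_a, obs_b))       # 0.8
print(round(cohen_kappa(obs_a, obs_b), 3))   # 0.583
```

The toy example shows why percentage agreement and kappa can diverge, as they do for papillary projections in the Results: kappa discounts the agreement expected by chance, so 80% raw agreement here corresponds to a kappa of only about 0.58.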