Improving Scoring Consistency of Flight Performance through Inter-Rater Reliability Analyses

Matthew V. Smith; Mary C. Niemczyk; William K. McCurry

doi:10.22488/okstate.18.100368

Published: Oct 11, 2018

Matthew V. Smith

Arizona State University

Mary C. Niemczyk

Arizona State University

William K. McCurry

Arizona State University

Abstract

Students, as well as the other stakeholders of flight schools, must be sure that the scoring of flight performance is such that the scores are a meaningful indicator of the student's performance rather than an arbitrary indicator of the instructor's perception. The scores should be somewhat consistent from one instructor to another. The apparent inconsistency in scoring from one instructor to another can be examined by conducting inter-rater reliability (IRR) analyses. Inter-rater reliability measures the extent of agreement between two or more individual raters; it is used to measure the consistency of a scoring or rating system, and those who use it. This foundational investigation was designed to assess inter-rater reliability between instructor pilots when observing 10 sample flights performed by student pilots. Results of the study indicated that inter-rater reliability was low. Suggestions for improving the consistency of flight instructor scoring are discussed, as well as recommendations for future research.

Issue

Vol. 26 No. 1 (2008)

Section

Peer-Reviewed Articles
