Creation of Tonal and Speech Alarm Efficacy Scales


Ryan A. Lange
Stephen Rice
Cameron M. E. Severin
Jonah S. L. Chiu
Keith J. Ruskin
Connor Rice


Alarms have been used in aviation for decades, yet many are still sub-optimally designed and perform poorly. Some alarms are so poorly designed that they increase workload, confuse the user, and/or cause a severe loss of trust. When users are asked about alarm efficacy, they often say only that an alarm is either good or bad. While this provides some useful subjective information, we argue that a quantitative scale offers more value. Using a consensus research method to ensure construct validity, we solicited 2362 participants across a four-phase, one-year study to develop a Tonal Alarm Efficacy Scale and a Speech Alarm Efficacy Scale. A factor analysis using principal components and varimax rotation provided strong evidence of validity, while Cronbach’s Alpha and Guttman’s Split Half tests confirmed high internal consistency and reliability, respectively. Follow-up analyses highlight the sensitivity of the scales. Quantitative scales of this type give users, designers, engineers, and human factors experts a common language for designing more effective alarms. The present study attempts to fill a gap in the current literature by providing Tonal and Speech Alarm Efficacy Scales for use in aviation applications.
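For readers unfamiliar with the two reliability statistics named above, the following is a minimal sketch of how Cronbach's Alpha and a Guttman split-half coefficient are commonly computed from item-level ratings. It is not the authors' analysis code: the function names, the odd/even item split, and the sample ratings are assumptions made purely for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's Alpha for a list of rows (one row of item scores per participant)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across participants.
    item_vars = [variance(col) for col in zip(*responses)]
    # Sample variance of each participant's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def guttman_split_half(responses):
    """Guttman split-half coefficient using an odd/even item split (an assumed split)."""
    half_a = [sum(row[0::2]) for row in responses]
    half_b = [sum(row[1::2]) for row in responses]
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 2 * (1 - (variance(half_a) + variance(half_b)) / variance(totals))


# Hypothetical ratings: 4 participants x 4 scale items (not data from the study).
ratings = [
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
    [4, 4, 4, 4],
]
alpha = cronbach_alpha(ratings)
split_half = guttman_split_half(ratings)
```

Both coefficients approach 1 when items rise and fall together across participants, as in the hypothetical ratings above, and drop as items become less consistent with one another.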


Peer-Reviewed Articles

