References & Citation

References

Educational Assessment Literature

[2] Anderson, L. W., & Krathwohl, D. R. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives. Addison Wesley Longman, Inc.

[8] Downing, S. M. (2005). The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education, 10(2), 133–143. https://doi.org/10.1007/s10459-004-4019-5

[9] Gierl, M. J., Bulut, O., Guo, Q., & Zhang, X. (2017). Developing, analyzing, and using distractors for multiple-choice tests in education: A comprehensive review. Review of Educational Research, 87(6), 1082–1116. https://doi.org/10.3102/0034654317726529

[10] Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309–333. https://doi.org/10.1207/S15324818AME1503_5

[11] Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. Routledge. https://doi.org/10.4324/9780203850381

[17] Sireci, S. G. (1998). The construct of content validity. Social Indicators Research, 45(1–3), 83–117.

[18] Applegate, B., Jensen, J. L., Morphew, J. W., & Blackman, A. (2019). The effect of option homogeneity on multiple-choice item functioning. Educational and Psychological Measurement, 79(2), 341–364. https://doi.org/10.1177/0013164418783517


Citation

If you use this framework in your research, please cite:

@software{quiz_benchmark_2024,
  title   = {AI Quiz Generation Benchmark: A Framework for Evaluating AI-Generated Educational Assessments},
  author  = {[Your Name/Team]},
  year    = {2024},
  version = {1.0.0-beta},
  url     = {[Your Repository URL]},
  note    = {Research software under active development}
}

License

[Specify your license here]


Acknowledgments

This framework builds on established principles from educational measurement and uses modern LLM capabilities for automated assessment. Our quality metrics are grounded in the foundational educational assessment literature cited above.


Status: 🚧 ACTIVE DEVELOPMENT — Framework operational, metrics under validation

Version: 1.0.0-beta

Last Updated: February 2024