Feng Jianmin
Institute of Education, Xiamen University, Xiamen, Fujian, 361005
Abstract: The Imperial Examination was founded in the Sui and Tang dynasties, perfected in the Song and Yuan dynasties, flourished in the Ming and Qing dynasties, and was abolished at the end of the Qing dynasty. It lasted for more than one thousand years and stood at the center of political and social life in ancient China. As an examination system designed to select talent for governing the nation, it had a long history, a well-developed structure, and stable patterns. With its powerful social functions, the system reached every corner of society and exerted a controlling influence on the feudal political system, the educational system, and the system of etiquette and customs.
Keywords: Imperial Examination, Social System, Controlling