[1] SONG Lihong, WANG Wenyi, DING Shuliang. The Subscore Estimation Methods for Score Reports in Criterion-Referenced Tests[J]. Journal of Jiangxi Normal University: Natural Science Edition, 2020, (03): 292-300. [doi:10.16357/j.cnki.issn1000-5862.2020.03.13]

The Subscore Estimation Methods for Score Reports in Criterion-Referenced Tests

Journal of Jiangxi Normal University (Natural Science Edition) [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2020, No. 03
Pages:
292-300
Section:
Information Science and Technology
Publication date:
2020-06-10

Article Info

Title:
The Subscore Estimation Methods for Score Reports in Criterion-Referenced Tests
Article ID:
1000-5862(2020)03-0292-09
Author(s):
SONG Lihong1, WANG Wenyi2, DING Shuliang2
1. Elementary Education College, Jiangxi Normal University, Nanchang, Jiangxi 330022, China; 2. College of Computer Information Engineering, Jiangxi Normal University, Nanchang, Jiangxi 330022, China
Keywords:
subscore; objective performance index; augmented scores; regression method; item response theory model
CLC number:
B 841
DOI:
10.16357/j.cnki.issn1000-5862.2020.03.13
Document code:
A
Abstract:
Criterion-referenced tests use subscores to measure students' mastery of specific content, knowledge, or skill areas, which helps the examination serve its learning function. When subscores are estimated from responses to only a small number of items, it is difficult to ensure their reliability. Subscore estimation methods draw on related auxiliary information to produce subscores with higher reliability, which is crucial for remedial teaching. After a brief introduction to measurement models, this paper describes the ideas and computational procedures of seven classes of subscore estimation methods and analyzes the application and performance of each. Future work should focus on subscore estimation at the group and individual levels, under complex structures, with optimized test designs, and under other modes of test administration.
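The reliability concern raised in the abstract can be illustrated with two classical test theory formulas. The sketch below is not taken from the paper and is not claimed to be one of its seven methods: it uses the Spearman-Brown formula to project the reliability of a short subscale, and Kelley's regressed score to show how an unreliable observed subscore is shrunk toward the group mean. The function names and all numbers are hypothetical, and the Spearman-Brown step assumes the subscale items behave like the rest of the test.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when test length is multiplied by length_factor."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


def kelley_estimate(observed: float, group_mean: float, reliability: float) -> float:
    """Kelley's regressed score: shrink the observed score toward the group mean."""
    return reliability * observed + (1.0 - reliability) * group_mean


# Hypothetical test: 40 items with reliability 0.90, and one 8-item subscale.
total_reliability = 0.90
subscale_reliability = spearman_brown(total_reliability, 8 / 40)
print(round(subscale_reliability, 3))  # 0.643

# Hypothetical student: 7 of 8 correct on a subscale where the group averages 5.
print(round(kelley_estimate(7.0, 5.0, subscale_reliability), 2))  # 6.29
```

With these assumed values the subscale reliability drops from 0.90 to about 0.64, and the observed subscore of 7 is pulled to roughly 6.29. Methods that borrow information from the other subscales aim to reduce this loss.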


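Memo

Received: 2019-05-18
Funding: Supported by the Ministry of Education Key Project of the National Education Sciences Planning program, "Research on Score Reporting Methods for Basic Education Quality Monitoring" (DHA150285).
About the author: SONG Lihong (1981-), female, from Xingan, Jiangxi; associate professor, Ph.D.; main research area: educational measurement. E-mail: viviansong1981@163.com
Last Update: 2020-06-10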