Bridging human and machine scoring in experimental assessments of writing: tools, tips, and lessons learned from a field trial in education

In a randomized trial that collects text as an outcome, traditional approaches for assessing treatment impact require that each document first be manually coded for constructs of interest by human raters. An impact analysis can then be conducted to compare treatment and control groups, using the hand-coded scores as a measured outcome. This process is both time- and labor-intensive, which creates a persistent barrier for large-scale assessments of text. Furthermore, enriching one's understanding of a found impact on text outcomes via secondary analyses can be difficult without additional scoring efforts. Machine-based text analytic and data mining tools offer one potential avenue to help facilitate research in this domain. For instance, we could augment a traditional impact analysis that examines a single human-coded outcome with a suite of automatically generated secondary outcomes. By analyzing impacts across a wide array of text-based features, we can then explore what an overall change signifies, in terms of how the text has evolved due to treatment. In this paper, we propose several different methods for supplementary analysis in this spirit. We then present a case study of using these methods to enrich an evaluation of a classroom intervention on young children’s writing. We argue that our rich array of findings moves us from “it worked” to “it worked because” by revealing how observed improvements in writing were likely due, in part, to the students having learned to marshal evidence and speak with more authority. Relying exclusively on human scoring, by contrast, is a lost opportunity.
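As a rough illustration of what a suite of automatically generated secondary outcomes could look like, the sketch below computes a few simple text features per document and estimates a difference in means between treatment and control for each one. It is not the authors' method: the feature set, the function names `text_features` and `feature_impacts`, and the unadjusted standard error are all assumptions made for this example. With many features, some adjustment for multiple comparisons would normally also be needed.

```python
import math
import re
from statistics import mean, variance


def text_features(doc):
    """Compute a few illustrative (hypothetical) text features for one document."""
    words = re.findall(r"[A-Za-z']+", doc.lower())
    n = len(words)
    return {
        "word_count": float(n),
        "mean_word_length": mean(len(w) for w in words) if n else 0.0,
        "type_token_ratio": len(set(words)) / n if n else 0.0,
    }


def feature_impacts(treated_docs, control_docs):
    """Difference in means (treated minus control) and its standard error, per feature.

    Assumes at least two documents per arm so the sample variance is defined.
    """
    treated = [text_features(d) for d in treated_docs]
    control = [text_features(d) for d in control_docs]
    results = {}
    for name in treated[0]:
        t_vals = [f[name] for f in treated]
        c_vals = [f[name] for f in control]
        diff = mean(t_vals) - mean(c_vals)
        # Unpooled (Neyman-style) standard error for a two-arm comparison.
        se = math.sqrt(variance(t_vals) / len(t_vals) + variance(c_vals) / len(c_vals))
        results[name] = {"diff": diff, "se": se}
    return results
```

Calling `feature_impacts(treated_docs, control_docs)` on two lists of essay strings returns one estimate per feature, which can be read alongside the impact on the human-coded score.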

Keywords
text analysis, randomized controlled trial, automated scoring, argumentative writing
Digital Object Identifier (DOI)
10.26300/ecs1-1n25

EdWorkingPaper suggested citation:

Mozer, Reagan, Luke Miratrix, Jackie Eunjung Relyea, and James S. Kim. (). Bridging human and machine scoring in experimental assessments of writing: tools, tips, and lessons learned from a field trial in education. (EdWorkingPaper: 21-493). Retrieved from Annenberg Institute at Brown University: https://doi.org/10.26300/ecs1-1n25
