Despite calls for more evaluative research in teacher education, formal assessments of the effectiveness of novel teacher education practices remain rare. One reason is that we lack research designs and measurement approaches that meet the challenges of causal inference in teacher education. In this article, we seek to address this gap. We first outline the difficulties of doing evaluative work in teacher education. We then describe a set of replicable practices for developing measures of key teaching outcomes and propose evaluative research designs that can be adapted to suit the needs of the field. Finally, we identify community-wide initiatives that are necessary to advance useful evaluative research.