Books shape how children learn about society and norms, in part through representation of different characters. We introduce new artificial intelligence methods for systematically converting images into data and apply them, along with text analysis methods, to measure the representation of race, gender, and age in award-winning children’s books from the past century. We find that more characters with darker skin color appear over time, but the most influential books persistently depict a greater proportion of light-skinned characters than other books, even after conditioning on race; we also find that children are depicted with lighter skin than adults. Relative to their growing share of the U.S. population, Black and Latinx people are underrepresented in these same books, while White males are overrepresented. Over time, females are increasingly present but appear less often in text than in images, suggesting greater symbolic inclusion in pictures than substantive inclusion in stories. We then report empirical evidence for predictions about the supply of and demand for representation that would generate these patterns. On the demand side, we show that people consume books that center their own identities. On the supply side, we document higher prices for books that center non-dominant social identities and fewer copies of these books in libraries that serve predominantly White communities. Lastly, we show that the types of children’s books purchased in a neighborhood are related to local political beliefs.
Keywords: representation, images as data, curriculum, children, education, libraries, race, gender