When Conservatives See Red but Liberals Feel Blue: Labeler Characteristics and Variation in Content Annotation

Nora Webb Williams, Andreu Casas Salleras, Kevin Aslett, John Wilkerson

Research output: Contribution to journal › Article › peer-review

Abstract

Human annotation of data, including texts and images, is a bedrock of political science research. Yet we often fail to consider how the identities of our labelers may systematically affect their annotations and our downstream applications. Collecting annotator demographic information, regardless of task type, can help us establish measurement validity and better appreciate variation in inter-rater reliability. We may also discover things about our topic that we did not previously appreciate. We demonstrate the benefits of collecting labeler characteristics with two annotation cases, one using images from the United States and the second using text from the Netherlands. For both cases on a range of tasks, we find that annotator gender and political identity are associated with significantly different annotations. We consider three main approaches to addressing labeler characteristic issues: adjusting labels based on labeler identity, weighting composite labels based on target population demographics, and intentionally modeling subgroup variation.
Original language: English
Journal: The Journal of Politics
DOIs
Publication status: Accepted/In press - 11 Feb 2025
