Jumpstarting the Justice Disciplines: A Computational-Qualitative Approach to Collecting and Analyzing Text and Image Data in Criminology and Criminal Justice Studies


Computational methods are increasingly popular in criminal justice research. As more criminal justice data becomes available in ‘big’ and other digital formats, new means of embracing the computational turn are needed. In this article, we propose a framework for data collection and case sampling using computational methods, allowing researchers to conduct thick qualitative research – analyses concerned with the particularities of a social context or phenomenon – starting from big data, which is typically associated with thinner quantitative methods and the pursuit of generalizable findings. The approach begins by using open-source web scraping algorithms to collect content from a target website, online database, or comparable online source. Next, researchers use computational techniques from the field of natural language processing to explore themes and patterns in the larger data set. Based on these initial explorations, researchers algorithmically generate a subset of data for in-depth qualitative analysis. In this computationally driven process of data collection and case sampling, the larger corpus and subset are never entirely divorced, a feature we argue has implications for traditional qualitative research techniques and tenets. To illustrate this approach, we collect, subset, and analyze three years of news releases from the Royal Canadian Mounted Police website (N=13,637) using a mix of web scraping, natural language processing, and visual discourse analysis. To enhance the pedagogical value of our intervention and facilitate replication and secondary analysis, we make all data and code available online in the form of a detailed, step-by-step tutorial.
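As a rough illustration of the explore-then-subset step described above, the following minimal sketch counts term frequencies across a small corpus and then draws a keyword-based subset for close reading. It is not the authors' code: the sample records, the stopword list, and the function names (`tokenize`, `term_frequencies`, `subset_by_terms`) are hypothetical, and simple keyword matching stands in for whatever natural language processing techniques the tutorial actually uses.

```python
import re
from collections import Counter

# Hypothetical stand-ins for scraped news releases; not real data.
releases = [
    {"id": 1, "text": "Police seek public assistance locating a missing person."},
    {"id": 2, "text": "Officers seize drugs and cash during a traffic stop."},
    {"id": 3, "text": "Missing person located safe; police thank the public."},
]

# Illustrative stopword list (an assumption, deliberately tiny).
STOPWORDS = {"a", "and", "the", "during"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_frequencies(docs):
    """Count tokens across the whole corpus as a crude thematic overview."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc["text"]))
    return counts


def subset_by_terms(docs, terms):
    """Keep only documents that contain every term in `terms`."""
    wanted = set(terms)
    return [d for d in docs if wanted <= set(tokenize(d["text"]))]


# Explore the full corpus, then generate a subset for in-depth reading.
freqs = term_frequencies(releases)
sample = subset_by_terms(releases, ["missing", "person"])
```

Because each record in `sample` keeps its original `id`, the subset stays linked to the larger corpus, so a researcher can move back and forth between close reading and corpus-level patterns.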

Journal of Criminal Justice Education