Methodology

Background

The UCLA Law COVID Behind Bars Data Project, launched in March 2020, tracks the spread and impact of the novel coronavirus in American carceral facilities and advocates for greater transparency and accountability around the pandemic response of the carceral system.

Our team of over 100 staff and volunteer researchers gathers and presents data about COVID-19 in prisons, jails, youth facilities, and immigration detention centers across the United States. We also collect information about pandemic-related prison and jail releases, legal filings and court orders bearing on the safety of incarcerated people, and grassroots organizing campaigns and fundraisers.

Since the beginning of the pandemic, the U.S. public’s ability to assess the extent and impact of viral spread has been limited by shortcomings in data reporting. Data concerning America’s carceral institutions — which confine over two million people — is particularly hard to come by, and this opacity costs lives. The crowding and subpar healthcare systems in carceral facilities make them hotspots for viral spread, and the people who work and are incarcerated in them do not have the option to socially distance. Epidemiological data reported in these settings are limited, often vaguely defined, and difficult to compare between jurisdictions. To save lives, advocates and policymakers need data to demonstrate and appreciate the urgency of this public health crisis and the need for significant decarceration to limit the spread and protect those who are medically vulnerable.

In an effort to make publicly accessible the limited data that correctional agencies report, the UCLA Law COVID Behind Bars Data Project collects and centralizes detailed, facility-level data on COVID-19 infections and related deaths for incarcerated persons and staff in the U.S. The project also gathers other data, such as population levels in carceral facilities, which are critical to contextualizing COVID-19 infection numbers and do not exist elsewhere in a unified dataset. Our methods for collecting and reporting data on prisons and jails differ from those used for data on immigration and youth facilities; these methods are explained separately below.

Prison and jail data overview

We primarily collect data from federal, state, and local correctional agency websites using web scraping programs we developed to automatically pull reported data three to four times per week.

Our core dataset includes:

  • the cumulative number of infections among both incarcerated people and staff;
  • the cumulative number of deaths among both incarcerated people and staff;
  • and the cumulative number of tests among both incarcerated people and staff.

However, correctional authorities vary dramatically in what they report publicly. For example, as of December 2020, the Pennsylvania Department of Corrections reports data for all six of these core variables (infections, deaths, and tests, each counted separately for incarcerated people and for staff), while the Mississippi Department of Corrections reports data for only one.

Further, we aim to collect and report COVID-19 facility-level data, where available, from all federal, state, and county correctional agencies across the country. In general, the Federal Bureau of Prisons reports its COVID-19 prison data by facility. However, not all state and county jurisdictions report data disaggregated by facility for all variables. We are working on a detailed table of variable availability by jurisdiction. In the meantime, please consult the individual state pages to see where we have facility-level data.

There have been instances when the values reported by an agency changed over time in ways that were unexpected based on the description of the variable. We actively work to address these issues as they arise, at times adjusting our dataset as we learn the nuances of each agency’s reporting.

When data are not available publicly, we make every effort to obtain missing information through original public records requests. In some cases, we also partner with other organizations that gather data directly from agencies. We compile our data into a spreadsheet we maintain on GitHub. Also available on GitHub is our R package behindbarstools, which includes a variety of functions to help pull, clean, wrangle, and visualize our data.

We also maintain a historical dataset that includes all data we have collected since the start of the pandemic. We are currently working to clean reporting inconsistencies in that data to enable us to display time series visualizations.

Core prison and jail variables: definitions and considerations

Active Cases

  • Residents.Active: The number of individuals currently incarcerated at a facility who have an active infection of COVID-19 and have not been deemed recovered.
  • Staff.Active: The number of individuals currently working at the facility who have an active infection of COVID-19 and have not been medically cleared to return to work.
  • Considerations: Though testing data is not always made available, we know that testing practices vary widely by correctional agency. As a result, true case counts are likely higher than reported, and the extent of this underdetection is extremely variable.

    Finally, not all agencies report data on staff COVID-19 cases. Some jurisdictions leave it to staff members’ discretion whether to report positive test results they receive from community healthcare providers. As a result, the number of staff cases reported may be lower even than the number detected by testing.

Cumulative Cases

  • Residents.Confirmed: The cumulative number of confirmed COVID-19 cases among persons incarcerated at a facility.
  • Staff.Confirmed: The cumulative number of confirmed COVID-19 cases among staff members at a facility.
  • Considerations: In many cases, agencies consider “cumulative cases” to mean the number of individuals who were ever at a facility who have ever been infected with COVID-19. However, some agencies, such as the Ohio Department of Rehabilitation and Correction, consider “cumulative cases” to mean the number of individuals who currently reside in a facility who have ever been infected with COVID-19. Therefore, the reported facility-level value can decrease as people are transferred or released. For more details on methodology by state, see the section “Notes about Individual States” in our data dictionary.

    Another important consideration is that, though testing data is not always made available, we know that testing practices vary widely by correctional agency. As a result, true case counts are likely higher than reported, and the extent of this underdetection is extremely variable.

    Finally, not all agencies report data on staff COVID-19 cases. Some jurisdictions leave it to staff members’ discretion whether to report positive test results they receive from community healthcare providers. As a result, the number of staff cases reported may be lower even than the number detected by testing.
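Because some agencies count only current residents, a facility's reported cumulative total can fall from one report to the next. The following is a minimal sketch of how such decreases might be flagged for review. It is illustrative only and is not taken from our scrapers or from behindbarstools; the function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def find_decreases(counts: List[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous, current) for each report where a
    supposedly cumulative count is lower than the report before it."""
    decreases = []
    for i in range(1, len(counts)):
        if counts[i] < counts[i - 1]:
            decreases.append((i, counts[i - 1], counts[i]))
    return decreases


# Hypothetical Residents.Confirmed values for one facility, oldest first.
reported = [10, 25, 40, 38, 41, 41, 35]

flagged = find_decreases(reported)
for index, previous, current in flagged:
    print(f"report {index}: fell from {previous} to {current}")
```

A flagged decrease is not necessarily an error. It may simply reflect transfers or releases under a "current residents" definition, so each one still has to be read against the agency's own description of the variable.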

Deaths

  • Residents.Deaths: The cumulative number of incarcerated persons who have died with or from COVID-19 while in custody of a facility.
  • Staff.Deaths: The cumulative number of staff who have died from COVID-19 while employed by a facility.
  • Considerations: Many correctional agencies only report death figures in aggregate, rather than by facility. Furthermore, agencies differ in the categories of deaths they report as COVID-19-related. Some agencies include all deaths suspected of being related to COVID-19, and some include only those with a positive test result. Some jurisdictions do not include deaths that occur after or while someone has COVID-19 if the medical provider or examiner declares an alternative cause of death, such as a heart attack. There have also been instances where jurisdictions have not counted the deaths of people who died of COVID-19 after being released, even when they contracted the virus inside a facility. We are currently investigating suspected undercounts in COVID-19-related deaths. If a news source reports deaths that a correctional agency has not reported, we add the deaths to our data count. When this occurs, we make a note of it in the Add’l.Notes column. This is the only instance in which we enter information from a news source that might conflict with an agency’s self-reported data.

    Finally, not all agencies report data on staff deaths related to COVID-19.

Tests

  • Residents.Tested: The cumulative number of incarcerated persons tested for COVID-19, or the cumulative number of tests performed on incarcerated persons.
  • Staff.Tested: The cumulative number of staff tested for COVID-19, or the cumulative number of tests performed on staff.
  • Considerations: As noted in the definitions above, some agencies report the number of persons tested, while others report the number of tests administered. It is not always clear which is being reported; we record whichever number is available and make a note in our dataset where possible. A link to information directly from the agency about how it reports testing data can be found in the “Sources” column of our dataset on GitHub.

How We Calculate Rates of Infection and Death Among Incarcerated People in Prisons and Jails

Our website displays approximate rates per population for cumulative cases, active cases, and deaths among incarcerated people at the facility level. Rates provide necessary context for understanding the severity of the COVID-19 situation within a particular facility or jurisdiction. For example, 600 cases in a facility detaining 1,000 people represent more significant viral spread than does the same number of cases in a facility detaining 5,000 people.

We calculate these rates by dividing the total number of infections or deaths at a facility (numerator) by the population of people incarcerated at that facility (denominator). On each state page, we also aggregate the numerators and denominators for all federal, state, and county facilities to approximate rates by jurisdiction type within a particular state.
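The calculation described above can be sketched as follows. This is a simplified illustration rather than the code behind our website: the facility names, the field names, and the choice to return no rate when a population is missing or zero are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Optional


def rate(numerator: int, denominator: Optional[int]) -> Optional[float]:
    """Divide a count by a population; return None when no usable
    population is available (an assumption made for this sketch)."""
    if not denominator:
        return None
    return numerator / denominator


# Hypothetical facilities; the first two mirror the 600-case example above.
facilities = [
    {"name": "Facility A", "jurisdiction": "state", "cases": 600, "population": 1000},
    {"name": "Facility B", "jurisdiction": "state", "cases": 600, "population": 5000},
    {"name": "Facility C", "jurisdiction": "county", "cases": 30, "population": 200},
]

# Facility-level rates: numerator divided by denominator.
facility_rates = {f["name"]: rate(f["cases"], f["population"]) for f in facilities}

# Jurisdiction-level rates: sum numerators and denominators first, then divide.
totals = defaultdict(lambda: {"cases": 0, "population": 0})
for f in facilities:
    totals[f["jurisdiction"]]["cases"] += f["cases"]
    totals[f["jurisdiction"]]["population"] += f["population"]

jurisdiction_rates = {j: rate(t["cases"], t["population"]) for j, t in totals.items()}

print(facility_rates)
print(jurisdiction_rates)
```

Summing before dividing weights each facility by its population. Averaging the facility-level rates instead would give a small facility the same influence as a large one.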

We collected our population data from agency websites and through public records requests to correctional agencies; we filled in gaps with the Homeland Infrastructure Foundation-Level Data (HIFLD) Prison Boundaries dataset maintained by the Department of Homeland Security, which contains population estimates for approximately 6,700 facilities. We are currently in the process of adding all population data and sources to our dataset on GitHub.

Because the majority of correctional agencies do not regularly report up-to-date facility-level population figures and we are reporting cumulative counts since the start of the pandemic, we chose to align our population denominators with data reflecting that point in time where possible: the most recently reported facility populations as of February 2020. These figures do not account for subsequent releases, intakes, and movement between facilities.

We use the last reported facility population total for a date prior to the end of February 2020. However, some agencies have not updated their population level reports since December 2019. For facilities where we relied on HIFLD data, the figures may be up to five years old.
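One way to express this selection rule is sketched below. It is a hypothetical illustration, not our actual procedure: the function name and sample figures are invented, and treating February 29, 2020 as an inclusive cutoff is an assumption.

```python
from datetime import date
from typing import List, Optional, Tuple

# Assumed cutoff: the last day of February 2020, treated as inclusive.
CUTOFF = date(2020, 2, 29)


def baseline_population(
    reports: List[Tuple[date, int]], cutoff: date = CUTOFF
) -> Optional[Tuple[date, int]]:
    """Return the most recent (report date, population) pair dated on or
    before the cutoff, or None if no such report exists."""
    eligible = [r for r in reports if r[0] <= cutoff]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r[0])


# Hypothetical population reports for a single facility.
reports = [
    (date(2019, 12, 31), 1480),
    (date(2020, 1, 31), 1502),
    (date(2020, 3, 31), 1390),  # after the cutoff, so ignored
]

chosen = baseline_population(reports)
print(chosen)
```

When the function returns None, the facility has no qualifying report, which corresponds to the cases where another source such as HIFLD would be needed.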

Because the data we collect are reported in aggregate rather than at the individual case level, turnover in prison and jail populations makes it impossible for us to determine whether people who have been infected with COVID-19 at some point during the pandemic are still among those incarcerated (and, if so, whether they remain at the same facility). As a result, the universe of people for the infection and death counts (numerators) is not the same as the universe for the population counts (denominators). Eventually, at some facilities with large outbreaks, this may mean that numerators exceed denominators.

We are currently gathering more detailed time-series population data from many jurisdictions and will make that data available in the future.

We do not have reliable data for staffing levels at all facilities. It is also difficult to determine who each agency includes in its definition of “staff,” how many staff are actually present in each facility (versus on leave or working in administrative offices), and whether staff work within one or multiple facilities. Due to these complexities, we are not currently providing rates for staff, but we hope to do so in the future.

How to Cite Our Prison and Jail Data

Citations for academic publications and research reports:

Sharon Dolovich, Aaron Littman, Kalind Parish, Grace DiLaura, Chase Hommeyer, Michael Everett, Hope Johnson, Neal Marquez, and Erika Tyagi. UCLA Law COVID Behind Bars Data Project: Prison/Jail Cases and Deaths Dataset [date you downloaded the data]. UCLA School of Law, uclacovidbehindbars.org.

Citations for media outlets, policy briefs, and online resources:

UCLA Law COVID Behind Bars Data Project, uclacovidbehindbars.org.

Data licensing:

Our data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. That means that you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may not use our data for commercial purposes, which means anything primarily intended for or directed toward commercial advantage or monetary compensation. If you remix, transform, or build upon our data, you must distribute your contributions under the same license.

Data on Legal Filings and Court Orders

Our project collaborates with Bronx Defenders, Columbia Law School’s Center for Institutional and Social Change, and Zealous to collect legal documents from around the country related to COVID-19 and incarceration. Together, we then organize and code them into the jointly managed Health is Justice litigation hub for public defenders, litigators, and other advocates. The majority of the legal documents in the Health is Justice litigation hub are federal court opinions, but we are expanding to state legal filings, declarations, and exhibits.

Beyond the Health is Justice litigation hub, our project manages data self-reported by advocates regarding COVID-19-related legal filings involving incarcerated youth (via this form) and individuals in immigration detention (via this form).

Our data reflect only a subset of filings and are by no means exhaustive.

Data on Prison and Jail Releases

We collect data on jurisdictions across the U.S. that have released people from adult prison and jail custody in response to the COVID-19 pandemic. Variables we collect include the official or agency authorizing release, the number of people released, and details on the conditions for release, among others. Our data collection relies on published articles, reports, and data demonstrating realized and pandemic-related releases. We also perform strategic outreach to various legal and advocacy groups to request pertinent data via this form. For the most part, we only include release efforts where the data source includes some sort of programmatic description of who is being released (e.g., people with technical violations of parole, people charged with non-violent crimes, etc.). Though we maintain the most complete dataset of COVID-19-related release efforts, it reflects only a subset of efforts and is by no means exhaustive.

Data on Immigration Detention

We scrape the Immigration and Customs Enforcement (ICE) website daily to collect data relating to COVID-19 infections and deaths of detainees and staff within all 120 ICE facilities, as well as other facilities detaining people under ICE jurisdiction, across the United States. ICE only provides a live dataset, meaning that when new data is published, it replaces the previously published data without creating an archive. Therefore, we maintain a historical dataset and merge in newly scraped data as we collect it so that we can understand how the pandemic has unfolded in ICE facilities over time. ICE first started disclosing COVID-19 cases in detention facilities on March 26, 2020, and the agency stopped providing updates regarding COVID-19-related cases and deaths among staff on June 18, 2020.
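Because each ICE update overwrites the previous one, preserving history means storing every scraped snapshot under its scrape date. The sketch below illustrates that idea in simplified form. It is not our scraper: the function name, the field names ("facility", "confirmed"), and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

# Each historical record is keyed by (facility name, scrape date).
Key = Tuple[str, str]


def merge_snapshot(
    history: Dict[Key, dict], snapshot: List[dict], scrape_date: str
) -> Dict[Key, dict]:
    """Add one day's scraped rows to the historical record.

    Keying on (facility, date) means that re-running the merge for the
    same date overwrites that day's rows instead of duplicating them.
    """
    for row in snapshot:
        history[(row["facility"], scrape_date)] = {**row, "date": scrape_date}
    return history


history: Dict[Key, dict] = {}

# Two hypothetical daily scrapes of the live page, then a repeat of the second.
merge_snapshot(history, [{"facility": "Facility A", "confirmed": 3}], "2020-04-01")
merge_snapshot(history, [{"facility": "Facility A", "confirmed": 5}], "2020-04-02")
merge_snapshot(history, [{"facility": "Facility A", "confirmed": 5}], "2020-04-02")

for key in sorted(history):
    print(key, history[key]["confirmed"])
```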

Data on Grassroots and Community Organizing Efforts

Our team collects data on grassroots and community organizing efforts by incarcerated people, their families, community-based organizations, nonprofits, and advocates aimed at influencing government agencies to protect the lives of people incarcerated in prisons, jails, and detention centers against the threats posed by COVID-19.

In the context of this project, we define grassroots organizing as efforts and actions planned by, for, and with incarcerated people. We define community organizing as efforts and actions planned by community-based organizations.

We do not include efforts for which we have no basis to believe that the action was connected to a health and safety risk posed by COVID-19 in a carceral facility, or efforts that are too individualized to be considered an organizing effort.

The Perilous Chronicle’s List of Prisoner Actions is a vital source for our data, as are social media platforms such as Twitter, where organizers often promote their efforts. For example, our team has learned about many efforts by following the hashtag #FreeThemAll, commonly used to amplify demands by organizers inside and outside facilities to release incarcerated individuals to prevent COVID-19 outbreaks. We also gather data from news reports and social media to enhance our existing understanding of efforts happening on the ground. Finally, we ask organizers to self-report their efforts through this form. Our grassroots and community organizing data reflect only a subset of efforts and are by no means exhaustive.