Data Resources for Research

There are a number of datasets available for social scientists to use for research and analysis. This resource guide provides an overview of available datasets, research tools, and computing resources. If you would like to add your resource to this list, please email

Datasets are organized in three categories: (1) datasets created at the University of Chicago and/or the National Opinion Research Center (NORC); (2) datasets available through the University of Chicago, including the library and research centers on campus; and (3) publicly available datasets.

For a personal consultation about available repositories/data holdings and public use data, contact Elizabeth Foster, Social Sciences Data Librarian.

Array of Things
The Array of Things (AoT) is a collaborative effort among scientists, universities, federal/local government, industry partners, and communities to collect real-time data on the urban environment, infrastructure, and activity for research and public use. AoT is an experimental urban measurement system comprising programmable, modular “nodes” with sensors and computing capability so that they can analyze data internally, for instance counting the number of vehicles at an intersection (and then deleting the image data rather than sending it to a data center). AoT nodes are installed in Chicago and a growing number of partner cities. The concept of AoT is analogous to a “fitness tracker” for the city, measuring factors that impact livability in the urban environment, such as climate, air quality, and noise.
Access: Publicly Available.

CRDW COVID-19 Data Mart through CRI
The COVID-19 data mart includes de-identified structured data on UCMC patient demographics, encounters, diagnoses, labs, medications, flow sheets, and procedures. Additional data will be added based on resource availability and urgency.
Access: Data marts are accessible via UChicago Box. Contact Julie Johnson at with your name and CNET ID.

General Social Survey (GSS) from NORC
Gathers data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes; to examine the structure and functioning of society in general as well as the role played by relevant subgroups; to compare the United States to other societies in order to place American society in comparative perspective and develop cross-national models of human society; and to make high-quality data easily accessible to scholars, students, policy makers, and others, with minimal cost and waiting. The 2018 GSS includes modules on science, discrimination, attitudes regarding abortions, pets, quality of working life, self-assessments of physical and psychological health, attitudes toward people with mental health problems, the role of the natural environment in people’s lives, religion, social networks and social resources.
Access: Publicly Available.

National Social Life, Health, and Aging Project (NSHAP) from NORC
Measures the well-being of older, community-dwelling Americans by examining the interactions among physical health/illness, medication use, cognitive function, emotional health, sensory function, health behaviors, social connectedness, sexuality, and relationship quality. There are three waves available, each with between 3,000 and more than 4,750 respondents. For all waves to date, data collection has included three measurements: in-home interviews, biomeasures, and leave-behind respondent-administered questionnaires (LBQ).
Access: Publicly Available.

Pastor-Stambaugh Liquidity Factor
Aggregate liquidity data from the Pastor-Stambaugh (2003) model, covering 1962 through 2019.
Access: Publicly Available through the Fama-Miller Center at Chicago Booth

Atlas Muni
A municipal bond database that allows clients to leverage data and research to power their solutions such as investment research, financial models, investment advisory and financial software platforms. Data come with comprehensive obligor definition, unique overlapping debt and total debt data, fully integrated demographic/economic database, point-in-time data structure, and unbiased credit score.
Access: Request access to Atlas Muni Data through the Fama-Miller Center at Chicago Booth.

Bureau of Labor Statistics Dataset
Includes Inflation & Prices, Employment & Unemployment, and Pay & Benefits data processed for STATA users.
Access: Request access to the dataset through the Initiative on Global Markets at Chicago Booth.

Provides the latest in prevailing opinion about the future direction and level of US interest rates. Survey participants such as Bank of America, Goldman Sachs & Co., Swiss Re, Loomis, Sayles & Company, and J. P. Morgan Chase provide forecasts for each of the next six quarters for several variables. Each forecaster’s prediction is published along with the average, or consensus forecast, for each variable. There are also averages of the 10 highest and 10 lowest forecasts for each variable; a median forecast to eliminate the effects of extreme forecasts on the consensus; the number of forecasts raised, lowered, or left unchanged from a month ago; and a diffusion index that indicates shifts in sentiment that sometimes occur prior to changes in the consensus forecast.
Access: Available for free through the Fama-Miller Center at Chicago Booth

Consumer Brand Analytics
Data from an annual survey of consumer brand preferences and related data for over 200 brands in 19 consumer brand product categories. Contains over 5,000 respondents.
Access: UChicago Library

CoreLogic Loan-Level Market Analytics
Combines mortgage, property, and anonymized borrower data with focused analytics to provide information on risk. Contains origination and performance data on more than 170 million agency and non-agency loans, dating back to 1999. Updated monthly.
Access: A Research Computing Center account is required. For those without PI eligibility, a sponsoring PI must approve the account request.

CoreLogic – Tax and Deed History
The nation’s largest repository of real estate data, including information on taxes, deeds, foreclosures, involuntary liens, HOA building permits, and MLS data.
Access: Contact Colleen Reda at the Becker-Friedman Institute to request access.

Consumer-level data linked to LPS McDash loan-level data for improved risk management. Provides consumer risk scores and default indicators for all mortgages within LPS McDash. Data is available from June 2005 to the present and is updated monthly.
Access: A Research Computing Center account is required. For those without PI eligibility, a sponsoring PI must approve the account request.

Cross-National Time-Series Data Archive
Unique domestic conflict event data for every country include: Assassinations, General Strikes, Guerrilla Warfare, Major Government Crises, Purges, Riots, Revolutions, and Anti-government Demonstrations.
Access: UChicago Library

Digest of Education Statistics
A comprehensive source of statistical information on American education for the Pre-K through Post-Secondary level. Includes data from both government and private sources. Material is nationwide in scope on a variety of subjects, including the number of schools and colleges, teachers, enrollments, and graduates, in addition to educational attainment, finances, federal funds for education, libraries, and international education. Supplemental information provides background for evaluating education data.
Access: UChicago Library

Dominick’s Dataset
Covers store-level scanner data collected at Dominick’s Finer Foods from 1989 through 1994. This data is unique for the breadth of its coverage and for the information available on retail margins.
Access: Available through the Kilts Center at Chicago Booth

ED Data Express
Provides data from the U.S. Department of Education at the national, state, and district levels on elementary and secondary schools. Includes data from EDFacts, Consolidated State Performance Reports (CSPR), and the Department’s Budget Service office.
Access: UChicago Library

Contains data collected from 1985 through 1988 by the now-defunct ERIM division of A.C. Nielsen on panels of households in two midsized Midwestern cities. Information is available on the purchases of households in a number of product categories along with household demographic information. For a subset of households, TV viewing data is available to measure exposure to commercials involving the products in each of the ERIM categories.
Access: Available through the Kilts Center at Chicago Booth

Eurostat is the statistical office of the European Union. Eurostat data contains indicators (short-term, structural, theme-specific, etc.) on the European Union and the Euro area, the Member States, and their partners. Data is collected to identify statistical solutions for new policy needs, such as indicators for poverty measurement or sustainability.
Access: UChicago Library

FRED (Federal Reserve Economic Data)
Created and maintained by the Research Department at the Federal Reserve Bank of St. Louis, FRED is an online database consisting of hundreds of thousands of economic data time series from national, international, public, and private sources.
Access: UChicago Library

Global Health Observatory (WHO)
The GHO data repository is the World Health Organization’s gateway to health-related statistics for its 194 Member States. It provides access to over 1000 indicators on priority health topics including mortality and burden of diseases, the Millennium Development Goals (child nutrition, child health, maternal and reproductive health, immunization, HIV/AIDS, tuberculosis, malaria, neglected diseases, water and sanitation), noncommunicable diseases and risk factors, epidemic-prone diseases, health systems, environmental health, violence and injuries, and equity, among others.
Access: UChicago Library

Historical Statistics of the United States: Millennial Edition
Compilation of over 37,000 statistical series from over 1,000 sources, covering population, work and welfare, economic structure and performance, economic sectors, and governance and international relations. All tables include citations to data sources and descriptions of data anomalies. Each major section includes a signed essay that puts the statistics in historical context.
Access: UChicago Library

ICPSR (The Inter-University Consortium for Political and Social Research) is a repository of social science data that has been shared by researchers. It includes opinion polls, longitudinal studies, census data, social data and much more.
Access: UChicago Library

ICPSR Child and Family Data Archive
The ICPSR Child and Family Data Archive is the place to discover, access, and analyze data on early care, education, and families. The archive hosts datasets about young children, their families and communities, and the programs that serve them.
Access: UChicago Library

International Data Explorer
National Center for Education Statistics (NCES) data for the U.S. and more than 80 foreign education systems. Data comes from a number of large-scale, international studies: Program for International Student Assessment (PISA), Progress in International Reading Literacy Study (PIRLS), Trends in International Mathematics and Science Study (TIMSS), Program for the International Assessment of Adult Competencies (PIAAC), and Teaching and Learning International Survey (TALIS).
Access: UChicago Library

International Historical Statistics
International Historical Statistics is a compendium of national and international socio-economic data from 1750 to 2010. Includes data on population, labor force, agriculture, industry, trade, transport and communications, finance, commodity prices, education, and national accounts.
Access: UChicago Library

International Monetary Fund (IMF) Data
Statistics on all aspects of international and domestic government finance. Includes exchange rates, international liquidity, money and banking, interest rates, production prices, international transactions, and government and national accounts.
Access: UChicago Library

Korean Social Science Data Center (KSDC) 한국사회과학 데이터센터; Hanguk Sahoe Kwahak Deito Sento
Consists of five databases: an integrated domestic statistical database (covering forty government statistical yearbooks in economics, education, labor, industry, information and communication, law, social affairs, population and census, health and welfare, and environment), a domestic opinion polls database, an integrated foreign statistical database, election materials in foreign countries, and the Inter-University Consortium for Political and Social Research.
Access: UChicago Library

KOSSDA: Korean Social Science Data Archive
Data repository with a vast range of Korean qualitative and quantitative data and literature across social science disciplines, including politics, government, law, business, finance, society, culture, social problems, welfare, psychology, education, and regional studies.
Access: UChicago Library

Nielsen Ad Intel Data
Ad Intel Data cover advertising occurrences and advertisement impression and universe estimate information for a variety of media types across the United States. Data is updated annually, beginning with 2010, and is compatible with other Nielsen datasets.
Access: Available to UChicago tenured/tenure-track faculty and their post-doc and PhD advisees. Registration via the Kilts Center for Marketing at Chicago Booth is required.

Nielsen Consumer Panel Data
Consumer Panel Data comprise a representative panel of 40,000 to 60,000 households that continually provide information about their purchases. Includes demographic, geographic, and product ownership data for the panelists and product data for purchases. Includes all Nielsen food/nonfood departments and retail channels. Covers the entire U.S. Data is updated annually, beginning with 2004, and is compatible with other Nielsen datasets.
Access: Available to UChicago tenured/tenure-track faculty and their post-doc and PhD advisees. Registration via the Kilts Center for Marketing at Chicago Booth is required.

Nielsen Retail Scanner Data
Retail Scanner Data consist of weekly pricing, volume, and store environment information generated by point-of-sale systems. Includes UPC data from more than 35,000 participating retailers across all U.S. markets. Data is updated annually, beginning with 2006, and is compatible with other Nielsen datasets.
Access: Available to UChicago tenured/tenure-track faculty and their post-doc and PhD advisees. Registration via the Kilts Center for Marketing at Chicago Booth is required.

This repository brings together most of the statistical databases in SourceOECD, allowing quick searching in one place. Includes a wide range of data for OECD countries and selected non-member economies.
Access: UChicago Library

Political Slant of U.S. Daily Newspapers (2005)
Automated searches of newspaper articles and congressional records were conducted on 1,000 key phrases addressing such issues as abortion, gun control, taxes, health care, war, the environment, immigration policy, stem cell research, and minorities. The researchers then constructed a new index that measured the similarity of a news outlet’s language to that of a congressional Republican or Democrat. Access the dataset through the Chicago Booth Initiative on Global Markets. Subject areas: anthropology, history, policy research, political science, sociology
Access: Via ICPSR, requires VPN from off campus

Polling the Nations
Compilation of more than 14,000 public opinion surveys conducted by more than 700 polling organizations in the United States and more than 80 other countries from 1986 to the present.
Access: UChicago Library

Roper Center for Public Opinion Research
Archives thousands of polls and surveys from the United States and some 70 foreign countries undertaken by many polling organizations on a very wide range of topics. Many datasets may be downloaded.
Access: UChicago Library

Social Explorer
Detailed reference tool for current and historical U.S. Census data from 1790 to the present. Data may be generated by browsing maps or building reports.
Access: UChicago Library; licensed for five users at a time

Sports Market Analytics
SMA is a single resource for analytics of sports marketing trends, brand preferences, general media habits, sponsorship, social media, fantasy, eSports, participation, and sports equipment and footwear. It features wide-ranging resources tailored to meet specific needs of sports teams, media, leagues, agencies, manufacturers, retailers, and sports educational institutions.
Access: UChicago Library

Includes data on the global digital economy, industrial sectors, consumer markets, public opinion, media, demography and macroeconomic trends. Quantitative data from 425 economic sectors in 50 countries are provided with a range of infographic tools for analysis and visualisation. Available in English, French, German, and Spanish.
Access: UChicago Library

Statista Global Consumer Survey
Includes data from a survey of 64,000 consumers from 27 countries, representing 83 percent of the world’s gross domestic product. The study includes up-to-date insights about over 50 industries and topics from the online and offline world, as well as more than 700 international brands from 25 categories.
Access: UChicago Library

The Supreme Court Database
More than 240 data points for every U.S. Supreme Court case from the 1791 to 2018 terms. Tables may be downloaded for use with Excel, R, or STATA, or datasets or cross-tabulations may be generated online.
Access: UChicago Library

United Nations Data (UNData)
UNData is a multi-disciplinary source providing official statistics produced by countries and compiled by the United Nations data system. The database also includes estimates and projections. Themes include Agriculture, Crime, Education, Employment, Energy, Environment, Health, HIV/AIDS, Human Development, Industry, Information and Communication Technology, National Accounts, Population, Refugees, Tourism, Trade, as well as the Millennium Development Goals indicators.
Access: UChicago Library

Wanfang Data
Includes data relating to Chinese history, society, religions, family planning, Chinese legislation and jurisdiction, industrial innovation/technological improvement, business opportunities, Traditional Chinese Medicine (TCM), industry intelligence (pharmaceuticals, chemicals, automobile, metallurgy, finance/banking, insurance, telecommunications, manufacturing, etc.), Chinese food/cuisine, and more.
Access: UChicago Library

World Development Indicators Online
Includes the World Development Indicators and Global Development Finance databases. Contains over 500 time-series indicators under the headings world view, environment, economy, states and markets, and global links. Data are at the country level only and are updated annually.
Access: UChicago Library

World Religion Database (Brill)
Contains detailed statistics on religious affiliation for every country of the world. It provides source material, including censuses and surveys, as well as best estimates for every religion to offer a definitive picture of international religious demography. It offers best estimates at multiple dates for each of the world’s religions for the period 1900 to 2050.
Access: UChicago Library

WTO Statistics Database
Provides quantitative information in relation to economic and trade policy issues. Its databases and publications provide access to data on trade flows, tariffs, non-tariff measures (NTMs), and trade in value added.
Access: UChicago Library

Baccalaureate and Beyond (B&B)
The Baccalaureate and Beyond Longitudinal Study (B&B) examines students’ education and work experiences after they complete a bachelor’s degree, with a special emphasis on the experiences of new elementary and secondary teachers. Following several cohorts of students over time, B&B looks at bachelor’s degree recipients’ workforce participation, income and debt repayment, and entry into and persistence through graduate school programs, among other indicators.
Access: Unrestricted data publicly available. Access to restricted data is available through the Sociology department and requires a brief proposal and a notarized affidavit; contact Beverly Levy for details on the application procedure.

Bureau of Economic Analysis
U.S. Department of Commerce data, including information on (1) National Accounts: GDP, personal income, corporate profits, fixed assets, and integrated macroeconomic accounts; (2) International Accounts: balance of payments, trade, and activities of multinational enterprises; (3) Regional: state and metro GDP, state and local area income, and regional input-output multipliers; and (4) Industry: GDP by industry accounts, input-output accounts, U.S. travel and tourism accounts, arts and cultural production accounts, and integrated BEA/BLS industry-level production accounts.
Access: Publicly Available

Bureau of Labor Statistics
Data compiled by the U.S. Bureau of Labor Statistics. Provides information for the U.S. on inflation and prices, pay and benefits, employment and unemployment, spending and time use, workplace injuries, occupational requirements, and productivity.
Access: Publicly Available

Corpus of Contemporary American English (COCA)
Contains more than one billion words of text (20 million words each year, 1990-2019) from eight genres: spoken, fiction, popular magazines, newspapers, academic texts, TV and movie subtitles, blogs, and other web pages.
Access: Open Access, restricted to 50 queries per day for free accounts

Corpus of Historical American English (COHA)
Contains more than 400 million words of text from the 1810s-2000s and is balanced by genre, decade by decade. Allows users to create virtual corpora, personalized collections of texts related to a particular area of interest, and to search specific genres.
Access: Publicly Available, restricted to 50 queries per day for free accounts

CPS School Data
Survey, Metrics, Demographic, Assessment, Accountability, and Annual Regional Analysis data for Chicago Public Schools. Aggregated data available by school or for the district as a whole.
Access: Publicly Available

Early Childhood Longitudinal Studies (ECLS) Program
The Early Childhood Longitudinal Study (ECLS) program includes four longitudinal studies that examine child development, school readiness, and early school experiences. The birth cohort of the ECLS-B is a sample of children born in 2001 followed from birth through kindergarten entry. ECLS-K followed children from the kindergarten class of 1999 through eighth grade. ECLS-K:2011 followed children from the kindergarten class of 2011 through fifth grade. ECLS-K:2023 will follow children from the kindergarten class of 2023 through fifth grade.
Access: Unrestricted data publicly available. Access to restricted data is available through the Sociology department and requires a brief proposal and a notarized affidavit; contact Beverly Levy for details on the application procedure.

Economic Census
The Census Bureau’s Economic Census has measured U.S. economic activity every five years since the first Census of Manufactures in 1810. The scope has expanded to include retail and wholesale trade, construction industries, mining, and a broad array of service industries. The extensive and comprehensive data products include over 950 detailed industries across 18 industrial sectors classified using the North American Industry Classification System (NAICS).
Access: Publicly Available

Economic Indicators and Releases from NBER
Current economic data compiled by the National Bureau of Economic Research. Updated daily and includes links to archives, where available.
Access: Publicly Available

EconStats is an online database which compiles statistics from various official sources (BEA, Census, Federal Reserve, etc.). U.S. data include government debt and deficits, NIPA, GDP, PCE, industrial production and capacity utilization, durable goods, trade, wholesale inventory, housing, construction spending, retail sales, consumer credit, employment, interest rates, and major indicators.
Access: Publicly Available

Education Longitudinal Study (ELS)
The Education Longitudinal Study of 2002 built upon the National Education Longitudinal Study of 1988. Students were tested in reading, mathematics, science, and social studies. Follow-up surveys were administered in 2004, 2006, and 2012.
Access: Unrestricted data publicly available. Access to restricted data is available through the Sociology department and requires a brief proposal and a notarized affidavit; contact Beverly Levy for details on the application procedure.

Harris Vault
Begun in 1963, the Harris Poll is one of the world’s longest-running public opinion surveys. The Vault provides access to more than 3,000 U.S. and international polls dating back to 1970.
Access: Publicly Available

Head Start Impact Study (HSIS)
The National Head Start Impact Study is a longitudinal study that followed approximately 5,000 three- and four-year-old preschool children through third grade. It compared school readiness and educational outcomes for children enrolled in Head Start to those of children not enrolled in the program.
Access: Unrestricted data publicly available. Access to restricted data is available through the Sociology department and requires a brief proposal and a notarized affidavit; contact Beverly Levy for details on the application procedure. 

Aggregation of more than 1000 dynamic and static data sets, compiled by the Cato Institute. Subjects and providers vary widely.
Access: Publicly Available

Illinois State Board of Education Report Card Data Library
The Illinois Report Card Data Library page is the repository for Report Card data available for public use. Includes statewide trend data, report card glossary of terms, and the public data files from which the Report Card is produced annually. The data files include student and teacher demographics, standardized test scores, graduation and truancy rates, financial information, and more.
Access: Publicly Available

Interactive Tariff and Trade DataWeb
Produced by the United States International Trade Commission. Customized import and export data that may be viewed by country, value, etc., or by HTS, SIC, SITC, and NAICS codes.
Access: Publicly Available

International Statistical Agencies
A list of international statistical agencies, compiled by the US Census Bureau.
Access: Publicly Available

National Archive of Criminal Justice Data (NACJD)
Established in 1978, the National Archive of Criminal Justice Data (NACJD) archives and disseminates data on crime and justice for secondary analysis. The archive contains data from over 2,700 curated studies or statistical data series.
Access: Unrestricted data publicly available. Restricted data requires DUA with endorsement from URA.

A list of economic datasets compiled by the National Bureau of Economic Research.
Access: Publicly Available

NCES Datalab
Collection of datasets provided by the National Center for Education Statistics (NCES), the primary federal entity for collecting and analyzing data related to education.
Access: Publicly Available

National Education Longitudinal Study (NELS)
The National Education Longitudinal Study of 1988 was launched in the spring of the 1987-88 school year. Students were tested in reading, mathematics, science, and social studies. Follow-up surveys were administered in 1990, 1992, and 2000.
Access: Unrestricted data publicly available. Access to restricted data is available through the Sociology department and requires a brief proposal and a notarized affidavit; contact Beverly Levy for details on the application procedure.

Project on Human Development in Chicago Neighborhoods Community Surveys
The Community Surveys, conducted in 1994-95, measured the structural conditions and organization of neighborhoods in Chicago with respect to the dynamic structure of the local community, the neighborhood organizational and political structures, cultural values, informal and formal social control, and social cohesion.
Access: Unrestricted data publicly available. Restricted data requires DUA with endorsement from URA.

Project on Human Development in Chicago Neighborhoods Infant Assessment Unit
As part of the Longitudinal Cohort Study, 412 infants from the birth cohort and their primary caregivers were studied during wave 1 (1994-1997) to examine the effects of prenatal and postnatal conditions on the health and cognitive functioning of infants in the first year of life. The Infant Assessment Unit (IAU) also sought to link early developmental processes and the onset of antisocial behavior and to measure the strength of these relationships.
Access: Unrestricted data publicly available. Restricted data requires DUA with endorsement from URA.

Project on Human Development in Chicago Neighborhoods Longitudinal Cohort Study
The Longitudinal Cohort Study collected three waves of data (1994-97, 1997-99, 2000-01) from a sample of children, adolescents, young adults, and their primary caregivers. Seven cohorts of respondents were randomly selected to study the changing circumstances of their lives and the personal characteristics that may lead them towards or away from a variety of antisocial behaviors.
Access: Unrestricted data publicly available. Restricted data requires DUA with endorsement from URA.

Project on Human Development in Chicago Neighborhoods Systematic Social Observations
Systematic Social Observation (SSO) is a standardized approach for directly observing the physical, social, and economic characteristics of neighborhoods, one block at a time. The main objective of the SSO was to measure the effects of neighborhood characteristics upon young people’s development, specifically the variables associated with youth violence. SSO data were collected in 1995.
Access: Unrestricted data publicly available. Restricted data requires DUA with endorsement from URA.

Regional Economic Accounts
Produced by the Bureau of Economic Analysis. Provides local area economic data for states, counties, and metropolitan areas for 1969 to the present. Statistics include personal income and earnings, full- and part-time employment, transfer payments, and farm income and expenses.
Access: Publicly Available

United Nations Main National Accounts Aggregates
The National Accounts Statistics database contains a complete and consistent set of time series from 1970 onwards of main national accounts aggregates for all UN Member States and all other countries and areas in the world.
Access: Publicly Available

USA Trade Online
Detailed data on imports and exports for the United States. Coverage begins in 1992 and is updated monthly.
Access: Publicly Available.

Do not use any new tools for human subjects data without first contacting your IT department and IRB office in order to vet the product.

PIMCO Decision Research Virtual Laboratory
PIMCO and the Center for Decision Research at UChicago Booth have partnered to launch a virtual laboratory, facilitating online research in the fields of social and cognitive psychology, economics, and neuroscience. Using Qualtrics and Zoom, the Virtual Lab can administer surveys and qualitative interviews to its participant pool. IRB protocols must be amended to include new study delivery methods. New studies should be requested online. Questions should be directed to Bryan Baird.

Qualtrics Survey Services
Qualtrics provides a robust system for generating, publishing and analyzing survey data, coupled with a powerful interface. The University of Chicago requires that all users conducting research with human subjects obtain IRB approval or exemption from the SBS-IRB office prior to engaging in any research activities (including publishing any surveys online, pilot testing, etc.).
Access: Available through Social Sciences Computing Services for SSD Faculty and Graduate students. Also available through IT Services.

The Massachusetts Institute of Technology has developed a virtual study platform that allows developmental researchers to conduct interactive online studies with babies and children. Though the platform is currently in beta testing, investigators are encouraged to begin readying their studies. UChicago has signed an institutional agreement with MIT; however, individual researchers must submit the Terms of Use to URA for authorized signature. IRB protocols must be amended to include online activity before data collection commences.
Access: Publicly Available

Gorilla Experiment Builder
Gorilla is an online research platform that allows researchers to run virtual behavioral studies. It was developed with special attention to providing accurate reaction-time data. The included tools cover a range of methods from questionnaires to randomized controlled trials. A code editor is included; however, the graphical interface requires no coding knowledge. The platform integrates with online participant recruitment services, such as Prolific, MTurk, and Sona.
Access: Publicly Available

Digital Archival Research Guide
This guide provides resources for archival research. In recent years, digital archival research in particular has emerged as an important aspect of research, and its overall benefits continue to make it an attractive option. Below we recommend a variety of digital archives.

Potential Benefits

  • Cost-Efficient
    • Large quantities of data available at relatively low cost, especially when factoring in travel and other expenses
  • Unbiased Selection
    • Clear-cut collection procedure
  • Over-inclusive data
    • Large volumes of data can benefit future research

Potential Drawbacks

  • Imperfect Finding Aids
    • Risk of not finding more nuanced information
  • Time-Intensive Digitizing Process
    • Scanning delays and other issues can hold up delivery of materials
  • Overwhelming Amounts of Data
    • Time must be spent curating and making sense of digital files
  • Incomplete Data
    • Collections can be incomplete

Archival Research in the Digital Age
University of Chicago Social Sciences Dialogo article that covers how digitization of archives is transforming the academic research landscape, enabling faculty and students to deliver groundbreaking projects in months rather than years, and debunking prior historical assumptions that were based on incomplete datasets.

Guide to Archival Research
American Historical Association guide from the Graduate and Early Career Committee with suggestions on all aspects of planning archival research.

ABBYY FineReader Software
Optical character recognition software to convert documents to editable and easily searchable text. 

UChicago Library Experts
Subject matter librarians are available for one-on-one consultations to assist researchers, students, and staff with research resources, student success resources, and scholarly communication.

UChicago Library Subject Guides
University of Chicago Library Subject Guides provide a basic overview of the resources available for specific subject areas. These guides serve as a central location for subject-specific resources such as related journals, digital archives, and finding aids.

UChicago Library Special Collections
The Hanna Holborn Gray Special Collections Research Center is the principal repository for and steward of the Library’s rare books, manuscripts, University Archives, and the Chicago Jazz Archives. Its mission is to provide primary sources to stimulate, enrich, and support research, teaching, learning, and administration at the University of Chicago. Special Collections makes these resources available to a broad constituency as part of the University’s engagement with the larger community of scholars and independent researchers.

UChicago Library Text Mining Resources
Text mining is a research technique using computational analysis to uncover patterns in large text-based data sets. Text and data mining is sometimes permitted according to the Library’s license agreements. This guide is a non-exclusive list of resources where the library has secured rights for text and data mining.

Foreign Relations of the United States (FRUS) series
The Foreign Relations of the United States (FRUS) series presents the official documentary historical record of major U.S. foreign policy decisions and significant diplomatic activity.

Central Intelligence Agency (CIA) CREST/FOIA
Since 2000, CIA has installed and maintained an electronic full-text searchable system named CREST (the CIA Records Search Tool). The CREST system is the publicly accessible repository of the subset of CIA records reviewed under the 25-year program in electronic format (manually reviewed and released records are accessioned directly into the National Archives in their original format).

The FOIA Electronic Reading Room is provided as a public service by the Office of the Chief Information Officer’s Information Management Services.

National Security Archive
The Digital National Security Archive is an invaluable online collection of more than 100,000 declassified records documenting historic U.S. policy decisions.

Wilson Center Digital Archive
The Digital Archive contains once-secret documents from governments all across the globe, uncovering new sources and providing fresh insights into the history of international relations and diplomacy.

Gale Declassified Documents Online
U.S. Declassified Documents Online’s greatest value lies in the wealth of facts and insights that it provides in connection with the political, economic, and social conditions of the United States and other countries. Materials as diverse as State Department political analyses, White House confidential file materials, National Security Council policy statements, CIA intelligence memoranda, and much more offer unique insights into the inner workings of the U.S. government and world events in the twentieth and twenty-first centuries.

Gale Archives Unbound
The Archives Unbound program has published more than 300 titles. The roots of the program are in microfilm, and the collection makes available targeted collections of interest to scholars engaged in serious research.

Particular strengths in the Archives Unbound catalog include U.S. foreign policy; U.S. civil rights; global affairs and colonial studies; and modern history. Broad topic clusters include: African American studies; American Indian studies; Asian studies; British history; Holocaust studies; LGBT studies; Latin American and Caribbean studies; Middle East studies; political science; religious studies; and women’s studies. The Archives Unbound program consists of more than 300,000 documents totaling more than 13 million pages. Individual titles in the collection range between 1,200 and 200,000 pages.

ProQuest History Vault
ProQuest History Vault debuted in 2011 and is continuously growing to include numerous archival collections documenting the most important and widely studied topics in eighteenth- through twentieth-century American history.

HathiTrust Digital Library
Founded in 2008, HathiTrust is a not-for-profit collaborative of academic and research libraries preserving 17+ million digitized items. HathiTrust offers reading access to the fullest extent allowable by U.S. copyright law, computational access to the entire corpus for scholarly research, and other emerging services based on the combined collection. HathiTrust members steward the collection—the largest set of digitized books managed by academic and research libraries—under the aims of scholarly, not corporate, interests.

Church of Latter-Day Saints Family Genealogy Sites

Best Practices for Digital Archival Research
  • Contact special collections librarians directly, and be polite and reasonable in your expectations. Archivists are currently overwhelmed with demand for digitization, so expect a backlog of at least one month and be flexible with scanning turnaround times.
  • Consult with archivists on finding aids and secondary sources to utilize in identifying potentially useful records
  • Consider hiring a private researcher and/or trade off private research assistance with researchers at other institutions
    • Example: Create an arrangement in which a graduate student at another institution scans documents for you, while you scan documents at UChicago for them in return.
    • Trading off private research assistance can also have potential cost-saving benefits
  • General Tips
    • Consider asking archivists about hand lists and other non-digitized finding aids
    • Think carefully about the research questions being asked; consider why some materials are digitized and others are not
    • Contact colleagues and students in the field; many people are willing to share archives
    • Weigh benefits of over-/under-inclusivity

General Tips:

  • Make a list of basic databases
  • Determine what parts of the metadata are available for public use
    • Issues can occur when utilizing metadata that is not public, including a possible loss of access
  • Consult lists of commercial vendors to see what other databases exist
    • Commercial vendors can have materials that UChicago does not currently own
  • Ask librarians for a trial period and/or go to your committee to potentially purchase a new database
  • List of older finding aids:
  • Keep a set of the originals in the order received
  • Determine organization scheme (combining or separating digital records)
  • Consider utilizing optical character recognition (OCR) when processing data
    • Recommended Software: ABBYY FineReader
    • Free UChicago Library Text Mining Resources
  • Be deliberate and consistent with naming conventions (date of creation, description of object, number in series or sequential order)
  • Be aware of proprietary restrictions, licensing and usage terms, and copyright when maintaining received data.
  • Consider long-term storage:
    • Be aware that local storage can be redundant
    • Consider the costs of digital storage
    • Conduct fixity checks and be wary of bit rot
    • Plan for the future: what happens when you leave the University?
    • Who will maintain this in the future? Who will be in charge of future platform migration?
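
The naming-convention and fixity-check tips above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended tool: the function names (build_name, write_manifest, verify_manifest), the file name pattern, the manifest layout, and the choice of SHA-256 are all assumptions made for the example. The idea is to record a checksum for every received file once, then recompute the checksums later; any file whose checksum has changed, or that has gone missing, may have been affected by bit rot or accidental edits.

```python
import hashlib
from datetime import date
from pathlib import Path


def build_name(created: date, description: str, sequence: int, suffix: str) -> str:
    """Build a consistent file name: date of creation, description, number in series."""
    slug = "-".join(description.lower().split())
    return f"{created.isoformat()}_{slug}_{sequence:04d}{suffix}"


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder: Path, manifest: Path) -> None:
    """Record one 'checksum  relative/path' line per file found under folder."""
    lines = []
    for path in sorted(folder.rglob("*")):
        # Skip directories and the manifest itself if it lives inside folder.
        if path.is_file() and path.resolve() != manifest.resolve():
            relative = path.relative_to(folder).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(folder: Path, manifest: Path) -> list[str]:
    """Return the relative paths that are missing or whose checksum has changed."""
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative = line.split("  ", 1)
        target = folder / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return problems
```

In this sketch, write_manifest would be run once when the scans arrive, and verify_manifest would be run periodically or after each copy to new storage. Keeping the manifest alongside a second copy of the files makes it possible to restore anything the check flags.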