Need data right now? These sources can get you large, clean datasets quickly!
The FBI's Crime Data Explorer (CDE) aims to provide transparency, create easier access, and expand awareness of criminal and noncriminal law enforcement data sharing; improve accountability for law enforcement; and provide a foundation to help shape public policy, with the result of a safer nation. Use the CDE to discover available data through visualizations and to download data in .csv format and other large data files.
Public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.
A public clearinghouse for information about the City of Rochester. The site features spatial and non-spatial data in a variety of formats including tables, feature layers, applications, story maps, documents, and web maps.
Datasets and codebooks in political science, health and welfare, education and economics.
Large, open-access datasets designed for machine learning projects. Allows searching by number of rows and columns.
The U.S. Census Bureau's data dissemination platform for accessing demographic and economic data.
Published annually by the U.S. Department of Agriculture, Agricultural Statistics provides information on agricultural production, supplies, consumption, facilities, costs, and returns.
Public archive of Census of Agriculture publications published 1945-1987 in PDF format.
The USDA's National Agricultural Statistics Service (NASS) conducts hundreds of surveys every year and prepares reports covering virtually every aspect of U.S. agriculture.
Crops reports from 1940 - 2010, compiled by the Stanislaus County Agricultural Commissioner.
The Census of Agriculture is the leading source of facts and figures about American agriculture. It is the only source of uniform, comprehensive agricultural data for every state and county in the United States.
Artnome has numerous links to large open research datasets on art collections. These include the Museum of Modern Art (MoMA), The Metropolitan Museum of Art, the Tate Collection, Carnegie Museum of Art, Nationalmuseum Stockholm, Museum für Kunst und Gewerbe Hamburg, and Yale Center for British Art.
Music datasets created by CompMusic at Universitat Pompeu Fabra. Datasets are on topics such as Indian Art Music, Turkish Makam Music, and Beijing Opera.
The Getty Provenance Index provides access to archival inventories, sales catalogs, and dealer stock books. There is also a database of collectors’ files, a database on payments to artists, and a database containing descriptions and provenances of paintings created before 1900, which were held in Great Britain and the United States from 1500 to 1990.
The National Archive of Data on Arts and Culture (NADAC) is a repository that facilitates research on arts and culture by acquiring data, particularly those funded by federal agencies and other organizations, and sharing those data with researchers, policymakers, people in the arts and culture field, and the general public.
An open access dataset which includes factual art object information for the 130,000+ artworks and artists at the National Gallery of Art.
In support of open data, the Shelby White & Leon Levy Digital Archives Metadata Dataset and The Complete New York Philharmonic Subscribers Dataset have been made open to the public.
NIST Chemistry WebBook: This site provides chemical data for a wide variety of chemical substances. The database can be searched via chemical name, molecular formula, molecular structure, CAS registry number, and chemical reaction.
Sigma Aldrich Catalog: Sigma Aldrich provides some spectra and chemical information for many of their chemical products. MSDS are also available.
Spectral Database for Organic Compounds (SDBS): Spectra from Japan's National Institute of Advanced Industrial Science and Technology (AIST).
2020 census data available for users who want to download the set of detailed tables for all of the geographies within a state and run their own analysis and rankings. Also see other Census Data organized by Decade.
Includes data on churches and church membership, religious professionals, and religious groups (individuals, congregations and denominations).
Provides access to detailed tables and maps for population, housing, and businesses.
These resources provide State and national statistics on child and family well-being indicators, such as health, child care, education, income, and marriage.
Historical, current, and projected socioeconomic data for the United States, regions, and counties (1970-2050). Find info on population by age and race, employment by industry, earnings of employees by industry, personal income by source, households by income bracket, and retail sales by kind of business.
Primary source of labor force statistics for the population of the United States
Public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.
Major statistical data sources on disability
Military casualty data and Active Duty military/civilian personnel statistics by rank/grade, service totals, service by Region/Country.
Database of over 600,000 emails generated by 158 employees of the Enron Corporation and acquired by the Federal Energy Regulatory Commission during its investigation after the company's collapse.
Data related to development: literacy, health, poverty, income inequality, climate change, crime, population, and more.
Assesses child well-being nationally using 16 key indicators, 1990-present.
Click on "International Statistical Links" to retrieve an A-Z country list with national statistical bureau links attached.
Gathers social and economic information on Mexican-US migration.
Search, customize and download datasets for national and international variables.
The National Center for Children in Poverty (NCCP) is the nation’s leading public policy center dedicated to promoting the economic security, health, and well-being of America’s low-income families and children.
Data and research studies cover all areas of social policy in the UK.
Compendium of tables that provides data on foreign nationals, permanent legal residents, naturalized citizens and maps with various demographic characteristics (1996-present).
Includes multidimensional poverty index and specific summaries on the results of the MPI analyses in 104 developing nations.
PovcalNet is an interactive computational tool that allows you to replicate the calculations made by the World Bank's researchers in estimating the extent of absolute poverty in the world, including the $1 a day poverty measures.
UNDP is the United Nations' global development network, an organization advocating for change and connecting countries to knowledge, experience and resources to help people build a better life.
A compilation of international development indicators and socio-economic data.
Provides a yearly overview of the economic, social, and environmental state of the world. Also provides detailed economic data for most countries, quality of life indicators, and other demographic and environmental information.
Uniform Crime Reports - The Uniform Crime Reports (from the FBI) are produced from data provided by nearly 17,000 law enforcement agencies across the United States.
Contains data on crime and victimization, arrests, dispositions, law enforcement personnel, and more.
Contains statistics for "Criminal Justice Characteristics," "Public Opinion," "Crime, Victims," "Arrests, Seizures," "Courts, Prosecution, Sentencing," and "Parole, Jails, Prisons, Death Penalty."
Contains economic data for EU member states, EU candidate countries and other OECD countries (United States, Japan, Canada, Switzerland, Norway, Iceland, Mexico, Korea, Australia and New Zealand).
Current and historical data for the VIX, a barometer for measuring investor sentiment and market volatility.
Contains several thousand economic time series, produced by a number of U.S. Government agencies and distributed in a variety of formats and media
Access structural and aggregated financial information & quarterly reports on FDIC-insured institutions
USAID data--Explains where U.S. foreign aid is invested in over 100 countries.
A Database of International Business Statistics with over 5000 variables from over 200 countries. Data available from mid-1990s to present. (Free registration is required to access data sets).
Provides information regarding home mortgage lending activity.
Data on IMF lending, exchange rates, trade statistics and other economic and financial indicators.
In-depth economic analyses of the home building industry based on private and government data
Contains useful economic datasets for download
Stats for OECD countries and selected non-member economies.
University of Michigan Index of Consumer Sentiment, as well as surveys of investors, affluent consumers, and regional data.
Contains data on factors such as poverty, income, employment, and health insurance coverage
Interactive national, international, regional economic data or industry statistics.
Databases, Tables & Calculators by Subject for US labor statistics.
Futures and options markets data
Provides a complete historical record of all foreign assistance provided by the United States to the rest of the world.
World Investment Report (annual) covers trends and analysis in foreign direct investment
Data are included in FedStats, but this website concentrates on "...statistics and reports on children and families."
Ed Watch Interactive is a user-friendly source of data on educational performance and equity by race and class, kindergarten through college.
Explore hundreds of measures of well-being for kids across the nation, or in your state, city, or community.
State Profiles presents key data about each state's performance in the National Assessment of Educational Progress (NAEP) in mathematics, reading, writing, and science for grades 4 and 8.
The Nation's Report Card presents the results of the National Assessment of Educational Progress (NAEP), which measures student achievement in the U.S. in various subjects over time.
Collects, analyzes and makes available data related to education in the U.S. and other nations.
Contains data and statistics collected from New York schools and learning support resources.
The New York State Report Cards provide enrollment, demographic, attendance, suspension, dropout, teacher, assessment, accountability, graduation rate, post-graduate plan, career and technical education, and fiscal data for public and charter schools, districts, and the State.
Research and Data on NYC Schools.
TIMSS and PIRLS are international assessments that monitor trends in student achievement in mathematics, science, and reading.
Contains over 1,000 types of indicators and raw data on education, literacy, science and technology, culture and communication for more than 200 countries and territories
Electoral and popular vote results from 1789 to the present for US Presidential Elections.
Contains data on voting, public opinion and political participation. Cumulative, time-series, panel and contextual data are available for download.
The Center for American Women and Politics (CAWP), a unit of the Eagleton Institute of Politics at Rutgers, The State University of New Jersey, is nationally recognized as the leading source of scholarly research and current data about American women’s political participation. This site presents data and analysis of women's voting behavior, including statistics on turnout and the gender gap in voting.
Official vote counts from 1920 to the present for presidential and congressional elections compiled by the Office of the Clerk of the U.S. House of Representatives. This site also contains links to election resources found on the websites of the Census Bureau, National Archives, Federal Election Commission, and state election offices.
Data and analyses of US elections, including data on presidential, congressional, and gubernatorial elections, political parties, campaigns, and demographic information.
Downloadable campaign finance information
A space to share and improve election data. You can create an account and either correct the cataloging information for the studies in this dataverse or upload new data files.
Four ICPSR studies that provide datasets of electoral returns for approx. 90% of all elections to the offices of president, governor, United States senator, and United States representative for all parties and candidates 1788-1990. Most returns are at the county level.
A list of online and print resources that contain U.S. election statistics for both federal and state elections.
The Office of the Federal Register at the National Archives coordinates the functions of the Electoral College on behalf of the Archivist of the United States, the States, and the Congress. This site contains the electoral votes and popular votes from 1789 to the present.
Provides contribution and lobbying datasets in American politics
Use a keyword search for elections
Voting and Registration data have been collected biennially by the U.S. Census Bureau in the November Current Population Survey (CPS). The statistics presented on this website are based on replies to survey inquiries about whether individuals were registered and/or voted in specific national elections. For the purpose of these estimates, election types are considered to be either congressional or presidential.
Dr. Michael McDonald, an Associate Professor at George Mason University, provides national and state voter turnout statistics from 1980 to the present.
Contains approximately 1100 elections from 70+ countries at a constituency level for lower house legislative elections. Includes votes received by each candidate/party, total votes cast, number of eligible voters, and seat figures where available. Available in Stata, SPSS, and raw data formats. Dates vary, but coverage generally runs from 1945 onward.
Build data sets on national and subnational elections around the world 1940s-2010s. Accessible in multiple formats: spreadsheets, tables and GIS maps.
Information on international elections: Subnational elections of high interest; Political parties and candidates; Referendum provisions; News on election-related laws and developments around the world; Political institutions and electoral systems; Election results and voter turnout.
BP, one of the world's largest energy companies, publishes statistical reviews of world energy, projections and historical data.
Data on environmentally sustainable energy programs in developing countries.
Energy statistics by country
Monthly data for the biggest oil producing and consuming countries.
A program of the U.S. Department of Education's National Center for Education Statistics, this site provides annual and national statistics for all public elementary and secondary schools, and school districts across the U.S. Data can be located under "Quick Facts", "Data", or by searching for a specific school or school district under "School/District Locator." Fiscal and Nonfiscal reports, and working papers can be located under "Publications."
Contains both experimental and evaluated nuclear data, including nuclear reaction (the properties of interacting nuclei, e.g. cross sections) and nuclear structure (the properties of single nuclei) data.
A journal covering the global energy market (1998-present).
Company-level data on the supply and disposition of natural gas in the United States, electric power data collected by surveys, international energy statistics, energy country profiles for 217 countries, state and territory energy profiles for the U.S., financial data collected from major energy producers, short-term and historical energy outlook data & projections, and real energy prices.
Provides energy data for more than 215 countries, areas and regions on the production, trade and intermediate and final consumption for primary and secondary conventional, non-conventional and new and renewable sources of energy from 1990 onwards.
Provides data sets on energy, climate, forests, water, & sustainability.
A joint BOEM and NOAA initiative providing authoritative data to meet the needs of the offshore energy and marine planning communities.
This section of this Library Guide was developed using a template from SUNY Geneseo, originally created by Bonnie Swoger and Justina Elmore. It is licensed under a Creative Commons Attribution 4.0 International License. Any part of it may be used as long as credit is included. Derivative works can be licensed under Creative Commons to encourage sharing and reuse of educational materials.
Public health statistics
WONDER online databases utilize a rich ad-hoc query system for the analysis of public health data. Reports and other query systems are also available.
Datasets, tools, and applications gathered from agencies across the Federal government
Digital Repository is a curated resource that makes research data discoverable, freely reusable, and citable.
A centralized and comprehensive source of information and analyses on global health R&D activities for human disease.
Data from 219 countries and areas on the prevalence of HIV infection and AIDS cases and deaths.
Contains current data from surveys such as the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), birth and mortality detail files, National Immunization Survey, Longitudinal Study of Aging, and National Survey of Family Growth (NSFG)
NIH-supported data repositories that make data accessible for reuse
The PCS is conducted every two years, and collects demographic, clinical, and service-related information for each person who receives a public mental health service during a specified one-week period.
Estimates of life expectancy at birth—the average number of years a person can expect to live—for most of the census tracts in the United States for the period 2010-2015.
This archive offers free access to original historical linguistic data from 400 indigenous languages. Registration is required to download files.
Includes ZIP files of the British National Bibliography, as well as Open RDF/XML data on the British National Bibliography, British Library Integrated Catalogue (Books), and British Library printed music. Finally, there are numerous history datasets (e.g., Black History Month, Magna Carta, Women’s rights), literature datasets (datasets focused on famous authors), music datasets (e.g., History of Music), and other datasets (e.g., National parks, Theology).
Created by Harvard University’s Library Innovation Lab, the Caselaw Access Project (CAP) offers 360 years of United States caselaw, with access through an API and a Bulk Data Service.
A collection of digital humanities resources curated by Alan Liu. Includes collections and datasets.
A collection of fully searchable, SGML/XML-encoded texts corresponding to Early English Books Online. Data created through a partnership between ProQuest and more than 150 libraries.
The END project generates high-quality metadata about novels published between 1660 and 1850 in order to make early works of fiction more available to both traditional and computational modes of humanistic study.
A collaborative effort by Northwestern University and Washington University in St. Louis to transform the early English print record, from 1473 to the early 1700s, into a linguistically annotated and deeply searchable text corpus.
GDELT monitors print, broadcast, and web news media in over 100 languages from across every country in the world to keep continually updated on breaking developments everywhere on the planet.
Metadata representing the curated collection of HathiTrust volumes. Datasets include NovelTM Datasets for English-Language Fiction, 1700-2009, Word Frequencies in English-Language Literature, 1700-1922, and Geographic Locations in English-Language Literature, 1701-2011.
A collection of publicly available humanities datasets.
This peer-reviewed journal describes humanities research objects with high potential for reuse. These might include curated resources like (annotated) linguistic corpora, ontologies, and lexicons, as well as databases, maps, atlases, linked data objects, and other datasets created with qualitative, quantitative, or computational methods.
Derivative datasets of the Library of Congress Web Archives. Includes, but is not limited to, datasets on the United States Elections Web Archive, the American Folklife Center’s Web Cultures Web Archive, and the Webcomics Web Archive.
A growing repository of datasets related to culture and the humanities. Notable datasets include African American Literature, Colonial South Asian Literature, txtLAB’s Multilingual Novels, U.S. Inaugural Addresses, and Refugee Arrivals to the U.S. from 2005-2015.
List of natural language processing datasets which can be found on GitHub.
A repository of full-text literary and linguistic resources. Includes thousands of texts in more than 25 languages.
A data project focused on applying modern data analysis techniques to great texts in the history of philosophy. The current corpus has over 50 texts and 30 authors.
TalkBank is the world’s largest open access integrated repository for spoken language data. It provides language corpora and other audio resources to support researchers in Psychology, Linguistics, Education, Computer Science, and Speech Pathology.
This archive possesses recordings from more than 200 languages.
Created by the World History Center at the University of Pittsburgh, the World-Historical Dataverse possesses links to many datasets connected to history all over the world.
International labor statistics, standards, key indicators of the labor market, labor force surveys, safety, work conditions, child labor, and labor legislation.
Information on international trade and economic development trends, markets, and labor force.
Provides international labor force and wage information and other economic statistics (1990-present).
Contains international policies and data on employment, health, families and children, pension systems, international migration and other social policies and data. Downloadable files in Excel, CSV, PC-axis, or XML.
Database covers four key elements of modern political economies in advanced capitalist societies: Institutional Characteristics of Trade Unions, Wage Setting, State Intervention, and Social Pacts in 49 countries between 1960 and 2010.
Provides detailed, comparative tables for social protection systems for 31 countries. Areas covered include financing, healthcare, sickness, maternity, invalidity, old-age, survivors, employment injuries and occupational diseases, family, unemployment, guaranteed minimum resources and long-term care.
Highlights the principal features of social security programs in more than 170 countries. Public use data downloadable as TXT or SAS zip files.