Title: | A Comprehensive Collection of U.S. Datasets |
---|---|
Description: | Provides a diverse collection of U.S. datasets encompassing various fields such as crime, economics, education, finance, energy, healthcare, and more. It serves as a valuable resource for researchers and analysts seeking to perform in-depth analyses and derive insights from U.S.-specific data. |
Authors: | Renzo Caceres Rossi [aut, cre] |
Maintainer: | Renzo Caceres Rossi <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-12-08 07:16:11 UTC |
Source: | CRAN |
The dataset name has been changed to 'acs12_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble data frame, helping to differentiate it from other datasets within the package. The original content of the dataset has not been modified in any way.
data(acs12_tbl_df)
A tibble with 2,000 observations and 13 variables:
Income of individuals (integer).
Employment status (factor with 3 levels).
Number of hours worked per week (integer).
Race of individuals (factor with 4 levels).
Age of individuals (integer).
Gender of individuals (factor with 2 levels: "male", "female").
Citizenship status (factor with 2 levels: "no", "yes").
Time taken to travel to work in minutes (integer).
Primary language spoken at home (factor with 2 levels: "english", "other").
Marital status (factor with 2 levels: "no", "yes").
Educational attainment (factor with 3 levels).
Disability status (factor with 2 levels).
Birth quarter of individuals (factor with 4 levels).
American Community Survey, 2012.
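The suffix convention described in this entry can be checked programmatically. The sketch below is only an illustration: `parse_dataset_name` is a hypothetical helper, not a function exported by 'usdatasets', and the guarded block at the end assumes the package is installed.

```r
# Hypothetical helper (not part of 'usdatasets'): split a dataset name into
# its base name and the class implied by its suffix.
parse_dataset_name <- function(name) {
  if (grepl("_tbl_df$", name)) {
    list(base = sub("_tbl_df$", "", name), class = "tibble")
  } else if (grepl("_df$", name)) {
    list(base = sub("_df$", "", name), class = "data.frame")
  } else {
    list(base = name, class = NA_character_)
  }
}

parse_dataset_name("acs12_tbl_df")   # base "acs12", class "tibble"
parse_dataset_name("airquality_df")  # base "airquality", class "data.frame"

# Load and inspect the dataset only when the package is available.
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("acs12_tbl_df", package = "usdatasets")
  str(acs12_tbl_df)
}
```

The `_tbl_df` pattern is tested first because every name ending in `_tbl_df` also ends in `_df`.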
The dataset name has been changed to 'age_at_mar_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble data frame, helping to differentiate it from other datasets within the package. The original content of the dataset has not been modified in any way.
data(age_at_mar_tbl_df)
A tibble with 5,534 observations and 1 variable:
Age at first marriage (integer).
United States Census Data.
The dataset name has been changed to 'airlines_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble, helping to differentiate it from other datasets within the package. The original content of the dataset has not been modified in any way.
data(airlines_tbl_df)
A tibble with 16 observations and 2 variables:
Carrier code (character) representing the airline.
Name of the airline (character).
U.S. Department of Transportation.
The dataset name has been changed to 'airports_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble, helping to differentiate it from other datasets within the package. The original content of the dataset has not been modified in any way.
data(airports_tbl_df)
A tibble with 1,458 observations and 8 variables:
FAA airport code (character).
Name of the airport (character).
Latitude of the airport (numeric).
Longitude of the airport (numeric).
Altitude of the airport (numeric).
Time zone (numeric).
Daylight saving time flag (character).
Time zone name (character).
U.S. Federal Aviation Administration (FAA).
The dataset name has been changed to 'airquality_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'df' identifies the dataset as a data frame, helping to differentiate it from other datasets within the package. The original content of the dataset has not been modified in any way.
data(airquality_df)
A data frame with 153 observations and 6 variables:
Ozone concentration (parts per billion), ranging from 1 to 168.
Solar radiation (Langleys).
Wind speed (miles per hour).
Temperature (degrees Fahrenheit).
Month of the observation (integer from 5 to 9).
Day of the observation (integer from 1 to 31).
United States Environmental Protection Agency (EPA).
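As a minimal sketch of working with this data frame, the example below averages ozone by month. It assumes the columns keep the names used by base R's `airquality` (`Ozone`, `Month`), which this listing does not show, and it falls back to `datasets::airquality` when 'usdatasets' is not installed. `load_airquality` is a hypothetical helper.

```r
# Hypothetical loader: prefer the packaged copy, otherwise use base R's
# airquality, which the entry describes as unmodified.
load_airquality <- function() {
  if (requireNamespace("usdatasets", quietly = TRUE)) {
    env <- new.env()
    data("airquality_df", package = "usdatasets", envir = env)
    env$airquality_df
  } else {
    datasets::airquality
  }
}

aq <- load_airquality()

# The formula interface of aggregate() drops rows with a missing Ozone value
# (na.action = na.omit) before computing the monthly means.
monthly_ozone <- aggregate(Ozone ~ Month, data = aq, FUN = mean)
print(monthly_ozone)
```

The result has one row per month from May to September.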
The dataset name has been changed to 'ames_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(ames_tbl_df)
A tibble with 2,930 observations and 82 variables:
Row number in the dataset.
Parcel Identifier.
Total house area in square feet.
Sale price of the house.
Building class type.
Zoning classification of the property.
Lot frontage length in feet.
Total lot area in square feet.
Street type access to the property.
Alley type access.
Shape of the lot.
Land contour around the property.
Availability of utilities.
Lot configuration.
Slope of the land.
Neighborhood in Ames.
Proximity to a main road or railroad.
Proximity to a second main road or railroad, if present.
Type of building.
Architectural style of the house.
Overall quality of the materials and finish.
Overall condition of the house.
Year the house was built.
Year of the last remodel or addition.
Roof style.
Roof material.
Primary exterior material.
Secondary exterior material.
Masonry veneer type.
Masonry veneer area in square feet.
Exterior material quality.
Condition of the exterior material.
Type of foundation.
Basement quality.
Basement condition.
Basement exposure to the outside.
Type of the first finished basement area.
Area of the first finished basement type in square feet.
Type of the second finished basement area.
Area of the second finished basement type in square feet.
Unfinished basement area in square feet.
Total basement area in square feet.
Type of heating system.
Heating system quality.
Presence of central air conditioning.
Type of electrical system.
First floor area in square feet.
Second floor area in square feet.
Low-quality finished area in square feet.
Number of full bathrooms in the basement.
Number of half bathrooms in the basement.
Number of full bathrooms above ground.
Number of half bathrooms above ground.
Number of bedrooms above ground.
Number of kitchens above ground.
Kitchen quality.
Total number of rooms above ground.
Functionality of the house.
Number of fireplaces.
Fireplace quality.
Type of garage.
Year the garage was built.
Garage finish type.
Number of cars the garage can accommodate.
Garage area in square feet.
Garage quality.
Garage condition.
Indicates whether the driveway is paved.
Wood deck area in square feet.
Open porch area in square feet.
Enclosed porch area in square feet.
Three-season porch area in square feet.
Screened porch area in square feet.
Pool area in square feet.
Pool quality.
Type of fence.
Miscellaneous features of the property.
Value of miscellaneous features.
Month the house was sold.
Year the house was sold.
Type of sale.
Condition of the sale.
Ames Housing Dataset, provided by Dean De Cock
The dataset name has been changed to 'births_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(births_tbl_df)
A tibble with 150 observations and 9 variables:
Age of the father (in years).
Age of the mother (in years).
Number of weeks of pregnancy.
Indicates if the baby is premature (factor: yes/no).
Number of prenatal visits.
Weight gained by the mother during pregnancy (in pounds).
Birth weight of the baby (in pounds).
Sex of the baby (factor: male/female).
Indicates if the mother smoked during pregnancy (factor: yes/no).
National Vital Statistics Reports
The dataset name has been changed to 'births14_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(births14_tbl_df)
A tibble with 1,000 observations and 13 variables:
Age of the father (in years).
Age of the mother (in years).
Indicates if the mother is mature (yes/no).
Number of weeks of pregnancy.
Indicates if the baby is a premature birth (yes/no).
Number of prenatal visits.
Weight gained by the mother during pregnancy (in pounds).
Birth weight of the baby (in pounds).
Indicates if the baby is of low birth weight (yes/no).
Sex of the baby (male/female).
Maternal smoking habits (yes/no).
Marital status of the mother (married/single).
Indicates if the mother is white (yes/no).
National Vital Statistics Reports
The dataset name has been changed to 'Boston_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix '_df' identifies the dataset as a data frame. The original content of the dataset has not been modified in any way.
data(Boston_df)
A data frame with 506 observations and 14 variables:
Per capita crime rate by town.
Proportion of residential land zoned for lots over 25,000 sq. ft.
Proportion of non-retail business acres per town.
Charles River dummy variable (1 if tract bounds river; 0 otherwise).
Nitric oxides concentration (parts per 10 million).
Average number of rooms per dwelling.
Proportion of owner-occupied units built prior to 1940.
Weighted distances to five Boston employment centers.
Index of accessibility to radial highways.
Full-value property tax rate per $10,000.
Pupil-teacher ratio by town.
1000(Bk - 0.63)^2 where Bk is the proportion of Black residents by town.
Percentage of the population classified as lower status.
Median value of owner-occupied homes in $1000s.
Boston Housing Data
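The transformed variable 1000(Bk - 0.63)^2 listed above is not one-to-one, which matters when interpreting it. The sketch below only evaluates the stated formula; `bk_index` and `bk_candidates` are hypothetical helper names, and the input proportions are chosen for illustration rather than taken from the data.

```r
# The transform given in the entry: 1000 * (Bk - 0.63)^2
bk_index <- function(bk) 1000 * (bk - 0.63)^2

bk_index(0.63)           # 0, the minimum of the transform
bk_index(c(0.53, 0.73))  # both approximately 10

# Inverting the transform yields two candidate proportions per index value.
bk_candidates <- function(index) 0.63 + c(-1, 1) * sqrt(index / 1000)

bk_candidates(10)        # approximately 0.53 and 0.73
```

Because two different proportions map to the same index, the original proportion cannot be recovered uniquely from this column.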
The dataset name has been changed to 'Cars93_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix '_df' identifies the dataset as a data frame. The original content of the dataset has not been modified in any way.
data(Cars93_df)
A data frame with 54 observations and 6 variables:
Type of the car (factor with 3 levels).
Price of the car (in US dollars).
Miles per gallon in the city.
Drive train type (factor with 3 levels).
Number of passengers the car can accommodate.
Weight of the car (in pounds).
1993 Cars Data
The dataset name has been changed to 'census_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(census_tbl_df)
A tibble with 500 observations and 8 variables:
Year of the census (integer).
FIPS code for the state (factor with 47 levels).
Total family income (in US dollars).
Age of the individual (in years).
Sex of the individual (factor: male/female).
General race category (factor with 8 levels).
Marital status of the individual (factor with 6 levels).
Total personal income (in US dollars).
US Census Bureau
The dataset name has been changed to 'cia_factbook_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(cia_factbook_tbl_df)
A tibble with 259 observations and 11 variables:
Name of the country (factor with 259 levels).
Total area of the country (in square kilometers).
Birth rate (number of live births per 1,000 people).
Death rate (number of deaths per 1,000 people).
Infant mortality rate (number of deaths of infants under one year old per 1,000 live births).
Number of internet users (in millions).
Life expectancy at birth (in years).
Maternal mortality rate (number of maternal deaths per 100,000 live births).
Net migration rate (number of migrants per 1,000 people).
Total population of the country.
Population growth rate (percentage).
CIA World Factbook
The dataset name has been changed to 'cle_sac_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(cle_sac_tbl_df)
A tibble with 500 observations and 8 variables:
Year of the observation (integer).
State of the observation (factor with 2 levels).
City of the observation (character).
Age of the individual (integer).
Sex of the individual (factor with 2 levels).
Race of the individual (character).
Marital status of the individual (character).
Personal income of the individual (integer).
Cleveland Study
The dataset name has been changed to 'county_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(county_tbl_df)
A tibble with 3,142 observations and 15 variables:
Name of the county.
State in which the county is located (factor with 51 levels).
Population of the county in the year 2000.
Population of the county in the year 2010.
Population of the county in the year 2017.
Percent change in population from 2010 to 2017.
Poverty rate in the county.
Rate of homeownership in the county.
Percentage of multi-unit housing.
Unemployment rate in the county.
Indicates if the county is in a metropolitan area (factor with 2 levels).
Median education level in the county (factor with 4 levels).
Per capita income in the county.
Median household income in the county.
Indicates if there is a smoking ban in place (factor with 3 levels).
United States Census Bureau
The dataset name has been changed to 'env_regulation_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(env_regulation_tbl_df)
A tibble with 705 observations and 1 variable:
Environmental regulation statement (character).
Environmental Regulation Study
The dataset name has been changed to 'fcid_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(fcid_tbl_df)
A tibble with 100 observations and 2 variables:
Height of the individual (numeric).
Number of adults in the household (integer).
Family Characteristics and Income Study
The dataset name has been changed to 'goog_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(goog_tbl_df)
A tibble with 98 observations and 7 variables:
Date of the stock price observation (factor with 98 levels).
Opening price of the stock (numeric).
Highest price during the trading session (numeric).
Lowest price during the trading session (numeric).
Closing price of the stock (numeric).
Number of shares traded (integer).
Adjusted closing price of the stock (numeric).
Google Stock Market Data
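Because the date column is stored as a factor, it usually needs converting before any time-based work. The sketch below uses made-up date strings in an assumed `YYYY-MM-DD` format; the actual format in 'goog_tbl_df' is not stated in this entry and may differ.

```r
# Hypothetical values for illustration only, not taken from goog_tbl_df.
date_factor <- factor(c("2009-06-01", "2009-06-02", "2009-06-03"))

# Converting a factor straight to integer returns level codes, not dates.
as.integer(date_factor)  # 1 2 3

# Go through character first, then parse with an explicit format.
dates <- as.Date(as.character(date_factor), format = "%Y-%m-%d")

class(dates)  # "Date"
diff(dates)   # gaps between consecutive observations, in days
```

Passing `format` explicitly makes a mismatch visible: strings that do not fit the pattern become `NA` instead of being parsed incorrectly.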
The dataset name has been changed to 'govrace10_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(govrace10_tbl_df)
A tibble with 37 observations and 23 variables:
Identification number (numeric).
State name (character).
State abbreviation (character).
Name of the first candidate (character).
Percentage of votes for the first candidate (numeric).
Political party of the first candidate (character).
Number of votes for the first candidate (numeric).
Name of the second candidate (character).
Percentage of votes for the second candidate (numeric).
Political party of the second candidate (character).
Number of votes for the second candidate (numeric).
Name of the third candidate (character).
Percentage of votes for the third candidate (numeric).
Political party of the third candidate (character).
Number of votes for the third candidate (numeric).
Name of the fourth candidate (character).
Percentage of votes for the fourth candidate (numeric).
Political party of the fourth candidate (character).
Number of votes for the fourth candidate (numeric).
Name of the fifth candidate (character).
Percentage of votes for the fifth candidate (numeric).
Political party of the fifth candidate (character).
Number of votes for the fifth candidate (numeric).
2010 Gubernatorial Races
The dataset name has been changed to 'homicides15_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(homicides15_tbl_df)
A tibble with 1,922 observations and 15 variables:
Unique identifier (integer).
City name where the homicide occurred (character).
Offense code (character).
Type of offense (character).
Date of the homicide (POSIXct).
Location address of the homicide (character).
Longitude of the homicide location (numeric).
Latitude of the homicide location (numeric).
Type of location where the homicide occurred (character).
Category of the location (character).
FIPS code of the state (integer).
FIPS code of the county (character).
Census tract where the homicide occurred (character).
Block group number (integer).
Block number (integer).
2015 Homicides Data
The dataset name has been changed to 'house_tbl_df' to avoid confusion with other packages in the R ecosystem from which datasets have been sourced. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and assists users in identifying its specific characteristics. The suffix 'tbl_df' identifies the dataset as a tibble. The original content of the dataset has not been modified in any way.
data(house_tbl_df)
A tibble with 116 observations and 12 variables:
Congress number (numeric).
Starting year of the congress (numeric).
Ending year of the congress (numeric).
Total number of seats in the House of Representatives (numeric).
Abbreviation of the first party (character).
Number of seats for the first party (numeric).
Abbreviation of the second party (character).
Number of seats for the second party (numeric).
Number of seats for other parties (numeric).
Number of vacant seats (numeric).
Number of delegate seats (numeric).
Number of resident commissioner seats (numeric).
Historical House of Representatives Data
The dataset name has been changed to 'houserace10_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(houserace10_tbl_df)
A tibble with 435 observations and 24 variables:
Unique race identifier (numeric).
Name of the state (character).
State abbreviation (character).
District number (numeric).
Name of the first candidate (character).
Percentage of votes for the first candidate (numeric).
Party affiliation of the first candidate (character).
Number of votes for the first candidate (numeric).
Name of the second candidate (character).
Percentage of votes for the second candidate (numeric).
Party affiliation of the second candidate (character).
Number of votes for the second candidate (numeric).
Name of the third candidate (character).
Percentage of votes for the third candidate (numeric).
Party affiliation of the third candidate (character).
Number of votes for the third candidate (numeric).
Name of the fourth candidate (character).
Percentage of votes for the fourth candidate (numeric).
Party affiliation of the fourth candidate (character).
Number of votes for the fourth candidate (numeric).
Name of the fifth candidate (character).
Percentage of votes for the fifth candidate (numeric).
Party affiliation of the fifth candidate (character).
Number of votes for the fifth candidate (numeric).
2010 U.S. House of Representatives Election Data
The dataset name has been changed to 'immigration_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(immigration_tbl_df)
A tibble with 910 observations and 2 variables:
Factor indicating the response to immigration-related questions, with 4 levels.
Factor indicating the political alignment associated with the responses, with 3 levels.
Data from surveys on immigration attitudes
The dataset name has been changed to 'leg_mari_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(leg_mari_tbl_df)
A tibble with 119 observations and 1 variable:
Factor indicating responses related to legal marijuana, with 2 levels.
Data from surveys on attitudes towards legal marijuana
The dataset name has been changed to 'marathon_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(marathon_tbl_df)
A tibble with 59 observations and 3 variables:
Integer indicating the year of the marathon event.
Factor indicating the gender of the participants, with 2 levels.
Numeric value representing the marathon completion time in hours.
Data from marathon event results
The dataset name has been changed to 'military_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(military_tbl_df)
A tibble with an unspecified number of observations and 6 variables:
Factor indicating the military grade, with 3 levels.
Factor indicating the branch of the military, with 4 levels.
Factor indicating the gender of the participants, with 2 levels.
Factor indicating the race of the participants, with 7 levels.
Logical indicating whether the participants identify as Hispanic.
Integer representing the rank of the participants.
Data from military personnel demographics
The dataset name has been changed to 'minn38_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a data frame. The original content of the dataset has not been modified.
data(minn38_df)
A data frame with 168 observations and 5 variables:
Factor indicating the high school rank, with 3 levels.
Factor indicating the post-high school status, with 4 levels.
Factor indicating the father's occupational level, with 7 levels.
Factor indicating the gender of the participants, with 2 levels.
Integer giving the frequency count for each combination of the factor levels.
Data from the Minnesota 1938 study
The dataset name has been changed to 'mlb_players_18_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(mlb_players_18_tbl_df)
A tibble with 1,270 observations and 19 variables:
Character string representing the name of the player.
Character string indicating the team the player belongs to.
Character string indicating the position played by the player.
Integer representing the number of games played.
Integer indicating the number of at-bats.
Integer representing the number of runs scored.
Integer representing the number of hits.
Integer indicating the number of doubles hit.
Integer indicating the number of triples hit.
Integer representing the number of home runs hit.
Integer indicating the number of runs batted in.
Integer indicating the number of walks received.
Integer indicating the number of strikeouts.
Integer representing the number of stolen bases.
Integer indicating the number of times caught stealing.
Numeric representing the batting average.
Numeric representing the on-base percentage.
Numeric representing the slugging percentage.
Numeric representing the on-base plus slugging percentage.
Data from Major League Baseball (MLB) player statistics for the 2018 season
The dataset name has been changed to 'mn_police_use_of_force_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a data frame. The original content of the dataset has not been modified.
data(mn_police_use_of_force_df)
A data frame with 12,925 observations and 13 variables:
Character string representing the date and time of the response.
Character string describing the nature of the problem.
Character string indicating whether the incident was initiated by a 911 call.
Character string indicating the primary offense involved in the incident.
Character string describing the injuries sustained by the subject, if any.
Character string describing the type of force used by the police.
Character string describing the specific actions related to the use of force.
Character string indicating the race of the subject involved in the incident.
Character string indicating the sex of the subject.
Integer representing the age of the subject.
Character string describing the type of resistance offered by the subject.
Character string indicating the precinct in which the incident occurred.
Character string representing the neighborhood where the incident occurred.
Data from police use of force reports in Minnesota
The dataset name has been changed to 'nba_players_19_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(nba_players_19_tbl_df)
A tibble with 494 observations and 7 variables:
Character string representing the player's first name.
Character string representing the player's last name.
Character string indicating the name of the team.
Character string representing the team's abbreviation.
Character string indicating the player's position on the team.
Character string representing the player's jersey number.
Numeric value representing the player's height.
Data from NBA players' statistics in 2019
The dataset name has been changed to 'ncbirths_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(ncbirths_tbl_df)
A tibble with 1,000 observations and 13 variables:
Integer representing the father's age.
Integer representing the mother's age.
Factor with 2 levels indicating if the mother is mature (>=35 years).
Integer representing the number of gestation weeks.
Factor with 2 levels indicating if the baby was born prematurely.
Integer representing the number of prenatal visits.
Factor with 2 levels indicating the marital status of the mother.
Integer representing the mother's weight gain during pregnancy (in pounds).
Numeric value representing the baby's birth weight (in pounds).
Factor with 2 levels indicating if the baby was born with low birth weight.
Factor with 2 levels indicating the baby's gender.
Factor with 2 levels indicating if the mother has a smoking habit.
Factor with 2 levels indicating if the mother is white.
Data from birth records in North Carolina
The dataset name has been changed to 'nyc_marathon_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(nyc_marathon_tbl_df)
A tibble with 102 observations and 7 variables:
Numeric value representing the year the marathon took place.
Character value representing the name of the runner.
Character value indicating the country of origin of the runner.
Time variable in 'hms' format representing the finish time of the runner.
Numeric value representing the finish time of the runner in hours.
Character value indicating the division (category) the runner participated in.
Character value containing additional notes, if any, about the runner or the race.
Data from the New York City Marathon records
The dataset name has been changed to 'nycvehiclethefts_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(nycvehiclethefts_tbl_df)
A tibble with 35,746 observations and 9 variables:
Integer value representing a unique identifier for each vehicle theft incident.
Character value representing the single date of the theft incident.
Character value representing the start date of the theft incident.
Character value representing the end date of the theft incident.
Numeric value indicating the longitude where the incident occurred.
Numeric value indicating the latitude where the incident occurred.
Character value representing the type of location where the theft took place.
Character value indicating the category of the location.
Character value indicating the census block where the incident took place.
Data from the New York City Vehicle Thefts records
The dataset name has been changed to 'offshore_drilling_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(offshore_drilling_tbl_df)
A tibble with 828 observations and 2 variables:
Factor with 4 levels, representing different responses or categories related to offshore drilling.
Factor with 3 levels, representing secondary categories or classifications related to the responses in v1.
Data related to offshore drilling opinions or classifications
The dataset name has been changed to 'orings_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(orings_tbl_df)
A tibble with 23 observations and 4 variables:
Integer representing the mission number.
Integer representing the launch temperature in Fahrenheit.
Integer representing the number of damaged O-rings in the mission.
Numeric representing the number of undamaged O-rings in the mission.
Data from NASA missions related to O-ring performance.
The dataset name has been changed to 'oscars_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(oscars_tbl_df)
A tibble with 184 observations and 11 variables:
Numeric indicating the Oscar number.
Numeric representing the year the Oscar was awarded.
Character string indicating the category of the award.
Character string with the name of the recipient.
Character string indicating the movie for which the award was given.
Numeric indicating the age of the recipient at the time of the award.
Character string indicating the birthplace of the recipient.
Date representing the birthdate of the recipient.
Numeric indicating the birth month.
Numeric indicating the birth day.
Numeric indicating the birth year.
Data from historical Oscar award records.
The dataset name has been changed to 'piracy_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(piracy_tbl_df)
A tibble with 534 observations and 8 variables:
Character string indicating the name of the politician.
Factor with 3 levels representing the politician's party affiliation.
Factor with 50 levels indicating the U.S. state the politician represents.
Numeric representing the amount of pro-piracy funding received.
Numeric representing the amount of anti-piracy funding received.
Integer indicating the number of years in office.
Factor with 5 levels indicating the politician's stance on piracy.
Factor with 2 levels indicating the chamber of the U.S. Congress (House or Senate).
Data on political stances and funding related to piracy.
The dataset name has been changed to 'precip_numeric' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a numeric vector. The original content of the dataset has not been modified.
data(precip_numeric)
A numeric vector with 70 observations representing average annual precipitation (in inches) for various cities in the United States.
Numeric value representing the average annual precipitation in Mobile.
Numeric value representing the average annual precipitation in Juneau.
Numeric value representing the average annual precipitation in Phoenix.
Numeric value representing the average annual precipitation in Los Angeles.
Additional cities included in the dataset.
Data on precipitation for various U.S. cities.
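Because the object is a named numeric vector, cities can be looked up by name. A minimal sketch follows; the fallback to base R's `precip` is an assumption (it relies on the note that the content is unmodified) and is used only when 'usdatasets' is not installed.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `precip` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("precip_numeric", package = "usdatasets")
} else {
  precip_numeric <- datasets::precip
}

# Look up individual cities by name.
precip_numeric[c("Mobile", "Juneau", "Phoenix")]

# Five driest and five wettest cities in the vector.
driest  <- head(sort(precip_numeric), 5)
wettest <- head(sort(precip_numeric, decreasing = TRUE), 5)
driest
wettest
```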
The dataset name has been changed to 'presidents_ts' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a time series object. The original content of the dataset has not been modified.
data(presidents_ts)
A time series object with 120 observations, covering quarterly data from 1945 to 1974. Each observation represents the president's approval rating during a given quarter. The data is structured as follows:
Numeric values representing the approval ratings for the first quarter.
Numeric values representing the approval ratings for the second quarter.
Numeric values representing the approval ratings for the third quarter.
Numeric values representing the approval ratings for the fourth quarter.
Data on presidential approval ratings from 1945 to 1974.
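The quarterly structure can be inspected with the usual time series helpers. This is a sketch only; the fallback to base R's `presidents` is an assumption used when 'usdatasets' is not installed.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `presidents` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("presidents_ts", package = "usdatasets")
} else {
  presidents_ts <- datasets::presidents
}

start(presidents_ts)      # first (year, quarter) in the series
end(presidents_ts)        # last (year, quarter) in the series
frequency(presidents_ts)  # observations per year

# Some quarters are missing (NA), so summaries need na.rm = TRUE.
n_missing   <- sum(is.na(presidents_ts))
mean_rating <- mean(presidents_ts, na.rm = TRUE)

# Restrict the series to a sub-period with window().
fifties <- window(presidents_ts, start = c(1950, 1), end = c(1959, 4))
```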
The dataset name has been changed to 'prrace08_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(prrace08_tbl_df)
A tibble with 51 observations and 7 variables:
Factor giving the abbreviation of the U.S. state (including Washington D.C.) where the election took place.
Factor providing the full name of the U.S. state corresponding to the abbreviation.
Integer representing the number of votes received by Barack Obama in the state.
Numeric representing the percentage of total votes received by Barack Obama in the state.
Integer representing the number of votes received by John McCain in the state.
Numeric representing the percentage of total votes received by John McCain in the state.
Integer indicating the number of electoral votes allocated to the state.
Data on the 2008 U.S. presidential race results by state.
The dataset name has been changed to 'road_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a data frame. The original content of the dataset has not been modified.
data(road_df)
A data frame with 26 observations and 6 variables:
Integer indicating the number of road deaths.
Integer representing the number of drivers (in 10,000s).
Numeric indicating the population density (people per square mile).
Numeric indicating the length of rural roads (in thousands of miles).
Integer representing the average daily maximum temperature in January.
Numeric indicating the fuel consumption (in 10,000,000 US gallons per year).
Data on road safety statistics, including deaths, drivers, population density, and environmental factors.
The dataset name has been changed to 'senaterace10_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(senaterace10_tbl_df)
A tibble with 38 observations and 23 variables:
Numeric identifier for the election race.
Character string indicating the U.S. state where the election took place.
Character string representing the state abbreviation.
Character string indicating the name of the first candidate.
Numeric indicating the percentage of votes received by the first candidate.
Character string indicating the party affiliation of the first candidate.
Numeric indicating the total votes received by the first candidate.
Character string indicating the name of the second candidate.
Numeric indicating the percentage of votes received by the second candidate.
Character string indicating the party affiliation of the second candidate.
Numeric indicating the total votes received by the second candidate.
Character string indicating the name of the third candidate.
Numeric indicating the percentage of votes received by the third candidate.
Character string indicating the party affiliation of the third candidate.
Numeric indicating the total votes received by the third candidate.
Character string indicating the name of the fourth candidate.
Numeric indicating the percentage of votes received by the fourth candidate.
Character string indicating the party affiliation of the fourth candidate.
Numeric indicating the total votes received by the fourth candidate.
Character string indicating the name of the fifth candidate.
Numeric indicating the percentage of votes received by the fifth candidate.
Character string indicating the party affiliation of the fifth candidate.
Numeric indicating the total votes received by the fifth candidate.
Data on U.S. Senate races held in 2010, including candidates' names, vote percentages, and party affiliations.
The dataset name has been changed to 'sp500_1950_2018_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(sp500_1950_2018_tbl_df)
A tibble with 17,346 observations and 7 variables:
Factor indicating the date of the recorded stock prices.
Numeric representing the opening price of the stock.
Numeric representing the highest price of the stock during the day.
Numeric representing the lowest price of the stock during the day.
Numeric representing the closing price of the stock.
Numeric representing the adjusted closing price of the stock.
Numeric representing the trading volume of the stock.
Historical data on S&P 500 stock prices from 1950 to 2018.
The dataset name has been changed to 'sp500_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(sp500_tbl_df)
A tibble with 50 observations and 12 variables:
Factor indicating the stock ticker symbol of the company.
Numeric representing the market capitalization of the company.
Numeric representing the enterprise value of the company.
Numeric representing the trailing price-to-earnings ratio.
Numeric representing the forward price-to-earnings ratio.
Numeric representing the enterprise value to revenue ratio.
Numeric representing the profit margin of the company.
Numeric representing the total revenue generated by the company.
Numeric representing the growth rate of the company.
Numeric representing the earnings before interest and taxes (EBIT).
Numeric representing the cash holdings of the company.
Numeric representing the total debt of the company.
Data on S&P 500 companies, including financial metrics and ratios.
The dataset name has been changed to 'state_abb_character' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a character vector. The original content of the dataset has not been modified.
data(state_abb_character)
A character vector with 50 elements representing U.S. state abbreviations:
Character vector of state abbreviations, e.g., "AL" for Alabama, "CA" for California.
U.S. state abbreviations.
The dataset name has been changed to 'state_area_numeric' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a numeric dataset. The original content of the dataset has not been modified.
data(state_area_numeric)
A numeric vector with 50 elements representing the area of U.S. states in square miles:
Numeric values indicating the area of each state, measured in square miles.
U.S. state areas.
The dataset name has been changed to 'state_center_list' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a list. The original content of the dataset has not been modified.
data(state_center_list)
A list with 2 elements, each containing numeric values representing the geographical center coordinates of U.S. states:
Numeric vector of length 50 representing the x-coordinates (longitude) of the state centers.
Numeric vector of length 50 representing the y-coordinates (latitude) of the state centers.
Geographical data for U.S. state centers.
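The two list elements line up with the state abbreviations, so they can be combined into one data frame. A sketch under stated assumptions: the element names `x` and `y` and the fallback to base R's `state.abb` and `state.center` are taken from base R, not confirmed for this package.

```r
# Use the renamed copies when usdatasets is installed; otherwise fall back to
# the base R originals (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("state_abb_character", package = "usdatasets")
  data("state_center_list", package = "usdatasets")
} else {
  state_abb_character <- datasets::state.abb
  state_center_list   <- datasets::state.center
}

# One row per state: abbreviation plus center coordinates.
centers <- data.frame(
  abb = state_abb_character,
  lon = state_center_list$x,  # x-coordinates (longitude)
  lat = state_center_list$y   # y-coordinates (latitude)
)
head(centers)
```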
The dataset name has been changed to 'state_division_factor' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a factor. The original content of the dataset has not been modified.
data(state_division_factor)
A factor with 50 observations representing the divisions of U.S. states. It contains 9 levels:
Region including Alabama, Kentucky, Mississippi, and Tennessee.
Region including Alaska, California, Hawaii, Oregon, and Washington.
Region including Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, and Wyoming.
Region including Arkansas, Louisiana, Oklahoma, and Texas.
Region including Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, and Vermont.
Region including Delaware, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, and West Virginia.
Region including Illinois, Indiana, Michigan, Ohio, and Wisconsin.
Region including Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, and South Dakota.
Region including New Jersey, New York, and Pennsylvania.
U.S. Census Bureau regional divisions.
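The factor is aligned with the vector of state names, so the two can be used together to list the members of each division. A minimal sketch; the fallback to base R's `state.division` and `state.name`, and the level name "New England", are assumptions taken from base R.

```r
# Use the renamed copies when usdatasets is installed; otherwise fall back to
# the base R originals (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("state_division_factor", package = "usdatasets")
  data("state_name_character", package = "usdatasets")
} else {
  state_division_factor <- datasets::state.division
  state_name_character  <- datasets::state.name
}

# Number of states in each division.
division_sizes <- table(state_division_factor)
division_sizes

# State names grouped by division; pick one division by its level name.
states_by_division <- split(state_name_character, state_division_factor)
states_by_division[["New England"]]
```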
The dataset name has been changed to 'state_name_character' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a character vector. The original content of the dataset has not been modified.
data(state_name_character)
A character vector with 50 observations representing the names of U.S. states.
Name of the state.
U.S. Census Bureau.
The dataset name has been changed to 'state_region_factor' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a factor variable representing U.S. state regions.
data(state_region_factor)
A factor variable with 50 observations, representing the region of each U.S. state. The regions are classified into four levels:
States located in the Northeast region.
States located in the Southern region.
States located in the North Central region.
States located in the Western region.
U.S. Census Bureau.
The dataset name has been changed to 'state_x77_matrix' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a matrix variable representing various demographic and statistical attributes of U.S. states in 1977.
data(state_x77_matrix)
A matrix with 50 rows and 8 columns representing various demographic and statistical characteristics of U.S. states. The columns include:
Population of the state.
Per capita income of the state's residents.
Illiteracy rate (percentage).
Life expectancy (in years).
Murder rate (per 100,000 inhabitants).
High school graduation rate (percentage).
Number of days with frost.
Total area of the state (in square miles).
U.S. Census Bureau (1977).
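Since the object is a matrix with state names as row names, converting it to a data frame is often the first step. A sketch only; the column names used here ("Life Exp", "Illiteracy", "Murder") and the fallback to base R's `state.x77` are assumptions taken from base R.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `state.x77` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("state_x77_matrix", package = "usdatasets")
} else {
  state_x77_matrix <- datasets::state.x77
}

# Convert to a data frame and keep the state names as a regular column.
state_df <- as.data.frame(state_x77_matrix)
state_df$state <- rownames(state_x77_matrix)

# Column names that contain a space need [[ ]] or backticks.
summary(state_df[["Life Exp"]])

# Correlation between illiteracy and murder rate across states.
illit_murder_cor <- cor(state_df$Illiteracy, state_df$Murder)
illit_murder_cor
```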
The dataset name has been changed to 'UCBAdmissions_table' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a table object. The original content of the dataset has not been modified.
data(UCBAdmissions_table)
A table object with 24 entries representing the admissions data at U.C. Berkeley:
A factor with levels "Admitted" and "Rejected".
A factor with levels "Male" and "Female".
A factor representing the department with levels "A", "B", "C", "D", "E", and "F".
Numeric counts of admissions based on gender and department.
U.C. Berkeley admissions data from 1973.
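The three-way table can be collapsed or converted to proportions with `margin.table()` and `prop.table()`. A minimal sketch; the fallback to base R's `UCBAdmissions` is an assumption used when 'usdatasets' is not installed.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `UCBAdmissions` (assumed to hold the same counts).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("UCBAdmissions_table", package = "usdatasets")
} else {
  UCBAdmissions_table <- datasets::UCBAdmissions
}

# Collapse over department: admission status by gender.
by_gender <- margin.table(UCBAdmissions_table, c(1, 2))

# Proportion admitted or rejected within each gender (columns sum to 1).
gender_rates <- prop.table(by_gender, 2)
gender_rates

# Admission rate for each gender within each department.
rate_by_dept <- prop.table(UCBAdmissions_table, c(2, 3))["Admitted", , ]
round(rate_by_dept, 2)
```

Comparing `gender_rates` with `rate_by_dept` shows how aggregate rates can differ from the department-level rates.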
The dataset 'us_crime_rates_spec_tbl_df' contains crime statistics for the United States, including various types of crimes and population data for each year. This dataset is structured as a tibble for ease of use within the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package.
data(us_crime_rates_spec_tbl_df)
A tibble with 60 rows and 12 columns:
Numeric year of the recorded data, e.g., 2000, 2001.
Numeric population total for the respective year.
Numeric total number of crimes reported.
Numeric total number of violent crimes.
Numeric total number of property crimes.
Numeric total number of murders.
Numeric total number of forcible rapes.
Numeric total number of robberies.
Numeric total number of aggravated assaults.
Numeric total number of burglaries.
Numeric total number of larcenies.
Numeric total number of vehicle thefts.
Federal Bureau of Investigation (FBI) Uniform Crime Reporting (UCR) Program.
The dataset 'us_temp_tbl_df' contains temperature records from various weather stations across the United States, providing both maximum and minimum temperature readings. This dataset is structured as a tibble for ease of use within the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package.
data(us_temp_tbl_df)
A tibble with 10,118 rows and 9 columns:
Character string representing the weather station identifier.
Character string for the name of the weather station.
Numeric value for the latitude of the weather station.
Numeric value for the longitude of the weather station.
Numeric value for the elevation of the weather station in meters.
Date of the recorded temperature data.
Numeric value for the maximum temperature recorded (in degrees Celsius).
Numeric value for the minimum temperature recorded (in degrees Celsius).
Factor representing the year of the recorded data.
National Oceanic and Atmospheric Administration (NOAA).
The dataset name has been changed to 'us_time_survey_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a tibble. The original content of the dataset has not been modified.
data(us_time_survey_tbl_df)
A tibble with 11 observations and 8 variables representing time use in various activities:
Numeric value representing the year of the survey.
Numeric value representing time spent on household activities (in hours).
Numeric value representing time spent on eating and drinking (in hours).
Numeric value representing time spent on leisure and sports activities (in hours).
Numeric value representing time spent sleeping (in hours).
Numeric value representing time spent caring for children (in hours).
Numeric value representing time spent working while employed (in hours).
Numeric value representing the number of days worked while employed.
U.S. Bureau of Labor Statistics.
The dataset name has been changed to 'USAccDeaths_ts' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a time series object. The original content of the dataset has not been modified.
data(USAccDeaths_ts)
A time series object with 72 observations representing monthly accidental deaths in the U.S. from 1973 to 1978:
A numeric vector representing the years from 1973 to 1978.
A character vector representing the months from January to December.
Numeric values representing the number of accidental deaths for each month.
U.S. accidental deaths data.
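Monthly counts can be rolled up by year or averaged by calendar month. A sketch under the assumption that base R's `USAccDeaths` is an acceptable stand-in when 'usdatasets' is not installed.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `USAccDeaths` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("USAccDeaths_ts", package = "usdatasets")
} else {
  USAccDeaths_ts <- datasets::USAccDeaths
}

# Yearly totals: aggregate() sums each block of 12 monthly values.
yearly <- aggregate(USAccDeaths_ts, nfrequency = 1, FUN = sum)
yearly

# Average count per calendar month across all years.
monthly_mean <- tapply(as.numeric(USAccDeaths_ts),
                       as.integer(cycle(USAccDeaths_ts)),
                       mean)
names(monthly_mean) <- month.abb
round(monthly_mean)
```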
The dataset name has been changed to 'USArrests_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a data frame. The original content of the dataset has not been modified.
data(USArrests_df)
A data frame with 50 observations and 4 variables representing the rates of arrests in the U.S.:
Numeric vector representing the murder rates per 100,000 residents.
Integer vector representing the assault rates per 100,000 residents.
Integer vector representing the percentage of the population living in urban areas.
Numeric vector representing the rape rates per 100,000 residents.
U.S. arrests data from 1973.
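State names are stored as row names, so sorting keeps each state attached to its values. A minimal sketch; the column name `Murder` and the fallback to base R's `USArrests` are assumptions taken from base R.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `USArrests` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("USArrests_df", package = "usdatasets")
} else {
  USArrests_df <- datasets::USArrests
}

# Five states with the highest murder arrest rate.
top_murder <- USArrests_df[order(-USArrests_df$Murder), ][1:5, ]
top_murder

# The variables are on different scales, so standardise before PCA.
pca <- prcomp(USArrests_df, scale. = TRUE)
summary(pca)
```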
The dataset name has been changed to 'UScitiesD_dist' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a distance object. The original content of the dataset has not been modified.
data(UScitiesD_dist)
A distance object containing the distances (in miles) between selected U.S. cities:
Distance from Atlanta to other cities.
Distance from Chicago to other cities.
Distance from Denver to other cities.
Distance from Houston to other cities.
Distance from Los Angeles to other cities.
Distance from Miami to other cities.
Distance from New York to other cities.
Distance from San Francisco to other cities.
Distance from Seattle to other cities.
Distance from Washington D.C. to other cities.
U.S. cities distance data.
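A `dist` object stores only the lower triangle; `as.matrix()` expands it for lookups, and `cmdscale()` can recover an approximate two-dimensional layout. A sketch only; the labels "Atlanta" and "Chicago" and the fallback to base R's `UScitiesD` are assumptions taken from base R.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `UScitiesD` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("UScitiesD_dist", package = "usdatasets")
} else {
  UScitiesD_dist <- datasets::UScitiesD
}

# Expand to a full symmetric matrix and look up one pair of cities.
d <- as.matrix(UScitiesD_dist)
d["Atlanta", "Chicago"]

# Classical multidimensional scaling: 2-D coordinates from distances alone.
coords <- cmdscale(UScitiesD_dist, k = 2)
coords
```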
This package provides a wide variety of datasets related to crime, economy, society, politics, and sports within the United States for testing, learning, and research purposes.
usdatasets: A Comprehensive Collection of U.S. Datasets
Maintainer: Renzo Cáceres Rossi [email protected]
The dataset name has been changed to 'USJudgeRatings_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a data frame. The original content of the dataset has not been modified.
data(USJudgeRatings_df)
A data frame with 43 observations and 12 variables representing ratings for U.S. judges:
Numeric vector representing the number of contacts of the lawyer with the judge.
Numeric vector representing the judges' ratings on judicial integrity.
Numeric vector representing the judges' ratings on demeanor.
Numeric vector representing the judges' ratings on diligence.
Numeric vector representing the judges' ratings on case flow managing.
Numeric vector representing the judges' ratings on prompt decisions.
Numeric vector representing the judges' ratings on preparation for trial.
Numeric vector representing the judges' ratings on familiarity with law.
Numeric vector representing the judges' ratings on sound oral rulings.
Numeric vector representing the judges' ratings on sound written rulings.
Numeric vector representing the judges' ratings on physical ability.
Numeric vector representing the judges' ratings on worthiness of retention.
U.S. judge ratings data.
The dataset name has been changed to 'USPersonalExpenditure_matrix' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a matrix. The original content of the dataset has not been modified.
data(USPersonalExpenditure_matrix)
A matrix with 5 rows and 5 columns representing U.S. personal expenditures in different categories over selected years:
Numeric values representing expenditures on food and tobacco for the years 1940, 1945, 1950, 1955, and 1960.
Numeric values representing expenditures on household operations for the years 1940, 1945, 1950, 1955, and 1960.
Numeric values representing expenditures on medical and health services for the years 1940, 1945, 1950, 1955, and 1960.
Numeric values representing expenditures on personal care for the years 1940, 1945, 1950, 1955, and 1960.
Numeric values representing expenditures on private education for the years 1940, 1945, 1950, 1955, and 1960.
U.S. personal expenditure data.
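With categories in rows and years in columns, column-wise proportions give each category's share of spending in a given year. A minimal sketch; the column labels "1940" and "1960" and the fallback to base R's `USPersonalExpenditure` are assumptions taken from base R.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `USPersonalExpenditure` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("USPersonalExpenditure_matrix", package = "usdatasets")
} else {
  USPersonalExpenditure_matrix <- datasets::USPersonalExpenditure
}

# Share of each category within a year: divide each column by its total.
shares <- prop.table(USPersonalExpenditure_matrix, margin = 2)
round(100 * shares, 1)

# Ratio of the last year's expenditure to the first year's, per category.
growth <- USPersonalExpenditure_matrix[, "1960"] /
  USPersonalExpenditure_matrix[, "1940"]
round(growth, 2)
```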
The dataset name has been changed to 'uspop_ts' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a time series object. The original content of the dataset has not been modified.
data(uspop_ts)
A time series object with 19 observations representing the U.S. population from 1790 to 1970:
Numeric vector containing the population values in millions.
U.S. Census Bureau.
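Because the series has one value per census, percentage growth between consecutive censuses is a natural first summary. A sketch under the assumption that base R's `uspop` is an acceptable stand-in when 'usdatasets' is not installed.

```r
# Use the renamed copy when usdatasets is installed; otherwise fall back to
# base R's `uspop` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("uspop_ts", package = "usdatasets")
} else {
  uspop_ts <- datasets::uspop
}

# Census years covered by the series.
census_years <- as.numeric(time(uspop_ts))

# Percentage growth from each census to the next.
pop <- as.numeric(uspop_ts)
growth_pct <- 100 * diff(pop) / head(pop, -1)
names(growth_pct) <- census_years[-1]
round(growth_pct, 1)
```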
The dataset name has been changed to 'VADeaths_matrix' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a matrix. The original content of the dataset has not been modified.
data(VADeaths_matrix)
A matrix containing mortality rates (per 1000) for different demographic groups in Virginia:
Mortality rates for rural males by age group.
Mortality rates for rural females by age group.
Mortality rates for urban males by age group.
Mortality rates for urban females by age group.
Virginia mortality data.
The dataset name has been changed to 'voter_count_spec_tbl_df' to avoid confusion with other packages in the R ecosystem. This naming convention helps distinguish this dataset as part of the 'usdatasets' package and identifies it as a special tibble. The original content of the dataset has not been modified.
data(voter_count_spec_tbl_df)
A special tibble containing voting statistics across different years and regions:
Year of the election.
Region of the voters.
Total population eligible to vote.
Total number of ballots counted.
Total votes for the highest office.
Percentage of total ballots counted.
Percentage of votes for the highest office.
Election data from various sources.
The dataset name has been kept as 'women_df' to maintain consistency with other datasets in the R ecosystem. This naming convention helps clearly identify this dataset within the context of its application. The original content of the dataset has not been modified.
data(women_df)
A data frame containing measurements of women's height and weight:
Height of women in inches.
Weight of women in pounds.
Based on statistical data for women's height and weight.
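With only two numeric columns, the data frame lends itself to a simple linear regression. A minimal sketch; the column names `height` and `weight` and the fallback to base R's `women` are assumptions taken from base R.

```r
# Use the package copy when usdatasets is installed; otherwise fall back to
# base R's `women` (assumed to hold the same values).
if (requireNamespace("usdatasets", quietly = TRUE)) {
  data("women_df", package = "usdatasets")
} else {
  women_df <- datasets::women
}

# Regress weight (pounds) on height (inches).
fit <- lm(weight ~ height, data = women_df)
coef(fit)

# Predicted weight for a height of 65 inches.
pred_65 <- predict(fit, newdata = data.frame(height = 65))
pred_65
```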