| Title: | Datasets from the Hello Data Science Book |
|---|---|
| Description: | Provides datasets used for analysis and visualizations in the open-access Hello Data Science book. |
| Authors: | Mine Dogucu [aut, cre] (ORCID: <https://orcid.org/0000-0002-8007-934X>), Catalina Medina [aut] (ORCID: <https://orcid.org/0000-0003-2847-8180>), Alma Castro [aut] |
| Maintainer: | Mine Dogucu <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.1 |
| Built: | 2026-06-12 20:08:01 UTC |
| Source: | https://github.com/cran/hellodatascience |
The 2024 data was downloaded from the U.S. Bureau of Labor Statistics website https://www.bls.gov/tus/data/datafiles-2024.htm and subset to include only respondents who are enrolled in a college or university. This dataset is used only for educational purposes. Those conducting real research should download the data from its original source. BLS.gov cannot vouch for the data or analyses derived from these data after the data have been retrieved from BLS.gov.
atus_college
A data frame with 312 rows and 5 variables. Each row represents a college student.
full-time or part-time employment status of respondent
age
whether respondent is enrolled as a full-time or part-time student
weekly earnings at main job
number of people living in respondent's household
total nonwork-related time respondent spent alone (in minutes)
time spent sleeping
time spent working at main job
time spent taking class for degree, certification, or licensure
time spent shopping (store, telephone, internet)
time spent taking a lunch break
time spent participating in sports, exercise, or recreation
time spent attending or participating in religious services
U.S. Bureau of Labor Statistics (2025). https://www.bls.gov/tus/data/datafiles-2024.htm.
The data was obtained from the FIFA website https://inside.fifa.com/associations/ and contains information on the FIFA Member Associations (MAs), also known as confederations, which are responsible for the development and governance of football/soccer within their region.
confederations
A data frame with 7 rows and 2 variables. Each row represents a FIFA confederation, including FIFA itself.
name of the FIFA member association
region or continent overseen by the confederation
(2026). https://inside.fifa.com/associations/.
The data was obtained from The World Data website https://theworlddata.com/world-population-by-country/ which contains information on the Men's FIFA World Cup 2026 qualifying teams and their 2025 population, and from the WorldData.info website https://www.worlddata.info/capital-cities.php which contains information on the capital cities of all countries.
country_capital
A data frame with 5 rows and 3 variables. Each row represents a country whose men's soccer team qualified for the FIFA World Cup 2026.
name of the country
name of the country's capital city
population size in millions based on the United Nations Population Division estimates for 2025
(2026). https://theworlddata.com/world-population-by-country/.
(2025). https://www.worlddata.info/capital-cities.php.
The data was scraped from the Whereig website https://www.whereig.com/football/fifa-world-rankings.html/ and contains information on the Men's FIFA World Cup 2026 qualifying teams and ranking data as of 01 April 2026.
country_rank
A data frame with 7 rows and 3 variables. Each row represents a country whose men's soccer team qualified for the FIFA World Cup 2026.
name of the country's soccer team
FIFA world ranking as of April 2026
regional or continental association affiliated with FIFA
Whereig editors (2026). https://www.whereig.com/football/fifa-world-rankings.html/.
The data was gathered from The Soccer World Cups website https://www.thesoccerworldcups.com/world_cups.php and contains information about every World Cup played, including national teams, standings, and more.
mx_us_wc_ranks
A data frame with 2 rows and 5 variables. Each row represents the men's soccer team of Mexico or the United States and its final ranking at each of the last four World Cup tournaments.
name of the country's soccer team
final ranking in the 2010 World Cup
final ranking in the 2014 World Cup
final ranking in the 2018 World Cup
final ranking in the 2022 World Cup
(2026). https://www.thesoccerworldcups.com/world_cups.php.
The data was downloaded from https://www.rug.nl/ggdc/productivity/pwt/ and contains information about different economic measures of countries around the world. The dataset has been subset and variable names have been modified for exercise purposes.
penn_world
A data frame with 12810 rows and 14 variables. Each row represents a country in a specific year.
3-letter ISO country code
country name
currency unit
year
expenditure-side real GDP at chained PPPs (in mil. 2017US$)
output-side real GDP at chained PPPs (in mil. 2017US$)
population (in millions)
number of persons engaged (in millions)
average annual hours worked by persons engaged
price level of household consumption (price level of USA GDPo in 2017 = 1)
price level of capital formation (price level of USA GDPo in 2017 = 1)
price level of government consumption (price level of USA GDPo in 2017 = 1)
price level of exports (price level of USA GDPo in 2017 = 1)
price level of imports (price level of USA GDPo in 2017 = 1)
Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), "The Next Generation of the Penn World Table" American Economic Review, 105(10), 3150-3182, available for download at http://www.ggdc.net/pwt/.
The data was scraped from NASA's website https://nssdc.gsfc.nasa.gov/planetary/factsheet/index.html and contains information on the planets of our Solar System.
planets
A data frame with 8 rows and 7 variables. Each row represents a planet.
name of the planet
mass in 10^24 kg
length of day in hours
whether the mean temperature in °C is 'negative' or 'positive'
number of moons
whether the planet has a set of rings around it, either TRUE or FALSE
surface pressure in bars
David R. Williams (2024). https://nssdc.gsfc.nasa.gov/planetary/factsheet/index.html.
How much do fruits and vegetables cost? The United States Department of Agriculture (USDA) Economic Research Service (ERS) estimated average prices for 153 commonly consumed fresh and processed fruits and vegetables. USDA ERS calculated average prices at retail stores using 2022 retail scanner data from Circana (formerly Information Resources Inc. (IRI)). A selection of retail establishments (grocery stores, supermarkets, supercenters, convenience stores, drug stores, and liquor stores) across the United States provides Circana with weekly retail sales data (revenue and quantity).
produce_prices
A data frame with 155 rows and 10 variables:
ID of item
name of produce
form of produce, either 'Canned', 'Dried', 'Fresh', 'Frozen', or 'Juice'
average retail price per pound or per pint
unit for the 'retail_price', either 'per pint' or 'per pound'
For most fruits and vegetables, a cup equivalent is the edible portion that will fit into a 1-cup measuring cup; for raisins and other dried fruit, it is the edible portion that will fit into a 1/2-cup; and for leafy vegetables, 2 cups. An edible cup equivalent is the unit of measurement used by the U.S. Department of Agriculture and the Department of Health and Human Services to report fruit and vegetable consumption recommendations.
unit for 'cup_equivalent_size'
average retail price per 'cup_equivalent_unit' of produce
type of produce, either 'fruit' or 'vegetables'
year
U.S. Department of Agriculture, Economic Research Service. (2024). Fruit and vegetable prices. https://www.ers.usda.gov/data-products/fruit-and-vegetable-prices