| Title: | Datasets for "Sampling and Data Analysis Using R: Theory and Practice" |
|---|---|
| Description: | Provides several datasets used throughout the book "Sampling and Data Analysis Using R: Theory and Practice" by Islam (2025, ISBN:978-984-35-8644-5). The datasets support teaching and learning of statistical concepts such as sampling methods, descriptive analysis, estimation and basic data handling. These curated data objects allow instructors, students and researchers to reproduce examples, practice data manipulation and perform hands-on analysis using R. |
| Authors: | Professor Dr. Mohammad Shahidul Islam [aut, cre] |
| Maintainer: | Professor Dr. Mohammad Shahidul Islam <[email protected]> |
| License: | CC BY 4.0 |
| Version: | 1.0 |
| Built: | 2026-06-02 09:34:28 UTC |
| Source: | https://github.com/cran/dauR |
A family library contains 1100 books. The owner is interested in exploring some features of the existing books through a short survey. Thirty books have been randomly selected, and four characteristics have been measured: the number of pages (Page), weight in grams (Weight), surface area in square inches (Surface), and type of each book (Type).
case_study
A data frame with 30 rows and 4 variables:
Page: Number of pages for each book.
Weight: Weight of the book in grams.
Surface: Surface area of each book in square inches.
Type: Categorical variable defining whether this is a religious, science, or story book.
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(case_study, package = "dauR")
head(case_study)
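For the case_study data (30 books drawn at random from a library of 1100), a simple-random-sampling estimate of the mean number of pages could look like the sketch below. This is an illustration under stated assumptions, not the book's own code: `srs_mean` is a hypothetical helper, and the column name `Page` is assumed from the description above.

```r
# Hypothetical helper: estimate a population mean (and total) from a simple
# random sample drawn without replacement from a population of size N.
srs_mean <- function(y, N, conf_level = 0.95) {
  n <- length(y)
  ybar <- mean(y)
  fpc <- 1 - n / N                      # finite population correction
  se <- sqrt(fpc * var(y) / n)          # standard error of the sample mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(mean = ybar,
    se = se,
    lower = ybar - t_crit * se,
    upper = ybar + t_crit * se,
    total = N * ybar)                   # estimated population total
}

# Usage, only if the package is installed; `Page` is an assumed column name.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(case_study, package = "dauR")
  print(srs_mean(case_study$Page, N = 1100))
}
```

The finite population correction matters little here (30 out of 1100), but it is included because the population size is known.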
This dataset comes from an engineering firm that monitors industrial machines over time to understand how long they operate before a mechanical failure. Each machine is tracked from installation until it either fails or the study period ends. Each record includes the machine’s operating age (in years) and its maintenance type (regular or on-demand).
eng_data
A data frame with 223 rows and 4 variables:
Time to malfunction (in days)
Failure indicator (1 = still operational/censored, 2 = malfunctioned)
Age of machine at installation (in years)
Maintenance type (1 = On-demand, 2 = Regular)
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(eng_data, package = "dauR")
head(eng_data)
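Because eng_data records time to failure with censoring, one natural way to explore it is a Kaplan-Meier fit by maintenance type. The sketch below assumes the `survival` package is installed; `km_by_group` is a hypothetical helper, and the usage accesses columns by position because the column names are not given above (the order time, failure indicator, age, maintenance type is assumed from the variable list).

```r
library(survival)

# Hypothetical helper: Kaplan-Meier curves per group.
# `status` follows the coding documented above:
# 1 = still operational (censored), 2 = malfunctioned (event).
km_by_group <- function(time, status, group) {
  event <- status == 2
  survfit(Surv(time, event) ~ group)
}

# Usage, only if the package is installed; column positions are assumptions.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(eng_data, package = "dauR")
  fit <- km_by_group(eng_data[[1]], eng_data[[2]], eng_data[[4]])
  print(fit)
}
```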
This dataset consists of a sample of eighty laborers from a large factory. Four variables are reported: gender, age in years, diastolic blood pressure and BMI.
Health
A data frame with 80 rows and 4 variables:
Gender of individuals
Age of the respondents
Diastolic blood pressure of the respondents
BMI of the respondents
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(Health, package = "dauR")
head(Health)
A survey was conducted among a group of 15 students about the teaching-learning environment of an institution. One part of the survey, comprising five questions, asked for opinions about one recently completed course.
likert_data
A data frame with 15 rows and 5 variables, all on a Likert scale:
Question about whether this was a good experience
Question about whether the student enjoyed an uninterrupted internet connection
Question about whether the student got support from the institution
Question about whether instructors were good at distance teaching and new technology
Question about whether it is better than in-class teaching
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(likert_data, package = "dauR")
head(likert_data)
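A common first step with Likert items such as those in likert_data is to count how often each response was chosen per question. The sketch below is one possible approach, not the book's method: `likert_counts` is a hypothetical helper that works on any data frame of Likert responses, and it makes no assumption about the column names or the response labels.

```r
# Hypothetical helper: response counts per question over a shared set of
# response levels. Levels are sorted as character strings, so labelled
# scales may need reordering afterwards.
likert_counts <- function(df) {
  lev <- sort(unique(unlist(lapply(df, as.character))))
  counts <- lapply(df, function(q) {
    tabulate(match(as.character(q), lev), nbins = length(lev))
  })
  matrix(unlist(counts), nrow = length(lev),
         dimnames = list(response = lev, question = names(df)))
}

# Usage, only if the package is installed.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(likert_data, package = "dauR")
  print(likert_counts(likert_data))
}
```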
The dataset contains 200 observations and five variables, namely reading_time, vocab, test_score, access_resources and read_motiv.
reading
A data frame with 200 rows and five variables:
reading_time: Time spent in reading
vocab: Measure of richness of vocabulary
test_score: Score from the
access_resources: Access to resources
read_motiv: Measure of motivation for reading
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(reading, package = "dauR")
head(reading)
This dataset shows the smoking habits of 198 drivers of three types of vehicles: Bus, Truck and Taxi. The aim is to find whether there is any association between smoking habit and occupation type, represented here by the type of vehicle driven.
smoke_class
A data frame with 198 rows and 2 variables:
Type of vehicle driven
Categorical variable showing whether the driver is a smoker or non-smoker
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(smoke_class, package = "dauR")
head(smoke_class)
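The stated aim for smoke_class, checking for an association between vehicle type and smoking habit, is typically approached with a chi-square test of independence on the cross-tabulation. The sketch below is a minimal illustration: `association_test` is a hypothetical helper, and the usage accesses columns by position (vehicle type first, smoking habit second), which is assumed from the variable list above.

```r
# Hypothetical helper: cross-tabulate two categorical variables and run
# Pearson's chi-square test of independence on the resulting table.
association_test <- function(group, habit) {
  tab <- table(group, habit)
  list(table = tab, test = chisq.test(tab))
}

# Usage, only if the package is installed; column positions are assumptions.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(smoke_class, package = "dauR")
  res <- association_test(smoke_class[[1]], smoke_class[[2]])
  print(res$table)
  print(res$test)
}
```

With three vehicle types and two smoking categories, the test has (3 - 1) x (2 - 1) = 2 degrees of freedom.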
The dataset covers 20 recent university and college graduates and contains nine variables. For a high-school job, the candidates sat for exams in seven subjects and then appeared for an oral (viva-voce) exam. The subjects are Math, Physics, Chemistry, Statistics, Bengali literature, English literature and History.
students_data
A data frame with 20 rows and 9 variables:
Serial number
Whether a student is from science or humanities background
Score in Mathematics
Score in Physics
Score in Chemistry
Score in Statistics
Score in English
Score in Bengali
Score in History
Score in Viva-voce
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(students_data, package = "dauR")
head(students_data)
A researcher seeks to investigate whether an individual’s life satisfaction (happiness) is associated with gender and working status. The variable Gender includes two categories: Male and Female, while Working_Status comprises three categories: Self-employed, Student and Job. A random sample of ten participants was selected from each category. Life satisfaction was measured on a scale ranging from 0 to 100, with higher scores indicating greater happiness. Therefore, the dependent variable is life satisfaction (happiness), and the independent variables are gender and working status.
twoway
A data frame with 60 rows and 3 variables:
Measurement of happiness
Gender of individuals
Working status of individuals. The classes are "Job", "Self_employed" and "Student"
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(twoway, package = "dauR")
head(twoway)
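The twoway design (one response, two categorical factors, ten participants per cell) matches a two-way analysis of variance with interaction. The sketch below shows one way to fit it in base R: `twoway_anova` is a hypothetical helper, and the usage accesses columns by position (happiness, gender, working status), an order assumed from the variable list above.

```r
# Hypothetical helper: two-way ANOVA with interaction, returning the
# ANOVA table (rows: factor a, factor b, a:b interaction, residuals).
twoway_anova <- function(y, a, b) {
  d <- data.frame(y = y, a = factor(a), b = factor(b))
  fit <- aov(y ~ a * b, data = d)
  summary(fit)[[1]]
}

# Usage, only if the package is installed; column positions are assumptions.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(twoway, package = "dauR")
  print(twoway_anova(twoway[[1]], twoway[[2]], twoway[[3]]))
}
```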
This dataset shows the weights of 16 individuals before taking medication.
weight1
A vector of 16 observations:
Weight of individuals
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(weight1, package = "dauR")
head(weight1)
This dataset shows the weights of 16 individuals after taking medication.
weight2
A vector of 16 observations:
Weight of individuals
Generated for the book "Sampling and Data Analysis Using R: Theory and Practice" by Dr. Mohammad Shahidul Islam
data(weight2, package = "dauR")
head(weight2)
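Since weight1 and weight2 hold before and after measurements on 16 individuals, they lend themselves to a paired comparison. The sketch below assumes the two vectors list the same individuals in the same order, which the descriptions suggest but do not state; `paired_change` is a hypothetical helper wrapping the base R `t.test`.

```r
# Hypothetical helper: paired t-test on before/after measurements.
# Assumes element i of `before` and `after` belongs to the same individual.
paired_change <- function(before, after, conf_level = 0.95) {
  stopifnot(length(before) == length(after))
  t.test(before, after, paired = TRUE, conf.level = conf_level)
}

# Usage, only if the package is installed.
if (requireNamespace("dauR", quietly = TRUE)) {
  data(weight1, package = "dauR")
  data(weight2, package = "dauR")
  print(paired_change(weight1, weight2))
}
```

If the pairing assumption does not hold, an unpaired two-sample test would be the appropriate alternative.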