Package 'drawsample' reference manual

Title:	Draw Samples with the Desired Properties from a Data Set
Description:	A tool to sample data with the desired properties. Samples can be drawn by purposive sampling under specified distributional conditions, such as deviation from normality (skewness and kurtosis) and sample size, in quantitative research studies. In purposive sampling, the researcher has a purpose in mind and includes participants who fit the purpose of the study (Etikan, Musa, & Alkassim, 2015) <doi:10.11648/j.ajtas.20160501.11>. Purposive sampling can be useful for answering many research questions (Klar & Leeper, 2019) <doi:10.1002/9781119083771.ch21>.
Authors:	Kubra Atalay Kabasakal [aut, cre], Huseyin Yıldız [ctb]
Maintainer:	Kubra Atalay Kabasakal <katalay@hacettepe.edu.tr>
License:	MIT + file LICENSE
Version:	1.0.1
Built:	2025-03-25 07:02:29 UTC
Source:	CRAN

Draw Samples with the Desired Properties from a Data Set

Description

The draw_sample functions take a sample with the specified sample size, skewness, and kurtosis from a data set (dist), with or without replacement. Fleishman's power method (doi:10.1007/BF02293811) is used to obtain the desired skewness and kurtosis levels. Consequently, the coefficient of skewness can be chosen between 0 and 3.6. The admissible kurtosis coefficient depends on the chosen skewness coefficient and ranges from -1.2 to 20. If a compatible pair of skewness and kurtosis values is not provided, no solution can be found and an error is raised.
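To make the two target quantities concrete, here is a minimal base R sketch that computes moment-based skewness and excess kurtosis for a vector of scores. The helper name `sample_moments` and the toy scores are hypothetical, and dividing by n is an assumption; the package may use a different estimator.

```r
# Moment-based sample skewness and excess kurtosis (base R only).
# Assumption: moments are computed by dividing by n; the package's own
# estimator may differ.
sample_moments <- function(x) {
  x <- x[!is.na(x)]
  centered <- x - mean(x)
  m2 <- mean(centered^2)  # second central moment
  m3 <- mean(centered^3)  # third central moment
  m4 <- mean(centered^4)  # fourth central moment
  c(skew = m3 / m2^1.5, kurtosis = m4 / m2^2 - 3)
}

# Made-up scores with a long right tail
scores <- c(2, 3, 3, 4, 4, 4, 5, 5, 9, 14)
sample_moments(scores)
```

A normal distribution has skewness 0 and excess kurtosis 0 under this definition, which matches the `skew = 0, kurts = 0` targets used in the examples of this manual.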

Author(s)

Maintainer: Kubra Atalay Kabasakal katalay@hacettepe.edu.tr (ORCID)

Other contributors:

Huseyin Yıldız huseyinyildiz35@gmail.com (ORCID) [contributor]

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Atalay Kabasakal, K. & Gunduz, T. (2020). Drawing a Sample with Desired Properties from Population in R Package “drawsample”. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 405-429. doi:10.21031/epod.790449

Fleishman's Power Method Transformation Constants

Description

This table includes Fleishman's Power Method Transformation constants.

Usage

constants_table

Format

A data.frame with 5 columns, which are

Skew: The skewness value
Kurtosis: The standardized kurtosis value
b: Constant b of the power transformation, determined by Skew and Kurtosis
c: Constant c of the power transformation, determined by Skew and Kurtosis
d: Constant d of the power transformation, determined by Skew and Kurtosis
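As a rough illustration of how the constants b, c, and d are used, the sketch below writes out Fleishman's polynomial Y = a + bZ + cZ^2 + dZ^3 (with a = -c and Z standard normal) and the three moment equations the constants must satisfy. The function and argument names are hypothetical and are not part of the package.

```r
# Fleishman's power transformation of standard normal draws z.
# The intercept is a = -coef_c, which keeps the mean of Y at zero.
fleishman_transform <- function(z, coef_b, coef_c, coef_d) {
  -coef_c + coef_b * z + coef_c * z^2 + coef_d * z^3
}

# Residuals of Fleishman's three moment equations. All three are zero when
# the constants give unit variance and the requested skewness and
# (excess) kurtosis.
fleishman_residuals <- function(coef_b, coef_c, coef_d, skew, kurtosis) {
  b <- coef_b; cc <- coef_c; d <- coef_d
  c(
    variance = b^2 + 6 * b * d + 2 * cc^2 + 15 * d^2 - 1,
    skew     = 2 * cc * (b^2 + 24 * b * d + 105 * d^2 + 2) - skew,
    kurtosis = 24 * (b * d + cc^2 * (1 + b^2 + 28 * b * d) +
                     d^2 * (12 + 48 * b * d + 141 * cc^2 + 225 * d^2)) - kurtosis
  )
}

# The normal case (Skew = 0, Kurtosis = 0) is solved by b = 1, c = 0, d = 0.
normal_case <- fleishman_residuals(1, 0, 0, skew = 0, kurtosis = 0)
normal_case

set.seed(1)
z <- rnorm(1000)
y <- fleishman_transform(z, coef_b = 1, coef_c = 0, coef_d = 0)  # equals z here
```

For other rows of the table, the same two functions can be called with that row's b, c, and d to check that the residuals are close to zero.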

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Fialkowski, A. C. (2018). SimMultiCorrData: Simulation of Correlated Data with Multiple Variable Types. R package version 0.2.2. Retrieved from https://cran.r-project.org/web/packages/SimMultiCorrData/index.html

Examples


# First 6 rows of the table
data(constants_table)
head(constants_table)

Draw Samples with the Desired Properties from a Data Set

Description

A function to sample data with desired properties.

Usage

draw_sample(
  dist,
  n,
  skew,
  kurts,
  replacement = FALSE,
  save.output = FALSE,
  output_name = c("sample", "default")
)

Arguments

`dist`	data frame: consists of IDs and scores, with no missing values
`n`	numeric: desired sample size
`skew`	numeric: the skewness value
`kurts`	numeric: the kurtosis value
`replacement`	logical: sample with or without replacement? (default is FALSE).
`save.output`	logical: should the output be saved into a text file? (default is FALSE).
`output_name`	character: a vector of two components. The first component is the name of the output file; the user can change the second component.
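The requirements on `dist` and `n` can be checked before calling the function. The helper below is a hypothetical pre-flight check, not part of the package; it assumes `dist` has exactly two columns (ID and score), as in the examples of this manual, and the toy data frame is made up.

```r
# Hypothetical pre-flight check for the inputs described above.
# Assumption: `dist` holds an ID column followed by one numeric score column.
check_draw_inputs <- function(dist, n, replacement = FALSE) {
  stopifnot(is.data.frame(dist), ncol(dist) == 2)
  if (anyNA(dist)) stop("`dist` must not contain missing values")
  if (!is.numeric(dist[[2]])) stop("the score column must be numeric")
  if (!replacement && n > nrow(dist)) {
    stop("without replacement, `n` cannot exceed the number of rows in `dist`")
  }
  invisible(TRUE)
}

# Made-up data for illustration
toy <- data.frame(ID = 1:5, Score = c(10, 12, 9, 15, 11))
check_draw_inputs(toy, n = 3)
```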

Details

The execution of the function may take some time since it tries to obtain the specified value for skewness and kurtosis.

Value

This function returns a list that includes the following:

a matrix: Descriptive statistics of the given data, the reference vector, and the sample.
a data frame: The IDs and scores of the sample.
graph: Histograms for the “data” and the “sample”.

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Fialkowski, A. C. (2018). SimMultiCorrData: Simulation of Correlated Data with Multiple Variable Types. R package version 0.2.2. Retrieved from https://cran.r-project.org/web/packages/SimMultiCorrData/index.html

Atalay Kabasakal, K. & Gunduz, T. (2020). Drawing a Sample with Desired Properties from Population in R Package “drawsample”. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 405-429. doi:10.21031/epod.790449

Examples

# Example data provided with package
data(example_data)
# First 6 rows of the example_data
head(example_data)
# Draw a sample based on Score_1 (from negatively skewed to normal)
output1 <- draw_sample(dist = example_data[, c(1, 2)], n = 200, skew = 0, kurts = 0,
                       save.output = FALSE) # Histogram of the reference data set
# Descriptive statistics of the given data, reference data, and drawn sample
output1$desc
# First 6 rows of the drawn sample
head(output1$sample)
# Histogram of the given data set and drawn sample
output1$graph
## Not run: 
# Draw a sample based on Score_2 (from negatively skewed to positively skewed)
# draw_sample(dist=example_data[,c(1,3)],n=200,skew = 1,kurts = 1,
# output_name = c("sample", "1"))
# Draw a sample based on Score_2 (from negatively skewed to positively skewed
# with replacement)
# draw_sample(dist=example_data[,c(1,3)],n=200,skew = 0.5,kurts = 0.4,
# replacement=TRUE,output_name = c("sample", "2"))

## End(Not run)

Sample data with individual responses

Description

A function to sample data close to the desired characteristics, keeping the individual responses.

Usage

draw_sample_ir(
  dist,
  n,
  skew,
  kurts,
  replacement = FALSE,
  col_id = 1,
  col_total = numeric(),
  save.output = FALSE,
  output_name = c("sample", "1")
)

Arguments

`dist`	data frame: consists of IDs and scores, with no missing values
`n`	numeric: desired sample size
`skew`	numeric: the skewness value
`kurts`	numeric: the kurtosis value
`replacement`	logical: sample with or without replacement? (default is FALSE).
`col_id`	index of the ID column
`col_total`	index of the total score column
`save.output`	logical: should the output be saved into a text file? (default is FALSE).
`output_name`	character: a vector of two components. The first component is the name of the output file; the user can change the second component.
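When the data hold individual item responses, `col_id` and `col_total` are column positions. The sketch below builds a small made-up response table, adds a total score with `rowSums`, and looks up the two positions; all column names are illustrative and are not taken from the package data.

```r
# Made-up Likert-type responses; column names are illustrative only.
set.seed(42)
responses <- data.frame(
  id     = 1:8,
  item_1 = sample(1:4, 8, replace = TRUE),
  item_2 = sample(1:4, 8, replace = TRUE),
  item_3 = sample(1:4, 8, replace = TRUE)
)

# Total score across the item columns
item_cols <- grep("^item_", names(responses))
responses$total <- rowSums(responses[, item_cols])

# Column positions that would be supplied as col_id and col_total
col_id    <- which(names(responses) == "id")
col_total <- which(names(responses) == "total")
c(col_id = col_id, col_total = col_total)
```

Looking the positions up by name avoids hard-coding an index that silently changes when columns are added or reordered.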

Details

The execution of the function may take some time since it tries to obtain the specified value for skewness and kurtosis.

Value

This function returns a list that includes the following:

a matrix: Descriptive statistics of the given data, the reference vector, and the sample.
a data frame: The IDs and individual responses of the sample.
graph: Histograms for the “data” and the “sample”.

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Atalay Kabasakal, K. & Gunduz, T. (2020). Drawing a Sample with Desired Properties from Population in R Package “drawsample”. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 405-429. doi:10.21031/epod.790449

Examples

## Not run: 
# Example data provided with package
data(likert_example)
# First 6 rows of likert_example
head(likert_example)
# Draw a sample based on total (from flattened to normal)
output3 <- draw_sample_ir(dist=likert_example,n=200,skew = 1,kurts = 1.2,
col_id=1,col_total=7,save.output = FALSE) # Histogram of the reference data set
# descriptive statistics of the given data,reference data, and drawn sample
output3$desc
# First 6 rows of the drawn sample
head(output3$sample)
# Histogram of the given data set and drawn sample
output3$graph
# Draw a sample based on total(from flattened to normal)
draw_sample_ir(dist=likert_example,n=200,skew = 0.5,kurts =0.5,
col_id=1,col_total=7,save.output = TRUE,
output_name = c("sample", "3"))

## End(Not run)

Sample data close to desired characteristics - nearest

Description

A function to sample data close to the desired characteristics (nearest match).

Usage

draw_sample_n(
  dist,
  n,
  skew,
  kurts,
  location = 0,
  delta_var = 0,
  save.output = FALSE,
  output_name = c("sample", "default")
)

Arguments

`dist`	data frame: consists of IDs and scores, with no missing values
`n`	numeric: desired sample size
`skew`	numeric: the skewness value
`kurts`	numeric: the kurtosis value
`location`	numeric: the value for adjusting the mean (default is 0).
`delta_var`	numeric: the value for adjusting the variance (default is 0).
`save.output`	logical: should the output be saved into a text file? (default is FALSE).
`output_name`	character: a vector of two components. The first component is the name of the output file; the user can change the second component.

Details

This function runs faster, but the desired skewness and kurtosis values may not be met exactly. Kurtosis in particular is matched less reliably, because its range is wider than that of skewness. A value can be entered for location to shift the midpoint or mean of the distribution. A value can be entered for delta_var to set how much the variability of the reference distribution is increased or decreased. In other words, the reference distribution is generated as the standard normal distribution unless the user changes the default values of the location and delta_var arguments.
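The idea of matching data to a reference distribution can be sketched in base R. The code below is an assumption-laden illustration, not the package's implementation: the helper `nearest_sample` is hypothetical, the reference vector is plain normal (no skewness or kurtosis adjustment), `location` is taken as the reference mean, and `delta_var` is assumed to be added to a unit variance.

```r
# Sketch of nearest-value matching without replacement.
# Assumptions (not confirmed by the package): the reference vector is
# N(location, 1 + delta_var), and each reference value is matched to the
# closest standardized score that has not been used yet.
nearest_sample <- function(dist, n, location = 0, delta_var = 0) {
  stopifnot(n <= nrow(dist), 1 + delta_var > 0)
  z <- as.numeric(scale(dist[[2]]))            # standardized scores
  reference <- rnorm(n, mean = location, sd = sqrt(1 + delta_var))
  available <- rep(TRUE, length(z))
  picked <- integer(n)
  for (i in seq_len(n)) {
    gap <- abs(z - reference[i])
    gap[!available] <- Inf                     # each row is used at most once
    picked[i] <- which.min(gap)
    available[picked[i]] <- FALSE
  }
  dist[picked, , drop = FALSE]
}

# Made-up data for illustration
set.seed(123)
toy <- data.frame(ID = 1:300, Score = round(rnorm(300, mean = 50, sd = 10)))
drawn <- nearest_sample(toy, n = 50, location = 0.5, delta_var = 0)
c(data_mean = mean(toy$Score), sample_mean = mean(drawn$Score))
```

Because each pick is only the closest available score, the moments of the drawn sample approximate the reference rather than reproduce it, which is consistent with the caveat above about values not being met exactly.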

Value

This function returns a list that includes the following:

a matrix: Descriptive statistics of the given data, the reference vector, and the sample.
a data frame: The IDs and scores of the sample.
graph: Histograms for the “data” and the “sample”.

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Examples

# Example data provided with package
data(example_data)
# Draw a sample based on Score_1
output2 <- draw_sample_n(dist=example_data[,c(1,2)],n=200,skew = 0,
kurts = 0, location=0, delta_var=0,save.output=FALSE) # Histogram of the reference data set
# descriptive statistics of the given data,reference data, and drawn sample
output2$desc
# First 6 rows of the drawn sample
head(output2$sample)
# Histogram of the given data set and drawn sample
output2$graph
## Not run: 
# Draw a sample based on Score_2 (location par)
# draw_sample_n(dist=example_data[,c(1,3)],n=200,skew = 1,kurts = 1,location=-0.5,delta_var=0,
# save.output=TRUE, output_name = c("sample", "2"))
# Draw a sample based on Score_2 (delta_var par)
# draw_sample_n(dist=example_data[,c(1,3)],n=200,skew = 0.5,kurts = 0.4,location=0,delta_var=0.3,
# save.output=TRUE, output_name = c("sample", "3"))

## End(Not run)

Sample data close to desired characteristics with individual responses - nearest

Description

A function to sample data with desired properties.

Usage

draw_sample_n_ir(
  dist,
  n,
  skew,
  kurts,
  location = 0,
  delta_var = 0,
  col_id = 1,
  col_total = numeric(),
  save.output = FALSE,
  output_name = c("sample", "default")
)

Arguments

`dist`	data frame: consists of IDs and scores, with no missing values
`n`	numeric: desired sample size
`skew`	numeric: the skewness value
`kurts`	numeric: the kurtosis value
`location`	numeric: the value for adjusting the mean (default is 0).
`delta_var`	numeric: the value for adjusting the variance (default is 0).
`col_id`	index of the ID column
`col_total`	index of the total score column
`save.output`	logical: should the output be saved into a text file? (default is FALSE).
`output_name`	character: a vector of two components. The first component is the name of the output file; the user can change the second component.

Value

This function returns a list that includes the following:

a matrix: Descriptive statistics of the given data, the reference vector, and the sample.
a data frame: The IDs and scores of the sample.
graph: Histograms for the “data” and the “sample”.

References

Fleishman AI (1978). A Method for Simulating Non-normal Distributions. Psychometrika, 43, 521-532. doi:10.1007/BF02293811.

Atalay Kabasakal, K. & Gunduz, T. (2020). Drawing a Sample with Desired Properties from Population in R Package “drawsample”. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 405-429. doi:10.21031/epod.790449

Examples

# Example data provided with package
data(likert_example)
# First 6 rows of likert_example
head(likert_example)
# Draw a sample based on total (target: skew = 0, kurts = 0)
output4 <- draw_sample_n_ir(dist=likert_example,n=200,skew = 0,kurts = 0,
location= 0,delta_var = 0,
col_id=1,col_total=7,save.output=FALSE) # Histogram of the reference data set
# descriptive statistics of the given data,reference data, and drawn sample
output4$desc
# First 6 rows of the drawn sample
head(output4$sample)
# Histogram of the given data set and drawn sample
output4$graph
## Not run: 
output4 <- draw_sample_n_ir(dist=likert_example,n=200,skew = 0.5,kurts = 0.5,
location= 0,delta_var = 0,
col_id=1,col_total=7,save.output=TRUE,
output_name = c("sample", "1")) 

## End(Not run)

Multiple Sample Selection

Description

Multiple Sample Selection

Usage

draw_sample_rep(
  dist,
  n,
  rep = 1,
  skew,
  kurts,
  replacement = TRUE,
  col_id = 1,
  col_total = numeric(),
  exact = FALSE
)

Arguments

`dist`	data frame: consists of IDs and scores, with no missing values
`n`	numeric: desired sample size
`rep`	numeric: number of replications (default is 1).
`skew`	numeric: the skewness value
`kurts`	numeric: the kurtosis value
`replacement`	logical: sample with or without replacement? (default is TRUE).
`col_id`	index of the ID column
`col_total`	index of the total score column
`exact`	logical: if FALSE (the default), the draw_sample_n_ir function is used, which is the faster, nearest-match version of the draw_sample_ir function.

Value

This function returns a list that includes the following:

a matrix: Descriptive statistics of the given data, the reference vector, and the sample.
a data frame: The IDs and scores of the sample.
graph: Histograms for the “data” and the “sample”.

Examples

# Example data provided with package
data(likert_example)
# First 6 rows of likert_example
head(likert_example)
# Draw three samples with skew = 0 and kurts = 0
# This example takes considerable computation time.
samples <- draw_sample_rep(dist=likert_example,n=200,rep=3,skew=0,
kurts=0,replacement =TRUE,  col_id = 1,
col_total = numeric(),
exact = FALSE)
# to get first sample
samples$sample[[1]]
# to get second sample
samples$sample[[2]]
## Not run: 
# to export the 3 samples
for(i in 1:3){
 write.csv(samples$sample[[i]],row.names = FALSE,paste("sample_",i,".csv",sep=""))
 }

## End(Not run)

Draw Samples with a Shiny Application

Description

Runs the package functions through a user-friendly 'shiny' interface.

Usage

draw_sample_shiny()

Examples

## Not run: 
# if(interactive()){
## Run this code for launching the 'shiny' application
#  draw_sample_shiny()
#  }
# 
## End(Not run)

Example Data

Description

The example data set consists of the IDs of 500 subjects and their total scores on two different tests.

Usage

data(example_data)

Format

A data.frame with 3 columns, which are

ID: students' id
Score_1: Scores of test 1
Score_2: Scores of test 2

Examples

# First 6 rows of the example_data
data(example_data)
head(example_data)

Likert Example Data

Description

The example data set consists of 6669 subjects and 7 variables.

Usage

data(likert_example)

Format

A data.frame with 7 columns, which are

CNTSTUID: student ID
ST160Q01IA: response of item_1
ST160Q02IA: response of item_2
ST160Q03IA: response of item_3
ST160Q04IA: response of item_4
ST160Q05IA: response of item_5
total: total_score of five items

Examples

# First 6 rows of the likert_example
data(likert_example)
head(likert_example)
