Package 'SDAResources' reference manual

Title:	Datasets and Functions for 'Sampling: Design and Analysis, 3rd Edition'
Description:	Includes all the datasets of 'Sampling: Design and Analysis' (3rd edition by Sharon Lohr) in R format and additional functions for analyzing and graphing probability samples.
Authors:	Yan Lu [aut, cre], Sharon Lohr [aut]
Maintainer:	Yan Lu <yanlu@unm.edu>
License:	GPL-2 | GPL-3
Version:	0.1.1
Built:	2025-03-13 06:42:16 UTC
Source:	CRAN

agpop data

Description

Data from the 1992 U.S. Census of Agriculture.

Usage

data(agpop)

Format

This data frame contains the following columns:

county:: county name (character variable)
state:: state abbreviation (character variable)
acres92:: number of acres devoted to farms, 1992
acres87:: number of acres devoted to farms, 1987
acres82:: number of acres devoted to farms, 1982
farms92:: number of farms, 1992
farms87:: number of farms, 1987
farms82:: number of farms, 1982
largef92:: number of farms with 1,000 acres or more, 1992
largef87:: number of farms with 1,000 acres or more, 1987
largef82:: number of farms with 1,000 acres or more, 1982
smallf92:: number of farms with 9 acres or fewer, 1992
smallf87:: number of farms with 9 acres or fewer, 1987
smallf82:: number of farms with 9 acres or fewer, 1982
region:: S = south; W = west; NC = north central; NE = northeast

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

agpps data

Description

Data from a without-replacement probability-proportional-to-size sample from agpop data.

Usage

data(agpps)

Format

This data frame contains the following columns:

county:: county name (character variable)
state:: state abbreviation (character variable)
acres92:: number of acres devoted to farms, 1992
acres87:: number of acres devoted to farms, 1987
acres82:: number of acres devoted to farms, 1982
farms92:: number of farms, 1992
farms87:: number of farms, 1987
farms82:: number of farms, 1982
largef92:: number of farms with 1,000 acres or more, 1992
largef87:: number of farms with 1,000 acres or more, 1987
largef82:: number of farms with 1,000 acres or more, 1982
smallf92:: number of farms with 9 acres or fewer, 1992
smallf87:: number of farms with 9 acres or fewer, 1987
smallf82:: number of farms with 9 acres or fewer, 1982
region:: S = south; W = west; NC = north central; NE = northeast
sizemeas:: size measure used to select the pps sample
SelectionProb:: inclusion probability for county
SamplingWeight:: sampling weight for county
Unit:: unit number for indexing joint inclusion probabilities
JtProb_1:: columns of joint inclusion probabilities
JtProb_2:: columns of joint inclusion probabilities
JtProb_3:: columns of joint inclusion probabilities
JtProb_4:: columns of joint inclusion probabilities
JtProb_5:: columns of joint inclusion probabilities
JtProb_6:: columns of joint inclusion probabilities
JtProb_7:: columns of joint inclusion probabilities
JtProb_8:: columns of joint inclusion probabilities
JtProb_9:: columns of joint inclusion probabilities
JtProb_10:: columns of joint inclusion probabilities
JtProb_11:: columns of joint inclusion probabilities
JtProb_12:: columns of joint inclusion probabilities
JtProb_13:: columns of joint inclusion probabilities
JtProb_14:: columns of joint inclusion probabilities
JtProb_15:: columns of joint inclusion probabilities
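
The relationship between SelectionProb and SamplingWeight can be illustrated with a small sketch. It assumes, as is usual for without-replacement pps samples, that the sampling weight is the reciprocal of the inclusion probability; the three rows below are hypothetical toy values, not rows of agpps.

```r
# Hypothetical toy sample (NOT values from agpps): three sampled counties
toy <- data.frame(
  acres92       = c(120000, 450000, 80000),
  SelectionProb = c(0.02, 0.10, 0.01)   # inclusion probabilities pi_i
)

# Assumed relationship: sampling weight = 1 / inclusion probability
toy$SamplingWeight <- 1 / toy$SelectionProb

# Horvitz-Thompson estimate of the population total of acres92
t_hat <- sum(toy$SamplingWeight * toy$acres92)
t_hat
```

With the real data, the same weighted sum would use the acres92 and SamplingWeight columns of agpps directly.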

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

agsrs data

Description

Data from an SRS of size 300 from the 1992 U.S. Census of Agriculture agpop data.

Usage

data(agsrs)

Format

Variables are the same as in agpop data.

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

agstrat data

Description

Data from a stratified random sample of size 300 from the 1992 U.S. Census of Agriculture agpop data.

Usage

data(agstrat)

Format

This data frame contains the following columns:

county:: county name (character variable)
state:: state abbreviation (character variable)
acres92:: number of acres devoted to farms, 1992
acres87:: number of acres devoted to farms, 1987
acres82:: number of acres devoted to farms, 1982
farms92:: number of farms, 1992
farms87:: number of farms, 1987
farms82:: number of farms, 1982
largef92:: number of farms with 1,000 acres or more, 1992
largef87:: number of farms with 1,000 acres or more, 1987
largef82:: number of farms with 1,000 acres or more, 1982
smallf92:: number of farms with 9 acres or fewer, 1992
smallf87:: number of farms with 9 acres or fewer, 1987
smallf82:: number of farms with 9 acres or fewer, 1982
region:: S = south; W = west; NC = north central; NE = northeast
rn:: random numbers used to select sample in each stratum
strwt:: sampling weight for each county in sample
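
As a rough sketch of how a stratum sampling weight such as strwt is typically used, the base R code below estimates a population total and mean from a stratified sample. The data are hypothetical toy values, and the weights are assumed to equal N_h / n_h (stratum population size over stratum sample size); neither is taken from agstrat.

```r
# Hypothetical toy stratified sample (NOT values from agstrat)
toy <- data.frame(
  region  = c("NE", "NE", "W", "W", "W"),
  acres92 = c(100, 140, 900, 1100, 1000),
  strwt   = c(10, 10, 4, 4, 4)   # assumed to be N_h / n_h
)

# Weighted estimate of the population total
t_hat <- sum(toy$strwt * toy$acres92)

# Weighted estimate of the population mean (weights sum to estimated N)
ybar_hat <- t_hat / sum(toy$strwt)

# Estimated total within each stratum
by_region <- tapply(toy$strwt * toy$acres92, toy$region, sum)

t_hat
ybar_hat
by_region
```

Standard errors require the stratum structure as well as the weights, so a survey analysis package would normally be used for anything beyond point estimates.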

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

algebra data

Description

Fictional data for an SRS of 12 algebra classes in a city, from a population of 187 classes.

Usage

data(algebra)

Format

This data frame contains the following columns:

class:: class number
Mi:: number of students $M_i$ in class
score:: score of student on test
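
The following sketch shows one common way to analyze a one-stage cluster sample laid out like this data frame (one row per student, with the class size repeated on each row). The toy data are hypothetical and use only 3 classes rather than 12; N = 187 is the number of classes in the population stated in the Description.

```r
# Hypothetical toy cluster sample (NOT values from algebra)
toy <- data.frame(
  class = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  Mi    = c(3, 3, 3, 2, 2, 4, 4, 4, 4),
  score = c(60, 70, 80, 50, 70, 90, 80, 70, 60)
)

# Cluster (class) totals and sizes
class_total <- tapply(toy$score, toy$class, sum)
class_size  <- tapply(toy$Mi, toy$class, function(m) m[1])

# Ratio estimate of the mean score per student
ybar_ratio <- sum(class_total) / sum(class_size)

# Unbiased estimate of the population total score
N <- 187                      # classes in the population (from the Description)
n <- length(class_total)      # classes in this toy sample
t_hat <- N / n * sum(class_total)

ybar_ratio
t_hat
```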

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

anthrop data

Description

Finger length and height for 3,000 criminals. This data set contains information for the entire population.

Usage

data(anthrop)

Format

This data frame contains the following columns:

finger:: length of left middle finger (cm)
height:: height (inches)

References

Macdonell, W. R. (1901). On criminal anthropometry and the identification of criminals. Biometrika 1, 177–227.

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

anthsrs data

Description

Length of left middle finger and height for an SRS of size 200 from anthrop data.

Usage

data(anthsrs)

Format

This data frame contains the following columns:

finger:: length of left middle finger (cm)
height:: height (inches)
wt:: sampling weight

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

anthuneq data

Description

Finger length and height for a with-replacement unequal-probability sample of size 200 from anthrop data. The probability of selection, $\psi_i$, was proportional to 24 for y < 65, 12 for y = 65, 2 for y = 66 or 67, and 1 for y > 67, where y is height in inches.

Usage

data(anthuneq)

Format

This data frame contains the following columns:

finger:: length of left middle finger (cm)
height:: height (inches)
wt:: sampling weight
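
The selection rule in the Description can be written as a small helper. This is a sketch only: it takes y to be height in inches, the function name rel_size and the five-person population are hypothetical, and the weight formula 1 / (n * psi_i) is the usual with-replacement convention rather than something confirmed for the wt column.

```r
# Relative size measure implied by the Description (y = height in inches)
rel_size <- function(y) {
  out <- rep(NA_real_, length(y))
  out[y < 65]            <- 24
  out[y == 65]           <- 12
  out[y %in% c(66, 67)]  <- 2
  out[y > 67]            <- 1
  out
}

# Hypothetical toy population of five heights
heights <- c(63, 65, 66, 67, 70)
rel <- rel_size(heights)

# Selection probabilities psi_i are the relative sizes scaled to sum to 1
psi <- rel / sum(rel)

# Assumed with-replacement sampling weight for a sample of n draws
n <- 2
w <- 1 / (n * psi)

rel
psi
```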

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

artifratio data

Description

Values from all possible SRSs for an artificial population in Chapter 4 of SDA.

Usage

data(artifratio)

Format

This data frame contains the following columns:

sample:: sample number
i1:: first unit in sample
i2:: second unit in sample
i3:: third unit in sample
i4:: fourth unit in sample
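xbars:: sample mean of x, $\bar{x}_s$
ybars:: sample mean of y, $\bar{y}_s$
bhat:: ratio estimate $\widehat{B} = \bar{y}_s/\bar{x}_s$
tSRS:: SRS estimate of the population total of y, $\widehat{t}_{y,srs} = N\bar{y}_s$
thatr:: ratio estimate of the population total of y, $\widehat{t}_{yr} = \widehat{B}\,t_x$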
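
A table with this layout can be built for any small population by enumerating every possible SRS. The sketch below does so in base R for a population of N = 8 and samples of size 4; the x and y values are hypothetical placeholders, and the column formulas follow the standard ratio-estimation definitions (bhat = ybars / xbars, tSRS = N * ybars, thatr = bhat * t_x).

```r
# Hypothetical population values (placeholders, not taken from the package)
x <- c(4, 5, 5, 6, 8, 7, 7, 5)
y <- c(1, 2, 4, 4, 7, 7, 7, 8)
N <- length(x)
n <- 4

# Every possible SRS of size n: one row of unit indices per sample
idx <- t(combn(N, n))

xbars <- apply(idx, 1, function(s) mean(x[s]))
ybars <- apply(idx, 1, function(s) mean(y[s]))

all_srs <- data.frame(
  sample = seq_len(nrow(idx)),
  i1 = idx[, 1], i2 = idx[, 2], i3 = idx[, 3], i4 = idx[, 4],
  xbars = xbars,
  ybars = ybars,
  bhat  = ybars / xbars,             # ratio estimate
  tSRS  = N * ybars,                 # SRS estimate of the total of y
  thatr = (ybars / xbars) * sum(x)   # ratio estimate of the total of y
)

# Averaging tSRS over all samples recovers the true total (unbiasedness)
c(mean(all_srs$tSRS), sum(y))
```

Comparing mean(all_srs$thatr) with sum(y) in the same way shows the (usually small) bias of the ratio estimator.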

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

asafellow data

Description

Information from a stratified random sample of Fellows of the American Statistical Association elected between 2000 and 2018. The list of Fellows serving as the population was downloaded from amstat on March 18, 2019. All other information was obtained from public sources.

Usage

data(asafellow)

Format

This data frame contains the following columns:

awardyr:

year of award

gender:

gender of Fellow (character variable, M = male, F = female)

popsize:

population size in stratum ( = $N_h$ )

sampsize:

sample size in stratum ( = $n_h$ )

field:

field of employment (character variable)

acad = academia

ind = industry

govt = government

degreeyr:

year in which Fellow received terminal degree (year of Ph.D. if applicable, otherwise year of Master's or Bachelor's degree)

math:

= 1 if majored in mathematics as undergraduate

= 0 if did not major in math

= NA if missing

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

auditresult data

Description

Audit data used in Chapter 6 of SDA.

Usage

data(auditresult)

Format

This data frame contains the following columns:

account:: audit unit
bookvalue:: book value of account
psi:: probability of selection
auditvalue:: audit value of account

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

auditselect data

Description

Selection of accounts for audit data used in Chapter 6 of SDA.

Usage

data(auditselect)

Format

This data frame contains the following columns:

account:: audit unit
bookval:: book value of account
cumbv:: cumulative book value
rn1:: random number 1 selecting account
rn2:: random number 2 selecting account
rn3:: random number 3 selecting account
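
The columns above suggest the cumulative-size method of drawing a with-replacement sample with probability proportional to book value. The sketch below illustrates that method on hypothetical toy accounts; the rule that a random number r selects the first account whose cumulative book value is at least r is an assumption about the convention, not something stated in this manual.

```r
# Hypothetical toy accounts (NOT values from auditselect)
toy <- data.frame(account = 1:3, bookval = c(100, 300, 600))
toy$cumbv <- cumsum(toy$bookval)

# Random numbers between 1 and the total book value; fixed here so the
# result is reproducible (in practice: sample(sum(toy$bookval), size, TRUE))
rn <- c(50, 100, 101, 400, 401, 1000)

# Assumed rule: r selects the first account with cumulative value >= r
picked <- sapply(rn, function(r) which(toy$cumbv >= r)[1])

# Implied per-draw selection probability, proportional to book value
psi <- toy$bookval / sum(toy$bookval)

toy$account[picked]
psi
```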

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

azcounties data

Description

Population and housing unit estimates for Arizona counties, excluding Maricopa and Pima counties, from the American Community Survey 2018 5-year estimates.

Usage

data(azcounties)

Format

This data frame contains the following columns:

name:: county name (character variable, length 15)
number:: county number
population:: population estimate for county
housing:: housing unit estimate for county
ownerocc:: number of owner-occupied housing units for county

References

Source: https://data.census.gov/, accessed November 27, 2020.

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

baseball data

Description

Statistics on 797 baseball players, compiled by Jenifer Boshes from the rosters of all major league teams in November 2004. Missing values (for variables pball, intwalk, hbp, sacrfly; all other variables have complete data) are coded as NA.

Usage

data(baseball)

Format

This data frame contains the following columns:

team:: team played for at the beginning of the season
leagueid:: AL or NL
player:: a unique identifier for each baseball player
salary:: player salary in 2004
pos:: primary position coded as P, C, 1B, 2B, 3B, SS, RF, LF, or CF
gplay:: games played
gstart:: games started
inning:: number of innings
putout:: number of putouts
assist:: number of assists
error:: errors
dplay:: number of double plays
pball:: number of passed balls (only applies to catchers)
gbat:: number of games in which the player appeared at bat
atbat:: number of at bats
run:: number of runs scored
hit:: number of hits
secbase:: number of doubles
thirdbase:: number of triples
homerun:: number of home runs
rbi:: number of runs batted in
stolenb:: number of stolen bases
csteal:: number of times caught stealing
walk:: number of times walked
strikeout:: number of strikeouts
intwalk:: number of times intentionally walked
hbp:: number of times hit by pitch
sacrhit:: number of sacrifice hits
sacrfly:: number of sacrifice flies
gidplay:: grounded into double play

References

Forman, S. L. (2004). Baseball-reference.com—Major league statistics and information. www.baseball-reference.com (accessed November 2004).

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

books data

Description

Data from a homeowner's survey to estimate the total number of books, used in Chapter 5.

Usage

data(books)

Format

This data frame contains the following columns:

shelf:: shelf number
Mi:: number of books on shelf
booknumber:: number of the book selected
purchase:: purchase cost of book
replace:: replacement cost of book

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

Capture-recapture confidence interval function

Description

Compute a confidence interval for a capture-recapture sample using the method of Cormack (1992).

Usage

captureci(xmat, y, alpha = 0.05)

Arguments

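`xmat`	Matrix of 0/1 indicators with one column per sample and one row per observed capture category, where 1 = in sample and 0 = not in sample. For example, if there are two samples, xmat has two columns; the row (1,0) represents the category of being in sample 1 but not in sample 2.
`y`	Vector of counts giving the number of units in each category (row) of xmat.
`alpha`	Significance level, with a default value of 0.05; the interval has confidence level 1 - alpha.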

Value

cell: estimated count for the unobserved category (0, 0), that is, units not in any sample

N: estimated population size (observed counts plus the estimated missing count)

CI_cell: confidence interval for the missing category count

CI_N: confidence interval for the population size

Examples

xmat <- cbind(c(1,1,0),c(1,0,1))
y <- c(20,180,80)
captureci(xmat, y, alpha = 0.1)
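
For the two-sample example above, the point estimates can be checked by hand. The sketch below is not the captureci implementation; it assumes the two samples are independent, in which case the missing (0, 0) cell is estimated by x10 * x01 / x11 and the population size agrees with the Lincoln-Petersen estimator. The same cell estimate is the exponentiated intercept of a Poisson loglinear model fitted to the three observed cells.

```r
# Observed counts from the example: rows (1,1), (1,0), (0,1)
x11 <- 20    # in both samples
x10 <- 180   # in sample 1 only
x01 <- 80    # in sample 2 only

# Estimated count for the unobserved (0, 0) cell under independence
cell00 <- x10 * x01 / x11            # 720

# Estimated population size: observed counts plus the missing cell
N_hat <- x11 + x10 + x01 + cell00    # 1000

# Equivalent Lincoln-Petersen form n1 * n2 / m
n1 <- x11 + x10
n2 <- x11 + x01
N_lp <- n1 * n2 / x11                # 1000

# Same cell estimate from a loglinear (Poisson) independence model
d <- data.frame(x1 = c(1, 1, 0), x2 = c(1, 0, 1), y = c(20, 180, 80))
fit <- glm(y ~ x1 + x2, family = poisson, data = d)
cell00_glm <- exp(coef(fit)[["(Intercept)"]])   # approximately 720

c(cell00 = cell00, N_hat = N_hat, N_lp = N_lp, cell00_glm = cell00_glm)
```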

census1920 data

Description

Population sizes for each state, from the 1920 U.S. census. The data set contains only the 48 states and excludes Washington D.C., Puerto Rico, and U.S. territories (these areas were not allowed to have voting representatives in Congress).

Usage

data(census1920)

Format

This data frame contains the following columns:

state:: state name
population:: state population in 1920 census

References

Source: U.S. Bureau of the Census (1921).

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

census2010 data

Description

Population sizes for each state, from the 2010 U.S. census. The data set contains only the 50 states and excludes the areas that, as of 2020, are not allowed to have voting representatives in Congress: Washington D.C., Puerto Rico, and U.S. territories.

Usage

data(census2010)

Format

This data frame contains the following columns:

state:: state name
population:: state population in 2010 census

References

Source: U.S. Census Bureau (2019).

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

cherry data

Description

Data for a sample of 31 cherry trees.

Usage

data(cherry)

Format

This data frame contains the following columns:

diameter:: diameter of tree (inches)
height:: height of tree (feet)
volume:: timber volume of tree (cubic feet)

References

Hand, D. J., F. Daly, A. D. Lunn, K. J. McConway, and E. Ostrowski (1994). A Handbook of Small Data Sets. London: Chapman and Hall.

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

classes data

Description

Population sizes for 15 classes, used in Chapter 6 of SDA to illustrate unequal-probability sampling.

Usage

data(classes)

Format

This data frame contains the following columns:

class:: class ID number
class_size:: number of students in class

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

classpps data

Description

Two-stage unequal-probability sample without replacement from the population of classes in classes data.

Usage

data(classpps)

Format

This data frame contains the following columns:

class:: class ID number
class_size:: number of students in class
finalweight:: sampling weight for student
hours:: number of hours spent studying statistics

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

classppsjp data

Description

Joint inclusion probabilities for the unequal-probability sample without replacement from the population of classes in classes data.

Usage

data(classppsjp)

Format

This data frame contains the following columns:

class:: class ID number
class_size:: number of students in class
SelectionProb:: probability of being included in sample, $\pi_i$
SamplingWeight:: sampling weight, $w_i = 1/\pi_i$
JtProb_1:: columns of joint inclusion probabilities, $\pi_{1k}$
JtProb_2:: columns of joint inclusion probabilities, $\pi_{2k}$
JtProb_3:: columns of joint inclusion probabilities, $\pi_{3k}$
JtProb_4:: columns of joint inclusion probabilities, $\pi_{4k}$
JtProb_5:: columns of joint inclusion probabilities, $\pi_{5k}$
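
Joint inclusion probabilities are what a without-replacement variance estimator needs. As a sketch, the code below evaluates the Sen-Yates-Grundy variance estimator for a Horvitz-Thompson total on a toy design where the answer is known: an SRS of n = 2 from N = 4, for which pi_i = 1/2 and pi_ik = 1/6. All values and object names are hypothetical; applying the same loop to classppsjp would also require the cluster totals, which are not part of that data frame.

```r
# Toy design: SRS of n = 2 units from N = 4 (NOT values from classppsjp)
pi_i  <- c(0.5, 0.5)                       # inclusion probabilities
pi_jt <- matrix(c(NA, 1/6,
                  1/6, NA), nrow = 2)      # joint inclusion probabilities
t_i   <- c(3, 5)                           # observed unit totals

# Horvitz-Thompson estimate of the population total
t_ht <- sum(t_i / pi_i)

# Sen-Yates-Grundy variance estimator: sum over pairs i < k in the sample
n <- length(t_i)
v_syg <- 0
for (i in 1:(n - 1)) {
  for (k in (i + 1):n) {
    v_syg <- v_syg +
      (pi_i[i] * pi_i[k] - pi_jt[i, k]) / pi_jt[i, k] *
      (t_i[i] / pi_i[i] - t_i[k] / pi_i[k])^2
  }
}

# Check against the SRS formula N^2 * (1 - n/N) * s^2 / n
N <- 4
v_srs <- N^2 * (1 - n / N) * var(t_i) / n

c(t_ht = t_ht, v_syg = v_syg, v_srs = v_srs)
```

Both variance expressions give the same value for this design, which is a useful sanity check before applying the estimator to unequal probabilities.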

References

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

college data

Description

Selected variables from the U.S. Department of Education College Scorecard Data (version updated on June 1, 2020). Some of the variables in the book data have been calculated from other variables in the original source; these have been given new variable names that are not found in the data dictionary.

Usage

data(college)

Format

This data frame contains the following columns:

unitid:

unit identification number

instnm:

institution name (character, length 81)

city:

city (character, length 24)

stabbr:

state abbreviation (character, length 2)

highdeg:

highest degree awarded

3 = Bachelor's degree

4 = Graduate degree

control:

control (ownership) of institution

1 = public

2 = private nonprofit

region:

region where institution is located

1 New England (CT, ME, MA, NH, RI, VT)

2 Mid East (DE, DC, MD, NJ, NY, PA)

3 Great Lakes (IL, IN, MI, OH, WI)

4 Plains (IA, KS, MN, MO, NE, ND, SD)

5 Southeast (AL, AR, FL, GA, KY, LA, MS, NC, SC, TN, VA, WV)

6 Southwest (AZ, NM, OK, TX)

7 Rocky Mountains (CO, ID, MT, UT, WY)

8 Far West (AK, CA, HI, NV, OR, WA)

locale:

locale of institution

11 City: Large (population of 250,000 or more)

12 City: Midsize (population of at least 100,000 but less than 250,000)

13 City: Small (population less than 100,000)

21 Suburb: Large (outside principal city, in urbanized area with population of 250,000 or more)

22 Suburb: Midsize (outside principal city, in urbanized area with population of at least 100,000 but less than 250,000)

23 Suburb: Small (outside principal city, in urbanized area with population less than 100,000)

31 Town: Fringe (in urban cluster up to 10 miles from an urbanized area)

32 Town: Distant (in urban cluster more than 10 miles and up to 35 miles from an urbanized area)

33 Town: Remote (in urban cluster more than 35 miles from an urbanized area)

41 Rural: Fringe (rural territory up to 5 miles from an urbanized area or up to 2.5 miles from an urban cluster)

42 Rural: Distant (rural territory more than 5 miles but up to 25 miles from an urbanized area or more than 2.5 and up to 10 miles from an urban cluster)

43 Rural: Remote (rural territory more than 25 miles from an urbanized area and more than 10 miles from an urban cluster)

ccbasic:

Carnegie basic classification

15 Doctoral Universities: Very High Research Activity

16 Doctoral Universities: High Research Activity

17 Doctoral/Professional Universities

18 Master's Colleges & Universities: Larger Programs

19 Master's Colleges & Universities: Medium Programs

20 Master's Colleges & Universities: Small Programs

21 Baccalaureate Colleges: Arts & Sciences Focus

22 Baccalaureate Colleges: Diverse Fields

ccsizset:

Carnegie classification, size and setting

6 Four-year, very small, primarily nonresidential

7 Four-year, very small, primarily residential

8 Four-year, very small, highly residential

9 Four-year, small, primarily nonresidential

10 Four-year, small, primarily residential

11 Four-year, small, highly residential

12 Four-year, medium, primarily nonresidential

13 Four-year, medium, primarily residential

14 Four-year, medium, highly residential

15 Four-year, large, primarily nonresidential

16 Four-year, large, primarily residential

17 Four-year, large, highly residential

hbcu:

historically black college or university

1 = yes, 0 = no

openadmp:

does the college have an open admissions policy, that is, does it accept any students who apply or have minimal requirements for admission?

1 = yes, 0 = no

adm_rate:

fall admissions rate, defined as the number of admitted undergraduates divided by the number of undergraduates who applied

sat_avg:

average SAT score (or equivalent) for admitted students

ugds:

number of degree-seeking undergraduate students enrolled in the fall term

ugds_men:

proportion of ugds who are men

ugds_women:

proportion of ugds who are women

ugds_white:

proportion of ugds who are white (based on self-reports)

ugds_black:

proportion of ugds who are black/African American (based on self-reports)

ugds_hisp:

proportion of ugds who are Hispanic (based on self-reports)

ugds_asian:

proportion of ugds who are Asian (based on self-reports)

ugds_other:

proportion of ugds who have other race/ethnicity (created from other categories on original data file; race/ethnicity proportions sum to 1)

npt4:

average net price of attendance, derived from the full cost of attendance (including tuition and fees, books and supplies, and living expenses) minus federal, state, and institutional grant and scholarship aid, for full-time, first-time undergraduate students receiving Title IV aid. npt4 was created from the Scorecard variables NPT4_PUB (public institutions) and NPT4_PRIV (private institutions).

tuitionfee_in:

in-state tuition and fees

tuitionfee_out:

out-of-state tuition and fees

avgfacsal:

average faculty salary per month

pftfac:

proportion of faculty that is full-time

c150_4:

proportion of first-year, full-time students who complete their degree within 150% of the expected time to complete; for most institutions, this is the proportion of students who receive a degree within 6 years

grads:

number of graduate students
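
Several variables above are stored as numeric codes. A minimal sketch of attaching the documented labels with factor() is shown below; the four-row data frame is hypothetical, and the new column names control_f and highdeg_f are illustrative only.

```r
# Hypothetical toy rows (NOT values from college)
toy <- data.frame(control = c(1, 2, 2, 1), highdeg = c(3, 4, 4, 4))

# Attach the labels documented for control and highdeg
toy$control_f <- factor(toy$control, levels = c(1, 2),
                        labels = c("public", "private nonprofit"))
toy$highdeg_f <- factor(toy$highdeg, levels = c(3, 4),
                        labels = c("Bachelor's degree", "Graduate degree"))

# Cross-tabulate the labelled variables
table(toy$control_f, toy$highdeg_f)
```

The same pattern applies to region, locale, ccbasic, and ccsizset, using the code lists given above.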

Details

This data set is made available for pedagogical purposes only. Anyone wishing to draw conclusions from College Scorecard data should obtain the full data set from the Department of Education. The original data set has 1,925 variables and includes institutions (such as those that do not grant undergraduate degrees) that are not in college data.

The college data includes institutions in the original data set that: (1) are located in the 50 states plus the District of Columbia, (2) contain information on average net price (NPT4), (3) are predominantly Bachelor's degree-granting, (4) were currently operating as of June 2020, (5) are not private for-profit institutions or "global" campuses, (6) have Carnegie size classification (variable ccsizset) between 6 and 17 and Carnegie basic classification (variable ccbasic) between 14 and 22 (these offer Bachelor's degrees), (7) enroll first-time students, and (8) are not U.S. Service Academies.

For all variables, missing data are coded as NA.

References

U.S. Department of Education (2020). College scorecard data. https://collegescorecard.ed.gov/data/ (accessed August 25, 2020).

Lohr (2021), Sampling: Design and Analysis, 3rd Edition. Boca Raton, FL: CRC Press.

Lu and Lohr (2021), R Companion for Sampling: Design and Analysis, 3rd Edition, 1st Edition. Boca Raton, FL: CRC Press.

collegerg data

Description

Five replicate SRSs from the set of public colleges and universities (having control = 1) in college data. Columns 1-29 are as in college data, with additional columns 30-32 listed below. Note that the selection probabilities and sampling weights are for the separate replicate samples, so that the weights for each replicate sample sum to the population size 500.

Usage

data(collegerg)