Package 'DyadRatios'

Title: Dyad Ratios Algorithm
Description: Estimates the Dyad Ratios Algorithm for pooling and smoothing poll estimates. The Dyad Ratios Algorithm smooths both forward and backward in time over polling results allowing differences in both question type and polling house. The result is an estimate of a single latent variable that describes the systematic trend over time in the (noisy) polling results. See James A. Stimson (2018) <doi:10.1177/0759106318761614> and the package's vignette for more details.
Authors: Dave Armstrong [cre], James Stimson [aut]
Maintainer: Dave Armstrong <[email protected]>
License: GPL (>= 2)
Version: 1.4
Built: 2026-05-08 09:08:50 UTC
Source: https://github.com/cran/DyadRatios

Dyad Ratios Algorithm

Description

Estimates the Dyad Ratios Algorithm for constructing latent time series from survey-research marginals.

Arguments

varname

String giving the name of the input series to be smoothed. This should identify similar or comparable values in the series. Values in the series that have the same varname will be assumed to come from the same source.

date

ISO numeric representation of the date the survey was in the field (usually the start, end, or median date).

index

Numeric value of the series. It might be a percent or proportion responding in a single category (e.g., the approve response in presidential approval) or some multi-response summary. For ease of interpretation, polarity should be the same for all items.

ncases

Number of cases (e.g., sample size) of the survey. This provides differential weighting for the values. Setting this to NULL or leaving it blank will weight each value equally.

unit

Aggregation period—one of ‘D’ (daily), ‘M’ (monthly), ‘Q’ (quarterly), ‘A’ (annual), or ‘O’ (multi-year aggregation).

mult

Number of years, only used if unit is ‘O’.

begindt

Beginning date of the analysis. Defaults to earliest date in the dataset. Should be specified with lubridate::ymd().

enddt

Ending date for the analysis. Defaults to the latest date in the data.

npass

Not yet implemented.

smoothing

Logical. Specifies whether exponential smoothing is applied to the intermediate estimates during the iterative solution process. Defaults to TRUE.

endmonth

Ending month of the analysis.

R

Number of bootstrap samples.

parallel

Logical indicating whether the 'mclapply' function should be used.

level

The confidence level for the intervals. Default is 0.95.

pw

Logical indicating whether to do pairwise tests.

...

Other arguments to be passed down to 'mclapply'.

Value

A list with up to two data frames. The first, 'ci', has variables:

  • period: Aggregation period.

  • latent1: Estimate of latent variable from original analysis.

  • lwr: Lower confidence bound.

  • upr: Upper confidence bound.

If 'pw = TRUE', the list also contains 'pw' with variables:

  • p1: Earlier period

  • p2: Later period

  • diff: (mood for p2) - (mood for p1)

  • p_diff: Probability that the larger mood is bigger than the smaller mood.

References

Stimson, J. A. (2018). ‘The Dyad Ratios Algorithm for Estimating Latent Public Opinion: Estimation, Testing, and Comparison to Other Approaches’, Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 137–138(1), 201–218. doi:10.1177/0759106318761614

Examples

data(jennings)
# R should be higher for real-world applications
## Not run: 
boot_out <- boot_dr(varname = jennings$variable, 
                  date = jennings$date, 
                  index = jennings$value, 
                  ncases = jennings$n, 
                  begindt = as.Date("1985-01-01"), 
                  enddt = max(jennings$date), 
                  npass=1, R=1000, 
                  parallel=FALSE)
boot_out

## End(Not run)
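
The shape of the 'ci' and 'pw' data frames described under Value can be illustrated with a minimal base-R sketch. It assumes percentile-type bootstrap intervals and uses a simulated matrix of draws ('draws', a hypothetical stand-in) rather than real bootstrap output. The documentation above does not state how the intervals or 'p_diff' are computed, so this shows one plausible reading, not the package's method.

```r
set.seed(42)

# Hypothetical stand-in for bootstrap output: R draws (rows) of the latent
# series for 3 periods (columns). Real draws would come from re-estimating
# the model on resampled data.
R <- 200
periods <- c(1985, 1986, 1987)
latent1 <- c(50, 52, 55)            # made-up point estimates
draws <- sapply(latent1, function(m) rnorm(R, mean = m, sd = 1))

# Percentile interval at the requested confidence level (an assumption).
level <- 0.95
alpha <- (1 - level) / 2
ci <- data.frame(
  period  = periods,
  latent1 = latent1,
  lwr     = apply(draws, 2, quantile, probs = alpha),
  upr     = apply(draws, 2, quantile, probs = 1 - alpha)
)

# Pairwise comparisons: every earlier period (p1) against every later one (p2).
pairs <- t(combn(seq_along(periods), 2))
pw <- data.frame(
  p1   = periods[pairs[, 1]],
  p2   = periods[pairs[, 2]],
  diff = latent1[pairs[, 2]] - latent1[pairs[, 1]]
)

# Share of draws in which the difference has the same sign as the estimate
# (an assumed definition of p_diff, for illustration only).
pw$p_diff <- vapply(seq_len(nrow(pairs)), function(i) {
  d <- draws[, pairs[i, 2]] - draws[, pairs[i, 1]]
  mean(sign(d) == sign(pw$diff[i]))
}, numeric(1))

ci
pw
```

With more bootstrap samples the interval endpoints become more stable, which is why a larger R is advisable in applied work.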

Dyad Ratios Algorithm

Description

Estimates the Dyad Ratios Algorithm for constructing latent time series from survey-research marginals.

Arguments

varname

String giving the name of the input series to be smoothed. This should identify similar or comparable values in the series. Values in the series that have the same varname will be assumed to come from the same source.

date

ISO numeric representation of the date the survey was in the field (usually the start, end, or median date).

index

Numeric value of the series. It might be a percent or proportion responding in a single category (e.g., the approve response in presidential approval) or some multi-response summary. For ease of interpretation, polarity should be the same for all items.

ncases

Number of cases (e.g., sample size) of the survey. This provides differential weighting for the values. Setting this to NULL or leaving it blank will weight each value equally.

unit

Aggregation period—one of ‘D’ (daily), ‘M’ (monthly), ‘Q’ (quarterly), ‘A’ (annual), or ‘O’ (multi-year aggregation).

mult

Number of years, only used if unit is ‘O’.

begindt

Beginning date of the analysis. Defaults to earliest date in the dataset. Should be specified with lubridate::ymd().

enddt

Ending date for the analysis. Defaults to the latest date in the data.

npass

Not yet implemented.

smoothing

Logical. Specifies whether exponential smoothing is applied to the intermediate estimates during the iterative solution process. Defaults to TRUE.

endmonth

Ending month of the analysis.
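
To make the 'unit' argument concrete, the sketch below labels dates with their monthly, quarterly, or annual aggregation period using base R only. The helper 'period_label()' is hypothetical and not part of the package; how 'extract()' represents periods internally is not documented here, so the label format is an assumption.

```r
# Hypothetical helper (not part of the package): label each date with its
# aggregation period for the annual, quarterly, and monthly units.
period_label <- function(d, unit = c("A", "Q", "M")) {
  unit <- match.arg(unit)
  yr <- as.integer(format(d, "%Y"))
  mo <- as.integer(format(d, "%m"))
  switch(unit,
    A = as.character(yr),
    Q = paste0(yr, "-Q", (mo - 1) %/% 3 + 1),
    M = sprintf("%d-%02d", yr, mo)
  )
}

d <- as.Date(c("1990-02-14", "1990-05-01", "1990-11-30"))
period_label(d, "A")  # "1990" "1990" "1990"
period_label(d, "Q")  # "1990-Q1" "1990-Q2" "1990-Q4"
period_label(d, "M")  # "1990-02" "1990-05" "1990-11"
```

Coarser units pool more surveys into each period, which trades temporal detail for less noisy period values.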

Details

In previous versions of the algorithm, especially in its original incarnation as 'Wcalc.exe', getting the dates right was critical: they had to be in precisely the right format, and the data had to be ordered by date. In this version of the 'extract()' function, any R date class is fine, whether 'Date' or 'POSIX*'. The function uses 'is.Date' from the 'lubridate' package to check whether the input is of class 'Date' and, if it is not, 'as.Date()' from R's 'base' package to coerce it. The series are also ordered by date inside the function as a matter of course, so the input data need not be in ascending order of date.
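
The date handling described above can be sketched in base R as follows. This is a minimal illustration rather than the package's code: 'inherits()' stands in for 'lubridate::is.Date()', and the input vectors are made up.

```r
# Hypothetical input: POSIXct timestamps in arbitrary order.
dates  <- as.POSIXct(c("2001-06-15", "1999-01-10", "2000-03-01"), tz = "UTC")
values <- c(48, 55, 51)

# Coerce to Date only when needed; base R's inherits() stands in here for
# lubridate::is.Date().
if (!inherits(dates, "Date")) {
  dates <- as.Date(dates)
}

# Sort both vectors by date so later steps can assume ascending order.
ord    <- order(dates)
dates  <- dates[ord]
values <- values[ord]

data.frame(date = dates, value = values)
```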

Value

A list with components:

  • call: The initial call to 'extract()'.

  • T: Number of aggregation periods.

  • nvar: Number of series used in the analysis.

  • unit: The aggregation period.

  • dimensions: Number of dimensions estimated (1 or 2).

  • period: List of aggregation periods.

  • varname: List in order of the variables used in the analysis.

  • N: Number of non-missing observations for each series.

  • wtdmean: The weighted mean of the input variables.

  • wtstd: The weighted standard deviation of the input series.

  • means: Item descriptive information.

  • std.deviations: Item descriptive information.

  • setup1: Basic information about the options and iterative solution for the first dimension.

  • setup2: Basic information about the options and iterative solution for the second dimension, if applicable.

  • loadings1: Item–scale correlations from the first dimension of the final solution. Their square is the validity estimate used in weighting.

  • loadings2: Item–scale correlations from the second dimension of the final solution, if applicable.

  • latent1: Estimated time series for first dimension.

  • latent2: Estimated time series for second dimension, if applicable.

  • hist: Data frame with iteration history, including convergence statistics.

  • totalvar: Total variance in the data that could be explained by the estimated dimensions.

  • var_exp: Data frame with variance explained and proportion of variance explained for each dimension.

  • dropped: Data frame listing any series that were dropped from the analysis due to insufficient data.

  • smoothing: Logical indicating whether smoothing was applied during the iterative solution process.

  • ms_list: List of means and variances for standardizing the input series for the bootstrap.
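
As a rough illustration of weighted descriptive statistics like those reported in 'wtdmean' and 'wtstd', the sketch below computes a weighted mean and standard deviation for a single made-up series. The helper 'weighted_stats()' is hypothetical, and the weights used by the package are not specified here; sample sizes are used purely as an example, with equal weights as the fallback when none are supplied.

```r
# Hypothetical helper (not part of the package): weighted mean and standard
# deviation of one series. w = NULL falls back to equal weights.
weighted_stats <- function(x, w = NULL) {
  if (is.null(w)) w <- rep(1, length(x))
  m <- sum(w * x) / sum(w)
  s <- sqrt(sum(w * (x - m)^2) / sum(w))
  c(mean = m, sd = s)
}

x <- c(40, 50, 60)
weighted_stats(x)                            # equal weights
weighted_stats(x, w = c(1000, 1000, 2000))   # larger survey counts more
```

In the second call the third value comes from the largest survey, so the mean shifts from 50 to 52.5.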

References

Stimson, J. A. (2018). ‘The Dyad Ratios Algorithm for Estimating Latent Public Opinion: Estimation, Testing, and Comparison to Other Approaches’, Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 137–138(1), 201–218. doi:10.1177/0759106318761614

Examples

data(jennings)
dr_out <- extract(varname = jennings$variable, 
                  date = jennings$date, 
                  index = jennings$value, 
                  ncases = jennings$n, 
                  begindt = min(jennings$date), 
                  enddt = max(jennings$date), 
                  npass=1)
summary(dr_out)

Get Mood Estimates from Dyad Ratios Algorithm

Description

Retrieve the latent variable estimates from the dyad ratios algorithm produced by the 'extract()' function.

Arguments

out

An object of class "extract", typically a result of the 'extract()' function.

...

Other arguments to be passed down (currently unused).

Value

A data frame containing the period of aggregation and the corresponding latent variable estimates for each dimension.
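
The returned data frame can be pictured with the following sketch, which assembles period and latent estimates from a mock list. The list 'fit' is a hypothetical stand-in that borrows the component names 'period', 'latent1', and 'latent2' from the 'extract()' Value section; the column names of the real return value are an assumption.

```r
# Hypothetical stand-in for a fitted object, using component names from the
# extract() Value section; the numbers are made up.
fit <- list(
  period  = c(2000, 2001, 2002),
  latent1 = c(48.2, 50.1, 53.7),
  latent2 = NULL            # absent when only one dimension is estimated
)

# One row per aggregation period, one column per estimated dimension.
mood <- data.frame(period = fit$period, latent1 = fit$latent1)
if (!is.null(fit$latent2)) {
  mood$latent2 <- fit$latent2
}
mood
```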


Jennings Government Trust Data

Description

A dataset of survey marginals from the British Social Attitudes (BSA) survey, measuring public trust in government. These marginals are commonly used as input to the Dyad Ratios Algorithm for constructing latent time series. We replaced missing sample sizes with a value of 850, which is roughly the minimum sample size observed in the data.

Format

A data frame with 4 variables (the number of rows is given by 'nrow(jennings)'):

variable

Character string identifying the survey question or series.

date

Date the survey was fielded.

value

Percentage of people indicating distrust in the government.

n

Sample size for the survey wave.

Source

Jennings, W., Clarke, N., Moss, J. and Stoker, G. (2017). ‘The Decline in Diffuse Support for National Politics: The Long View on Political Discontent in Britain’, Public Opinion Quarterly, 81(3), 748–758. doi:10.1093/poq/nfx020

Examples

data(jennings)
head(jennings)
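
The sample-size replacement mentioned in the Description can be sketched on a small made-up data frame with the same four columns. The rows below are hypothetical and are not taken from the 'jennings' data, where the replacement has already been applied.

```r
# Hypothetical rows shaped like the jennings data; values are made up.
polls <- data.frame(
  variable = c("q1", "q1", "q2"),
  date     = as.Date(c("1990-01-01", "1991-01-01", "1990-06-01")),
  value    = c(45, 47, 60),
  n        = c(1200, NA, 900)
)

# Replace missing sample sizes with a fixed fallback of 850.
polls$n[is.na(polls$n)] <- 850
polls
```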

Plot Method for the extract Function

Description

This function generates a line plot of the latent variable estimates obtained from the 'extract()' function. It can handle both one-dimensional and two-dimensional latent variable estimates.

Arguments

x

An object of class "extract", typically a result of the 'extract()' function.

...

Additional graphical parameters to be passed to the plot function.

Value

A ggplot of the latent variable estimate(s) over time.

Examples

data(jennings)
dr_out <- extract(varname = jennings$variable, 
                  date = jennings$date, 
                  index = jennings$value, 
                  ncases = jennings$n, 
                  begindt = min(jennings$date), 
                  enddt = max(jennings$date), 
                  npass=1)
plot(dr_out)

Print method for extract function

Description

Print method for extract function

Arguments

x

An object of class "extract", typically a result of the extract function.

...

Other arguments to be passed down.

Value

A printed summary of the output from the extract function; the input object is returned invisibly.

Examples

data(jennings)
dr_out <- extract(varname = jennings$variable, 
                  date = jennings$date, 
                  index = jennings$value, 
                  ncases = jennings$n, 
                  begindt = min(jennings$date), 
                  enddt = max(jennings$date), 
                  npass=1)
print(dr_out)

Summary Method for extract Objects

Description

Prints a summary of objects estimated with the extract function.

Arguments

object

An object of class "extract", typically a result of the extract function.

...

Other arguments to be passed down (currently unused).

Value

A data frame with variables for the question name, number of cases, loading as well as mean and standard deviation of the series.

Examples

data(jennings)
dr_out <- extract(varname = jennings$variable, 
                  date = jennings$date, 
                  index = jennings$value, 
                  ncases = jennings$n, 
                  begindt = min(jennings$date), 
                  enddt = max(jennings$date), 
                  npass=1)
summary(dr_out)