| Title: | Estimation of Pleiotropic Heritability from Genome-Wide Association Studies (GWAS) Summary Statistics |
|---|---|
| Description: | Provides tools to compute unbiased pleiotropic heritability estimates of complex diseases from genome-wide association studies (GWAS) summary statistics. We estimate pleiotropic heritability from GWAS summary statistics by estimating the proportion of variance explained from an estimated genetic correlation matrix (Bulik-Sullivan et al. 2015 <doi:10.1038/ng.3406>) and employing a Monte-Carlo bias correction procedure to account for sampling noise in genetic correlation estimates. |
| Authors: | Yujie Zhao [aut, cre] |
| Maintainer: | Yujie Zhao <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.2 |
| Built: | 2026-05-08 08:08:39 UTC |
| Source: | https://github.com/cran/pleioh2g |
This function computes a vector of pleioh2g for all diseases before correction in one go.
Cal_cor_pleiotropic_h2(rg_mat, h2g_T)
rg_mat |
genetic correlation matrix. |
h2g_T |
heritability vector for all diseases. |
pleioh2g vector
data(Results_full_rg)
data(h2_vector)
Cal_cor_pleiotropic_h2(Results_full_rg, h2_vector)
This function computes pleioh2g for the target disease after correction.
Cal_cor_pleiotropic_h2_corrected_single(rg_mat, h2g_T_single, corrected_weight_updated, plei_h2_idx)
rg_mat |
genetic correlation matrix. |
h2g_T_single |
heritability of the target disease. |
corrected_weight_updated |
the ratio for correction |
plei_h2_idx |
index of the target disease in the rg_mat. |
pleioh2g value for the target disease after correction
data(Results_full_rg)
data(h2_vector)
plei_h2_idx <- 1
h2g_T_single <- h2_vector[plei_h2_idx]
corrected_weight_updated <- 0.78
Cal_cor_pleiotropic_h2_corrected_single(Results_full_rg, h2g_T_single, corrected_weight_updated, plei_h2_idx)
This function computes pleioh2g for the target disease before correction.
Cal_cor_pleiotropic_h2_single(rg_mat, h2g_T_single, plei_h2_idx)
rg_mat |
genetic correlation matrix. |
h2g_T_single |
heritability of the target disease. |
plei_h2_idx |
index of the target disease in the rg_mat. |
pleioh2g value for the target disease before correction
data(Results_full_rg)
data(h2_vector)
plei_h2_idx <- 1
h2g_T_single <- h2_vector[plei_h2_idx]
Cal_cor_pleiotropic_h2_single(Results_full_rg, h2g_T_single, plei_h2_idx)
This function computes the inverse element for the target disease used in the bias correction procedure.
Cal_cor_test_single(rg_mat, plei_h2_idx)
rg_mat |
genetic correlation matrix. |
plei_h2_idx |
index of the target disease in the rg_mat. |
inverse element value for the target disease used for bias correction
data(Results_full_rg)
plei_h2_idx <- 1
Cal_cor_test_single(Results_full_rg, plei_h2_idx)
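The inverse element connects to a standard identity for correlation matrices: the proportion of variance of trait i explained by the remaining traits equals 1 - 1/(R^{-1})_{ii}. The base R sketch below checks that identity on a toy matrix. The helper `r2_from_inverse`, the toy values, and the final scaling by a target heritability are assumptions for illustration; the package's internal computation may differ.

```r
# Toy 3-trait genetic correlation matrix (illustrative values, not package data).
rg_toy <- matrix(c(1.0, 0.5, 0.3,
                   0.5, 1.0, 0.2,
                   0.3, 0.2, 1.0), nrow = 3, byrow = TRUE)

# Hypothetical helper: variance of trait `idx` explained by the other traits,
# computed from the idx-th diagonal element of the inverse correlation matrix.
r2_from_inverse <- function(rg_mat, idx) {
  inv_elem <- solve(rg_mat)[idx, idx]
  1 - 1 / inv_elem
}

# Cross-check against the direct formula r' R_aux^{-1} r.
idx <- 1
r_vec <- rg_toy[-idx, idx]
r_aux <- rg_toy[-idx, -idx]
r2_direct <- drop(t(r_vec) %*% solve(r_aux) %*% r_vec)

r2_inv <- r2_from_inverse(rg_toy, idx)
all.equal(r2_inv, r2_direct)

# Assumption: scaling by the target heritability gives a heritability-scale value.
h2_target <- 0.2  # illustrative value
h2_target * r2_inv
```

Both routes give the same number because the diagonal of the inverse is the reciprocal of a Schur complement, which is why a single inverse element is enough to summarise the contribution of all auxiliary traits.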
This function is used to compute rg + h2g using LDSC.
Cal_rg_h2g_alltraits(phenotype, munged_sumstats, ld_path, wld_path, sample_prev = NULL, population_prev = NULL)
phenotype |
Vector of the phenotype name |
munged_sumstats |
All LDSC-munged GWAS .stat.gz |
ld_path |
Path to directory containing ld score files. |
wld_path |
Path to directory containing weight files. |
sample_prev |
Vector of sample prevalence, in the same order of input GWAS summary statistics. |
population_prev |
Vector of population prevalence, in the same order of input GWAS summary statistics. |
A named list containing LDSC-based heritability and genetic correlation estimates across all input phenotypes. The list includes the following elements:
h2: Matrix of SNP-heritability estimates on the observed scale
(rows = 1, columns = input phenotypes).
h2Z: Matrix of corresponding heritability Z-scores.
liah2: Matrix of heritability estimates on the liability scale.
rg: Symmetric matrix of pairwise genetic correlations between traits.
rgz: Matrix of Z-scores for the genetic correlation estimates.
gcov: Symmetric matrix of genetic covariances between traits.
Each element corresponds to one LDSC-derived summary statistic, with trait names used as both row and column names.
This function performs genomic-block jackknife and computes rg + h2g.
Cal_rg_h2g_jk_alltraits(n_block = 200, hmp3, phenotype, munged_sumstats, ld_path, wld_path, sample_prev = NULL, population_prev = NULL)
n_block |
number of jackknife blocks. |
hmp3 |
Directory for hapmap 3 snplist. |
phenotype |
Vector of the phenotype name |
munged_sumstats |
All LDSC-munged GWAS .stat.gz |
ld_path |
Path to directory containing ld score files. |
wld_path |
Path to directory containing weight files. |
sample_prev |
Vector of sample prevalence, in the same order of input GWAS summary statistics. |
population_prev |
Vector of population prevalence, in the same order of input GWAS summary statistics. |
A named list containing block jackknife estimates of SNP-heritability and genetic correlation across all input phenotypes. The list includes the following elements:
h2array: A matrix of per-block SNP-heritability estimates on the
observed scale. Rows correspond to jackknife blocks, and columns correspond
to input phenotypes.
liah2array: A matrix of per-block SNP-heritability estimates on the
liability scale, with the same row and column structure as h2array.
rgarray: A three-dimensional array of pairwise genetic correlation
estimates. The first two dimensions represent phenotype pairs
(rows and columns), and the third dimension indexes the jackknife blocks.
gcovarray: A three-dimensional array of pairwise genetic covariance
estimates, aligned in structure with rgarray.
Each element provides per-block estimates that can be used to compute standard errors or confidence intervals via the block jackknife method.
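As a rough illustration of how per-block estimates turn into standard errors, the base R sketch below applies the delete-one block jackknife formula to a simulated stand-in for h2array. It assumes each row is the estimate obtained with one block left out; `jackknife_se` and the toy data are hypothetical and not part of the package.

```r
# Hypothetical helper: delete-one block jackknife standard error.
# `est_by_block` has one row per jackknife block and one column per phenotype;
# each row is assumed to be the estimate with that block left out.
jackknife_se <- function(est_by_block) {
  est_by_block <- as.matrix(est_by_block)
  n_block <- nrow(est_by_block)
  centered <- sweep(est_by_block, 2, colMeans(est_by_block))
  sqrt((n_block - 1) / n_block * colSums(centered^2))
}

# Simulated stand-in for h2array (200 blocks x 3 phenotypes).
set.seed(1)
h2array_toy <- matrix(rnorm(200 * 3, mean = 0.3, sd = 0.002), nrow = 200,
                      dimnames = list(NULL, c("T1", "T2", "T3")))
se <- jackknife_se(h2array_toy)

# Approximate 95% confidence interval around the mean of the block estimates.
est <- colMeans(h2array_toy)
cbind(estimate = est, lower = est - 1.96 * se, upper = est + 1.96 * se)
```

The same helper could be applied to a slice of rgarray, for example the vector of block estimates for one phenotype pair, after reshaping it to a one-column matrix.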
This function generates samples for the target disease based on the sampling covariance matrix and the rg matrix.
generate_proposal_sample_changea_cor(Results_full_rg, Results_full_rg_array, plei_h2_idx, ratio_a)
Results_full_rg |
genetic correlation matrix. |
Results_full_rg_array |
genetic correlation jackknife-block array. |
plei_h2_idx |
index of the target disease in the rg_mat. |
ratio_a |
corrected ratio. |
noisy_inversed_element for bias correction
data(Results_full_rg)
data(Results_full_rg_array)
Results_full_rg <- Results_full_rg[1:15, 1:15]
Results_full_rg_array <- Results_full_rg_array[1:15, 1:15, ]
plei_h2_idx <- 1
ratio_a <- 0.75
generate_proposal_sample_changea_cor(Results_full_rg, Results_full_rg_array, plei_h2_idx, ratio_a)
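One generic way to draw a noisy genetic correlation matrix is to estimate the sampling covariance of the off-diagonal entries from a jackknife array and then sample from a multivariate normal centered on the point estimate. The base R sketch below does this on simulated data. It is not the package's implementation: the role of `ratio_a` is not modelled, the array is assumed to hold delete-one estimates, and `draw_noisy_rg` and all toy values are hypothetical.

```r
set.seed(42)
n_trait <- 3
n_block <- 100

# Toy point estimate and simulated delete-one jackknife array (assumptions).
rg_hat <- matrix(c(1.0, 0.4, 0.2,
                   0.4, 1.0, 0.3,
                   0.2, 0.3, 1.0), nrow = n_trait, byrow = TRUE)
upper <- upper.tri(rg_hat)
rg_array <- array(NA_real_, dim = c(n_trait, n_trait, n_block))
for (b in seq_len(n_block)) {
  m <- rg_hat
  m[upper] <- rg_hat[upper] + rnorm(sum(upper), sd = 0.005)
  m[lower.tri(m)] <- t(m)[lower.tri(m)]   # keep the matrix symmetric
  rg_array[, , b] <- m
}

# Jackknife sampling covariance of the upper-triangular entries.
blocks <- t(apply(rg_array, 3, function(m) m[upper]))   # n_block x n_pairs
centered <- sweep(blocks, 2, colMeans(blocks))
samp_cov <- (n_block - 1) / n_block * crossprod(centered)

# Draw one noisy matrix: point estimate plus multivariate normal noise.
draw_noisy_rg <- function(rg_hat, samp_cov) {
  upper <- upper.tri(rg_hat)
  # chol() returns U with U'U = samp_cov, so U'z has covariance samp_cov.
  noise <- drop(crossprod(chol(samp_cov), rnorm(ncol(samp_cov))))
  out <- rg_hat
  out[upper] <- rg_hat[upper] + noise
  out[lower.tri(out)] <- t(out)[lower.tri(out)]
  out
}

rg_noisy <- draw_noisy_rg(rg_hat, samp_cov)
rg_noisy
```

Repeating the draw many times and recomputing the inverse element each time gives a Monte Carlo picture of how sampling noise in rg propagates into the quantity being corrected.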
'h2_liability()' converts heritability estimates from the observed to liability scale.
h2_liability(h2, sample_prev, population_prev)
h2 |
(numeric) Estimate of observed-scale heritability |
sample_prev |
(numeric) Proportion of cases in the current sample |
population_prev |
(numeric) Population prevalence of trait |
(numeric) Liability-scale heritability
h2_liability(0.28, 0.1, 0.05)
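For readers who want to see the arithmetic, the sketch below writes out the commonly used observed-to-liability conversion, h2_liab = h2_obs * K^2 (1 - K)^2 / (P (1 - P) z^2), where K is the population prevalence, P the sample prevalence, and z the standard normal density at the liability threshold. The function `h2_obs_to_liab` is a hypothetical re-implementation in base R, written under the assumption that 'h2_liability()' follows this standard formula; it is not copied from the package source.

```r
# Hypothetical re-implementation of the standard observed-to-liability
# conversion; not copied from the package source.
h2_obs_to_liab <- function(h2, sample_prev, population_prev) {
  K <- population_prev                 # population prevalence
  P <- sample_prev                     # proportion of cases in the sample
  z <- dnorm(qnorm(1 - K))             # normal density at the liability threshold
  h2 * K^2 * (1 - K)^2 / (P * (1 - P) * z^2)
}

# Same inputs as the example above.
h2_liab <- h2_obs_to_liab(0.28, sample_prev = 0.1, population_prev = 0.05)
h2_liab
```

The correction factor grows as the trait becomes rarer, so the liability-scale value for a low-prevalence disease is typically larger than the observed-scale estimate.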
Example h2 vector used in the vignette and examples.
h2_vector
A numeric matrix.
Internal simulation
Example h2 jk matrix used in the vignette and examples.
h2_vector_mat
A numeric matrix.
Internal simulation
'ldsc_h2()' uses LD score regression to estimate the heritability of a trait from GWAS summary statistics and reference LD information.
ldsc_h2(munged_sumstats, sample_prev = NA, population_prev = NA, ld, wld, n_blocks = 200, chisq_max = NA, chr_filter = seq(1, 22, 1))
munged_sumstats |
Either a dataframe, or a path to a file containing munged summary statistics. Must contain at least columns named 'SNP' (rsid), 'A1' (effect allele), 'A2' (non-effect allele), 'N' (total sample size) and 'Z' (Z-score) |
sample_prev |
(numeric) For binary traits, this should be the prevalence of cases in the current sample, used for conversion from observed heritability to liability-scale heritability. The default is 'NA', which is appropriate for quantitative traits or estimating heritability on the observed scale. |
population_prev |
(numeric) For binary traits, this should be the population prevalence of the trait, used for conversion from observed heritability to liability-scale heritability. The default is 'NA', which is appropriate for quantitative traits or estimating heritability on the observed scale. |
ld |
(character) Path to directory containing ld score files, ending in '*.l2.ldscore.gz'. |
wld |
(character) Path to directory containing weight files. |
n_blocks |
(numeric) Number of blocks used to produce block jackknife standard errors. Default is '200' |
chisq_max |
(numeric) Maximum value of Z^2 for SNPs to be included in LD-score regression. Default is to set 'chisq_max' to the maximum of 80 and N*0.001. |
chr_filter |
(numeric vector) Chromosomes to include in analysis. Separating even/odd chromosomes may be useful for exploratory/confirmatory factor analysis. |
A [tibble][tibble::tibble-package] containing heritability information. If 'sample_prev' and 'population_prev' were provided, the heritability estimate will also be returned on the liability scale.
'ldsc_rg()' uses LD score regression to estimate the pairwise genetic correlations between traits. The function relies on named lists of traits, sample prevalences, and population prevalences. The name of each trait should be consistent across each argument.
ldsc_rg(munged_sumstats, sample_prev = NA, population_prev = NA, ld, wld, n_blocks = 200, chisq_max = NA, chr_filter = seq(1, 22, 1))
munged_sumstats |
(list) A named list of dataframes, or paths to files containing munged summary statistics. Each set of munged summary statistics contain at least columns named 'SNP' (rsid), 'A1' (effect allele), 'A2' (non-effect allele), 'N' (total sample size) and 'Z' (Z-score) |
sample_prev |
(list) A named list containing the prevalence of cases in the current sample, used for conversion from observed heritability to liability-scale heritability. The default is 'NA', which is appropriate for quantitative traits or estimating heritability on the observed scale. |
population_prev |
(list) A named list containing the population prevalence of the trait, used for conversion from observed heritability to liability-scale heritability. The default is 'NA', which is appropriate for quantitative traits or estimating heritability on the observed scale. |
ld |
(character) Path to directory containing ld score files, ending in '*.l2.ldscore.gz'. |
wld |
(character) Path to directory containing weight files. |
n_blocks |
(numeric) Number of blocks used to produce block jackknife standard errors. Default is '200' |
chisq_max |
(numeric) Maximum value of Z^2 for SNPs to be included in LD-score regression. Default is to set 'chisq_max' to the maximum of 80 and N*0.001. |
chr_filter |
(numeric vector) Chromosomes to include in analysis. Separating even/odd chromosomes may be useful for exploratory/confirmatory factor analysis. |
This function estimates the pairwise genetic correlations between an arbitrary number of traits. The function also estimates heritability for each individual trait. There is a [ggplot2::autoplot()] method for visualizing a heatmap of the results.
This version handles cases where traits have non-positive heritability estimates more gracefully by returning NA values for correlations involving such traits.
A list of class 'ldscr_list' containing heritability and genetic correlation information:
- 'h2' = [tibble][tibble::tibble-package] containing heritability information for each trait. If 'sample_prev' and 'population_prev' were provided, the heritability estimates will also be returned on the liability scale.
- 'rg' = [tibble][tibble::tibble-package] containing pairwise genetic correlations information.
- 'raw' = A list of correlation/covariance matrices.
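By definition, a genetic correlation is a genetic covariance standardised by the two heritabilities, rg_ij = gcov_ij / sqrt(h2_i * h2_j), which is undefined when either heritability is non-positive. The base R sketch below illustrates that relationship and the NA behaviour described above. The helper `gcov_to_rg` and the toy numbers are hypothetical; this is not the function's actual code.

```r
# Illustrative inputs (not package output): heritabilities and genetic covariances.
h2 <- c(T1 = 0.20, T2 = 0.45, T3 = -0.01)   # T3 has a non-positive estimate
gcov <- matrix(c(0.20, 0.06,  0.01,
                 0.06, 0.45,  0.02,
                 0.01, 0.02, -0.01), nrow = 3, byrow = TRUE,
               dimnames = list(names(h2), names(h2)))

# Hypothetical helper: rg_ij = gcov_ij / sqrt(h2_i * h2_j),
# with NA wherever either heritability is non-positive.
gcov_to_rg <- function(gcov, h2) {
  h2_ok <- ifelse(h2 > 0, h2, NA_real_)
  rg <- gcov / sqrt(outer(h2_ok, h2_ok))
  dimnames(rg) <- dimnames(gcov)
  rg
}

rg <- gcov_to_rg(gcov, h2)
rg
```

Masking the non-positive heritability before taking the square root avoids NaN warnings and makes the affected row and column explicitly missing.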
'make_weights()' Internal Function to make weights
make_weights(chi1, L2, wLD, N, M.tot)
chi1 |
chi-square |
L2 |
ld score |
wLD |
wld score |
N |
sample size |
M.tot |
Number of SNPs |
A numeric vector of initial LDSC weights for each SNP
'merge_sumstats()' Merging summary statistics with LD-score files
merge_sumstats(sumstats_df, w, x, chr_filter)
sumstats_df |
dataframe of sumstat |
w |
wld score |
x |
ld score |
chr_filter |
(numeric vector) Chromosomes to include in analysis. Separating even/odd chromosomes may be useful for exploratory/confirmatory factor analysis. |
A tibble (data frame) containing the merged summary statistics and LD-score
'perform_analysis()' Internal function to perform LDSC heritability/covariance analysis
perform_analysis(n.blocks, n.snps, weighted.LD, weighted.chi, N.bar, m)
n.blocks |
Number of blocks |
n.snps |
Number of SNPs |
weighted.LD |
wld score |
weighted.chi |
chi-square |
N.bar |
Average N after merging |
m |
Number of SNPs from LD data |
A list containing the results of the LDSC heritability/covariance analysis with the following elements:
reg.tot: Estimated total heritability or covariance (regression coefficient scaled by m).
tot.se: Standard error of the total heritability/covariance estimate, computed using a block jackknife.
intercept: LDSC regression intercept.
intercept.se: Standard error of the intercept, estimated via block jackknife.
pseudo.values: Vector of pseudo-values from the block jackknife procedure, one per block.
N.bar: Average sample size across SNPs after merging.
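To make the "regression coefficient scaled by m" idea concrete, the sketch below simulates chi-square statistics under the basic LD score regression model, E[chi2] = 1 + N * h2 * l / M, and recovers h2 from the slope. It is a deliberately simplified base R illustration: the regression is unweighted, there is no block jackknife, and all variable names and simulated values are assumptions rather than package internals.

```r
set.seed(7)
n_snp   <- 1e5     # number of SNPs (also used as M in this toy example)
n_gwas  <- 5e4     # GWAS sample size
h2_true <- 0.3

# Simulated LD scores and chi-square statistics with
# E[chi2] = 1 + N * h2 * l / M (intercept fixed at 1, no confounding).
ld_score <- runif(n_snp, min = 1, max = 200)
expected <- 1 + n_gwas * h2_true * ld_score / n_snp
chi2 <- expected * rchisq(n_snp, df = 1)

# Unweighted regression of chi2 on LD score; h2 is the slope scaled by M / N.
fit <- lm(chi2 ~ ld_score)
h2_est <- unname(coef(fit)["ld_score"]) * n_snp / n_gwas
intercept_est <- unname(coef(fit)["(Intercept)"])
c(h2 = h2_est, intercept = intercept_est)
```

In practice the regression is weighted to account for heteroscedasticity and correlated SNPs, and standard errors come from the block jackknife pseudo-values rather than from the ordinary least squares fit.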
This function computes pleioh2g for the target disease after bias correction.
pleiotropyh2_cor_computing_single(G, phenotype, h2_vector, h2_vector_mat, Results_full_rg, Results_full_rg_array, sample_rep)
G |
index of target disease. |
phenotype |
Vector of the phenotype name |
h2_vector |
h2g vector for all traits - aligned as the order in phenotype file |
h2_vector_mat |
h2g array from jackknife-block estimates for all traits - aligned as the order in phenotype file |
Results_full_rg |
genetic correlation matrix. - aligned as the order in phenotype file |
Results_full_rg_array |
genetic correlation jackknife-block array. - aligned as the order in phenotype file |
sample_rep |
sampling times in bias correction |
A 'list' containing the following elements:
- 'target_disease' (character): the target disease name ("401.1" in the example data).
- 'target_disease_h2_est' (numeric): target disease h2g.
- 'target_disease_h2_se' (numeric): target disease h2g_se.
- 'selected_auxD' (character): auxiliary diseases.
- 'h2pleio_uncorr' (numeric): pre-correction pleiotropic heritability estimate.
- 'h2pleio_uncorr_se' (numeric): pre-correction pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr' (numeric): pre-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_uncorr_se' (numeric): pre-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr_jackknife' (numeric): vector of all pre-correction percentage of pleiotropic heritability jackknife estimates.
- 'h2pleio_corr' (numeric): post-correction pleiotropic heritability estimate.
- 'h2pleio_corr_se' (numeric): post-correction pleiotropic heritability s.e. estimate.
- 'percentage_h2pleio_corr' (numeric): post-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_corr_se' (numeric): post-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_corr_Z' (numeric): post-correction percentage of pleiotropic heritability estimate Z score.
- 'corrected_weight' (numeric): corrected weight in bias correction.
G <- 1
data(Results_full_rg)
data(Results_full_rg_array)
data(h2_vector)
data(h2_vector_mat)
Results_full_rg <- Results_full_rg[1:15, 1:15]
Results_full_rg_array <- Results_full_rg_array[1:15, 1:15, ]
h2_vector <- t(as.matrix(h2_vector[1, 1:15]))
h2_vector_mat <- h2_vector_mat[, 1:15]
phenotype <- c("401.1", "244.5", "318", "735.3", "411.4", "427.2", "454.1", "278.1",
               "250.2", "550.1", "530.11", "296.22", "519.8", "562.1", "763")
sample_rep <- 20
post_corrrresults_prune <- pleiotropyh2_cor_computing_single(G, phenotype, h2_vector,
  h2_vector_mat, Results_full_rg, Results_full_rg_array, sample_rep)
This function computes pleioh2g for the target disease after bias correction.
pleiotropyh2_cor_computing_single_prune(G, phenotype, h2_vector, h2_vector_mat, Results_full_rg, Results_full_rg_array, sample_rep)
G |
index of target disease. |
phenotype |
Vector of the phenotype name |
h2_vector |
h2g vector for all traits - aligned as the order in phenotype file |
h2_vector_mat |
h2g array from jackknife-block estimates for all traits - aligned as the order in phenotype file |
Results_full_rg |
genetic correlation matrix. - aligned as the order in phenotype file |
Results_full_rg_array |
genetic correlation jackknife-block array. - aligned as the order in phenotype file |
sample_rep |
sampling times in bias correction |
A 'list' containing the following elements:
- 'target_disease' (character): the target disease name ("401.1" in the example data).
- 'target_disease_h2_est' (numeric): target disease h2g.
- 'target_disease_h2_se' (numeric): target disease h2g_se.
- 'selected_auxD' (character): auxiliary diseases.
- 'h2pleio_uncorr' (numeric): pre-correction pleiotropic heritability estimate.
- 'h2pleio_uncorr_se' (numeric): pre-correction pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr' (numeric): pre-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_uncorr_se' (numeric): pre-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr_jackknife' (numeric): vector of all pre-correction percentage of pleiotropic heritability jackknife estimates.
- 'h2pleio_corr' (numeric): post-correction pleiotropic heritability estimate.
- 'h2pleio_corr_se' (numeric): post-correction pleiotropic heritability s.e. estimate.
- 'percentage_h2pleio_corr' (numeric): post-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_corr_se' (numeric): post-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_corr_Z' (numeric): post-correction percentage of pleiotropic heritability estimate Z score.
- 'corrected_weight' (numeric): corrected weight in bias correction.
G <- 1
data(Results_full_rg)
data(Results_full_rg_array)
data(h2_vector)
data(h2_vector_mat)
Results_full_rg <- Results_full_rg[1:15, 1:15]
Results_full_rg_array <- Results_full_rg_array[1:15, 1:15, ]
h2_vector <- t(as.matrix(h2_vector[1, 1:15]))
h2_vector_mat <- h2_vector_mat[, 1:15]
phenotype <- c("401.1", "244.5", "318", "735.3", "411.4", "427.2", "454.1", "278.1",
               "250.2", "550.1", "530.11", "296.22", "519.8", "562.1", "763")
sample_rep <- 20
post_corrrresults_prune <- pleiotropyh2_cor_computing_single_prune(G, phenotype, h2_vector,
  h2_vector_mat, Results_full_rg, Results_full_rg_array, sample_rep)
This function computes pleioh2g for the target disease before bias correction.
pleiotropyh2_nocor_computing_single(G, phenotype, h2_vector, h2_vector_mat, Results_full_rg, Results_full_rg_array)
G |
index of target disease. |
phenotype |
Vector of the phenotype name |
h2_vector |
h2g vector for all traits - aligned as the order in phenotype file |
h2_vector_mat |
h2g array from jackknife-block estimates for all traits - aligned as the order in phenotype file |
Results_full_rg |
genetic correlation matrix.- aligned as the order in phenotype file |
Results_full_rg_array |
genetic correlation jackknife-block array.- aligned as the order in phenotype file |
A 'list' containing the following elements:
- 'target_disease' (character): the target disease name ("401.1" in the example data).
- 'target_disease_h2_est' (numeric): target disease h2g.
- 'target_disease_h2_se' (numeric): target disease h2g_se.
- 'selected_auxD' (character): auxiliary diseases.
- 'h2pleio_uncorr' (numeric): pre-correction pleiotropic heritability estimate.
- 'h2pleio_uncorr_se' (numeric): pre-correction pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr' (numeric): pre-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_uncorr_se' (numeric): pre-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_jackknife_uncorr' (numeric): vector of all pre-correction percentage of pleiotropic heritability jackknife estimates.
G <- 1
data(Results_full_rg)
data(Results_full_rg_array)
data(h2_vector)
data(h2_vector_mat)
Results_full_rg <- Results_full_rg[1:15, 1:15]
Results_full_rg_array <- Results_full_rg_array[1:15, 1:15, ]
h2_vector <- t(as.matrix(h2_vector[1, 1:15]))
h2_vector_mat <- h2_vector_mat[, 1:15]
phenotype <- c("401.1", "244.5", "318", "735.3", "411.4", "427.2", "454.1", "278.1",
               "250.2", "550.1", "530.11", "296.22", "519.8", "562.1", "763")
h2pleiobeforecorr <- pleiotropyh2_nocor_computing_single(G, phenotype, h2_vector,
  h2_vector_mat, Results_full_rg, Results_full_rg_array)
Prune disease selection
Prune_disease_selection_DTrgzscore(Target_disease, trait_name, Rg_mat, Rg_mat_z, rg_threshold)
Target_disease |
trait_name of target disease |
trait_name |
trait_name of pre-prune rg_matrix |
Rg_mat |
pre-prune rg_matrix |
Rg_mat_z |
pre-prune rg z matrix |
rg_threshold |
genetic correlation (rg) threshold used for pruning. |
Rg_mat_leave
trait_name <- c("401.1", "244.5", "318", "735.3", "411.4", "427.2", "454.1", "278.1",
                "250.2", "550.1", "530.11", "296.22", "519.8", "562.1", "763")
data("Results_full_rg")
data("Rg_mat_z")
Results_full_rg <- Results_full_rg[1:15, 1:15]
Rg_mat_z <- Rg_mat_z[1:15, 1:15]
Target_disease <- '401.1'
rg_threshold <- 0.3
Rg_prune <- Prune_disease_selection_DTrgzscore(Target_disease, trait_name,
  Results_full_rg, Rg_mat_z, rg_threshold)
Perform pruning in computing pleioh2g and correct bias
pruning_pleioh2g_wrapper(G, phenotype, munged_sumstats, ld_path, wld_path, sample_prev = NULL, population_prev = NULL, n_block = 200, hmp3, sample_rep)
G |
index of target disease. |
phenotype |
Vector of the phenotype name |
munged_sumstats |
All LDSC-munged GWAS .stat.gz |
ld_path |
Path to directory containing ld score files. |
wld_path |
Path to directory containing weight files. |
sample_prev |
Vector of sample prevalence, in the same order of input GWAS summary statistics. |
population_prev |
Vector of population prevalence, in the same order of input GWAS summary statistics. |
n_block |
number of jackknife blocks. |
hmp3 |
Directory for hapmap 3 snplist. |
sample_rep |
sampling times in bias correction |
A 'list' containing the following elements:
- 'target_disease' (character): the target disease name ("401.1" in the example data).
- 'target_disease_h2_est' (numeric): target disease h2g.
- 'target_disease_h2_se' (numeric): target disease h2g_se.
- 'selected_auxD' (character): auxiliary diseases.
- 'h2pleio_uncorr' (numeric): pre-correction pleiotropic heritability estimate.
- 'h2pleio_uncorr_se' (numeric): pre-correction pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr' (numeric): pre-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_uncorr_se' (numeric): pre-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_uncorr_jackknife' (numeric): vector of all pre-correction percentage of pleiotropic heritability jackknife estimates.
- 'h2pleio_corr' (numeric): post-correction pleiotropic heritability estimate.
- 'h2pleio_corr_se' (numeric): post-correction pleiotropic heritability s.e. estimate.
- 'percentage_h2pleio_corr' (numeric): post-correction percentage of pleiotropic heritability estimate.
- 'percentage_h2pleio_corr_se' (numeric): post-correction percentage of pleiotropic heritability jackknife s.e. estimate.
- 'percentage_h2pleio_corr_Z' (numeric): post-correction percentage of pleiotropic heritability estimate Z score.
- 'corrected_weight' (numeric): corrected weight in bias correction.
'read_ld()' Read ld from either internal or external file.
read_ld(ld)
ld |
(character) Path to directory containing ld score files, ending in '*.l2.ldscore.gz'. Default is 'NA', which will utilize the built-in ld score files from Pan-UK Biobank for the ancestry specified in 'ancestry'. |
A data frame (tibble) containing LD score information read from the specified directory. Each row corresponds to a SNP, and columns typically include:
CHR: Chromosome number.
SNP: SNP identifier (rsID).
BP: Base pair position.
L2: LD score value.
M: Number of SNPs used in the LD score computation.
'read_m()' Read M from either internal or external file
read_m(ld)
ld |
(character) Path to directory containing ld score files, ending in '*.l2.ldscore.gz'. |
A data frame (tibble) containing SNP counts read from the specified M files.
'read_sumstats()' Read summary statistics from either internal or external file
read_sumstats(munged_sumstats, name)
munged_sumstats |
Either a dataframe, or a path to a file containing munged summary statistics. Must contain at least columns named 'SNP' (rsid), 'A1' (effect allele), 'A2' (non-effect allele), 'N' (total sample size) and 'Z' (Z-score) |
name |
trait name |
A data frame (tibble) containing GWAS summary statistics for the specified trait. The returned object will always contain at least the following columns:
SNP: SNP identifier (rsID).
A1: Effect allele.
A2: Non-effect allele.
N: Total sample size for the SNP.
Z: Z-score of SNP-trait association.
'read_wld()' Read wld from either internal or external file
read_wld(wld)
wld |
(character) Path to directory containing weight files. Default is 'NA', which will utilize the built-in weight files from Pan-UK Biobank for the ancestry specified in 'ancestry'. |
A data frame (tibble) containing LD weight information read from the specified directory. Each row corresponds to a SNP, and columns typically include:
CHR: Chromosome number.
SNP: SNP identifier (rsID).
BP: Base pair position.
wLD: Weight for LD regression.
Example genetic correlation matrix used in the vignette and examples.
Results_full_rg
A numeric matrix.
Internal simulation
Jackknife array of genetic correlations (62 traits)
Results_full_rg_array
A 3-dim array.
Internal simulation
Example genetic correlation Z matrix used in the vignette and examples.
Rg_mat_z
A numeric matrix.
Internal simulation
Example munged dataframe - refer to ldscr R package (https://github.com/mglev1n/ldscr)
sumstats_munged_example_input(example, dataframe = TRUE)
example |
(character) Name of the example trait; "401.1" is included as an example trait. |
dataframe |
(logical) If 'TRUE' (default), return an example munged dataframe. If 'FALSE', return path to the file on disk. |
Either a [tibble][tibble::tibble-package] containing a munged dataframe, or a path to the file on disk.