Title: | Comprehensive and Easy to Use Quality Control of GWAS Results |
---|---|
Description: | When evaluating the results of a genome-wide association study (GWAS), it is important to perform a quality control to ensure that the results are valid, complete, correctly formatted, and, in case of meta-analysis, consistent with other studies that have applied the same analysis. This package was developed to facilitate and streamline this process and provide the user with a comprehensive report. |
Authors: | Alireza Ani [aut, cre], Peter J. van der Most [aut], Ahmad Vaez [aut], Ilja M. Nolte [aut] |
Maintainer: | Alireza Ani <[email protected]> |
License: | GPL-3 |
Version: | 1.7.1 |
Built: | 2024-12-03 06:41:04 UTC |
Source: | CRAN |
This function compares the key metrics of previously inspected files. This allows the user to check if the results of these studies are comparable (important when running a meta-analysis) and that there are no significant anomalies.
compare_GWASs(input.file.list, output.path)
input.file.list |
list, full paths of the Inspector result files. These files are in RDS format and are generated for each GWAS result file during the inspection run. |
output.path |
character, full path to the folder where output files should be saved. |
Key metrics report and plots of previously inspected files are generated and saved in the specified output folder.
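As an illustration, the call below is a minimal sketch of comparing previously inspected studies. The folder `qc_dir` is a hypothetical location standing in for the output folder of an earlier QC run, and the sketch assumes the Inspector result files can be recognised by their `.rds` extension; the call is skipped when the package or the files are not available.

```r
# Hypothetical folder holding the RDS result files of an earlier QC run.
qc_dir <- file.path(tempdir(), "qc_output")
comparison_dir <- file.path(tempdir(), "gwas_comparison")
dir.create(qc_dir, showWarnings = FALSE, recursive = TRUE)
dir.create(comparison_dir, showWarnings = FALSE, recursive = TRUE)

# Collect the full paths of all RDS files as a list, as the argument expects.
rds_files <- as.list(list.files(qc_dir, pattern = "\\.rds$",
                                full.names = TRUE, ignore.case = TRUE))

# Only call the package when it is installed and there is something to compare.
if (requireNamespace("GWASinspector", quietly = TRUE) && length(rds_files) > 1) {
  GWASinspector::compare_GWASs(input.file.list = rds_files,
                               output.path = comparison_dir)
} else {
  message("Skipping comparison: package or RDS result files not available.")
}
```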
This function runs the QC algorithm on a fabricated GWAS result file.
demo_inspector(result.dir)
result.dir |
character. Path to the output folder for saving QC result files |
QC reports from running the algorithm on a sample GWAS file are generated and saved in the specified folder.
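A short sketch of a demo run follows. The output folder under `tempdir()` is an arbitrary choice for illustration, and the call is guarded so that the snippet does nothing when GWASinspector is not installed.

```r
# Arbitrary output folder for the demo reports (illustrative choice).
demo_dir <- file.path(tempdir(), "gwasinspector_demo")
dir.create(demo_dir, showWarnings = FALSE, recursive = TRUE)

if (requireNamespace("GWASinspector", quietly = TRUE)) {
  # The input file and reference dataset are embedded in the package,
  # so only the output folder has to be supplied.
  GWASinspector::demo_inspector(result.dir = demo_dir)
  print(list.files(demo_dir))  # inspect what was written
}
```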
This template should be edited and then used to set up and run the QC pipeline. The default filename is config.ini.
get_config(dir.path)
dir.path |
Path to the folder for saving a sample configuration file. |
Copies a sample configuration file (config.ini) to the specified folder.
This template file is used to translate a dataset's column names (the header) into the standard names used by GWASinspector. The file contains a two-column table, with the left column containing the standard column-names and the right the alternatives. Both the standard and alternative columns must be fully capitalized. This is a text file which includes most common variable/header names and can be edited according to user specifications. The default filename is alt_headers.txt.
get_headerTranslation(dir.path)
dir.path |
Path to the folder for saving a header-translation table file. |
Copies a sample header-translation table to the specified folder.
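To make the idea of a header-translation table concrete, the sketch below applies a small two-column table to a dataset header. It is an illustration only and not the package's own implementation: the table rows and the helper `translate_header` are hypothetical and are not copied from alt_headers.txt.

```r
# Hypothetical excerpt of a translation table: standard name on the left,
# accepted alternative on the right, everything in capitals.
translation_text <- "EFFECT_ALL A1
EFFECT_ALL EFFECT_ALLELE
PVALUE P
PVALUE PVAL
STDERR SE"

translation <- read.table(text = translation_text, header = FALSE,
                          col.names = c("standard", "alternative"),
                          colClasses = "character")

# Illustrative helper: map each column name to its standard name when an
# alternative matches, otherwise keep the (capitalized) original name.
translate_header <- function(header, translation) {
  upper <- toupper(header)
  idx <- match(upper, translation$alternative)
  ifelse(is.na(idx), upper, translation$standard[idx])
}

translate_header(c("a1", "pval", "se", "chr"), translation)
```

Names without a match, such as `chr` here, are only capitalized and otherwise passed through unchanged.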
When evaluating the results of a genome-wide association study (GWAS), it is important to perform a quality control to ensure that the results are valid, complete, correctly formatted, and, in case of meta-analysis, consistent with other studies in the same analysis. This package was developed to facilitate and streamline this process and provide the user with a comprehensive report.
Visit our website for more help and support: http://GWASinspector.com.
setup_inspector
This function imports a QC-configuration file into R by generating a new instance of the Inspector class.
run_inspector
This is the main function for running the algorithm on a set of GWAS result files.
result_inspector
This function displays a brief report about the results of running the Inspector algorithm on a set of GWAS result files.
demo_inspector
This function runs the algorithm on a fabricated GWAS result file. The user only needs to set the output folder for saving the generated files. The input file and reference dataset are embedded in the package.
system_check
Checks if required and optional packages are installed on the system. Although the optional packages do not contribute to the QC itself, having them available will allow for Excel and HTML formatted report files, which are easier to read and interpret.
get_config
Copies the template configuration file to the local machine.
get_headerTranslation
Copies the template header-translation table to the local machine.
compare_GWASs
Generates reports and plots for comparing the summary statistics of GWAS result files that were previously inspected with this package.
manhattan_plot
Generates the Manhattan plot from a GWAS result file. This function has many features that are described in the package tutorial.
GWASinspector uses the S4 object system of R to conduct the QC.
The QC is configured using a configuration (ini) file (see get_config), which is imported into R through setup_inspector and turned into an object of the Inspector class. To perform the QC, process the object with run_inspector.
A quick scan of the results can be performed via result_inspector, but the primary outcomes of the QC are the report files and graphs generated by run_inspector.
The main product of the QC is the extensive log file (in Excel/HTML format, depending on your settings).
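The workflow described above can be sketched as a small wrapper. The name `run_qc` is hypothetical, and the sketch assumes that `config_path` points to a configuration file that has already been edited for your data. It also treats the return value of run_inspector as uncertain: the processed object is used only if the call returns an Inspector object.

```r
# Hypothetical wrapper around the documented steps:
# setup_inspector -> run_inspector -> result_inspector.
run_qc <- function(config_path) {
  if (!requireNamespace("GWASinspector", quietly = TRUE)) {
    stop("GWASinspector is not installed.")
  }
  # Import the (already edited) configuration file as an Inspector object.
  inspector <- GWASinspector::setup_inspector(config_path)

  # Run the QC; reports and plots are written to the configured output folder.
  processed <- GWASinspector::run_inspector(inspector)

  # Assumption: run_inspector() may return the updated object. Fall back to
  # the original object if it does not.
  if (methods::is(processed, "Inspector")) {
    inspector <- processed
  }

  # Brief summary of the run.
  GWASinspector::result_inspector(inspector)
}

# Usage (path is a placeholder for your own edited file):
# run_qc("/path/to/config.ini")
```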
Maintainer: Alireza Ani [email protected]
Authors:
Peter J. van der Most
Ahmad Vaez
Ilja M. Nolte
An object of this class is created by the setup_inspector function. Each section of the configuration file is represented as a list of attributes in this object.
paths
A list of parameters from the paths section of the configuration file.
supplementaryFiles
A list of parameters from the supplementaryFiles section of the configuration file.
input_parameters
A list of parameters from the input_parameters section of the configuration file.
output_parameters
A list of parameters from the output_parameters section of the configuration file.
remove_chromosomes
A list of parameters from the remove_chromosomes section of the configuration file.
plot_specs
A list of parameters from the plot_specs section of the configuration file.
filters
A list of parameters from the filters section of the configuration file.
debug
A list of parameters from the debug section of the configuration file.
input_files
A list of files that will be inspected during the run.
created_at
The time the object was created.
start_time
The time the run was started.
end_time
The time the run finished.
StudyList
An object of the StudyList class.
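Because the Inspector class is an S4 class, its slots are read with the `@` operator. The sketch below demonstrates this on a stand-in class so that it runs without the package: `MockInspector` and the entries inside its lists are hypothetical, and only the slot names `paths` and `filters` are taken from the list above.

```r
library(methods)

# Stand-in S4 class for illustration only. The real Inspector class is
# defined by GWASinspector and has more slots than shown here.
setClass("MockInspector", representation(paths = "list", filters = "list"))

mock <- new("MockInspector",
            paths = list(input_dir = "/hypothetical/input"),   # invented key
            filters = list(example_threshold = 0.01))          # invented key

slotNames(mock)        # names of the slots defined on the class
mock@paths$input_dir   # read one entry of a slot with @ and $
```

With a real object created by setup_inspector, the same `slotNames()` and `@` access would apply to the slots listed above.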
A function to generate Manhattan plots.
manhattan_plot( dataset, chr, pvalue, position, fileName, plot.title = "Manhattan Plot", plot.subtitle = "", p.threshold = 0.01, sig.threshold.log = -log10(5 * 10^-8), beta = NULL, std.error = NULL, check.columns = TRUE, useHQ = TRUE )
dataset |
Data frame or data table containing the columns named below |
chr |
Name of chromosome column |
pvalue |
Name of P-value column |
position |
Name of position column |
fileName |
Full path and name of the file to be saved (the file extension should be 'png'), e.g. “c:/users/researcher/study/man_plot.png” |
plot.title |
Title of the plot; the default value is 'Manhattan Plot' |
plot.subtitle |
Subtitle of the plot |
p.threshold |
Threshold for plotting variants (with the default, variants with p-values > 0.01 are not plotted). Setting a higher threshold will significantly increase plotting time |
sig.threshold.log |
The -log10 transformed significance threshold, used for plotting a threshold line (e.g. 8 corresponds to a p-value of 10^-8) |
beta |
(optional) Name of the effect-size column |
std.error |
(optional) Name of the standard error column |
check.columns |
Whether to check input columns for invalid values |
useHQ |
Whether to only plot HQ variants |
Generates and saves a Manhattan plot for the provided data.
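The following sketch calls manhattan_plot on simulated data. The column names `CHR`, `POSITION` and `PVALUE` and the output file name are arbitrary choices for this example. Setting `useHQ = FALSE` is an assumption made because the simulated data carries no high-quality flag.

```r
set.seed(1)
n <- 5000

# Simulated GWAS-like table with uniformly distributed p-values (null data).
gwas <- data.frame(
  CHR      = sample(1:22, n, replace = TRUE),
  POSITION = sample.int(1e8, n),
  PVALUE   = runif(n)
)

plot_file <- file.path(tempdir(), "man_plot.png")

if (requireNamespace("GWASinspector", quietly = TRUE)) {
  GWASinspector::manhattan_plot(
    dataset    = gwas,
    chr        = "CHR",
    pvalue     = "PVALUE",
    position   = "POSITION",
    fileName   = plot_file,
    plot.title = "Simulated GWAS",
    useHQ      = FALSE   # assumption: no HQ information in simulated data
  )
}
```

With the default `p.threshold = 0.01`, only a small fraction of these null variants falls below the plotting threshold.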
This function displays a brief report about the results of running the Inspector algorithm on a set of GWAS result files. The full report, including plots, cleaned files and summary statistics, is generated and saved in the output folder during the algorithm run.
result_inspector(inspector)
inspector |
An instance of Inspector class. Check |
A data.table containing a brief report about the results.
This is the main function of the package for running the QC algorithm on a set of GWAS result files.
It requires an object of class Inspector, which should be created by setup_inspector.
Check the package vignette and tutorial for more details on this topic.
run_inspector(inspector, verbose = TRUE, test.run = FALSE)
inspector |
An instance of Inspector class. Check |
verbose |
logical. If FALSE, no messages will show up in the terminal and are only saved in the log file. |
test.run |
logical. If TRUE, only the first 1000 lines of each data file are loaded and analyzed; plots and saving the cleaned output dataset are skipped. Default value is FALSE. |
Reports from running the algorithm on a single GWAS result file or a series of them are generated and saved.
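One possible use of `test.run` is to check a configuration on the first 1000 lines of each file before committing to a full run. The wrapper below is a sketch of that pattern; the name `dry_run_then_full` is hypothetical, and re-creating the Inspector object before the full run is a cautious assumption rather than a documented requirement.

```r
# Hypothetical helper: quick test run first, full QC only if that succeeds.
dry_run_then_full <- function(config_path) {
  if (!requireNamespace("GWASinspector", quietly = TRUE)) {
    stop("GWASinspector is not installed.")
  }

  # Test run: only the first 1000 lines per file; plots and cleaned
  # output files are skipped. Messages go to the log file only.
  test_job <- GWASinspector::setup_inspector(config_path, validate = TRUE)
  GWASinspector::run_inspector(test_job, verbose = FALSE, test.run = TRUE)

  # Full run on a freshly created object (assumption: safer than reusing
  # the object from the test run).
  full_job <- GWASinspector::setup_inspector(config_path, validate = TRUE)
  GWASinspector::run_inspector(full_job, verbose = TRUE, test.run = FALSE)
}

# Usage (path is a placeholder for your own edited file):
# dry_run_then_full("/path/to/config.ini")
```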
To run a QC in GWASinspector, first copy a template configuration file to your machine using the get_config command, and edit it to suit your requirements.
Next, use the setup_inspector function to check the configuration file and import it into R.
This will create an object of the Inspector class, which can then be processed using run_inspector.
setup_inspector(config.file, validate = TRUE)
config.file |
character. Path to a configuration (.ini) file. For a sample configuration file, see |
validate |
logical. Whether to validate the object. |
Returns a new instance of the Inspector class.
This class is embedded in the StudyList class and should not be instantiated separately.
File
A list representing GWAS result file specifications
Counts
A list representing different variant counts from the GWAS result file.
Correlations
A list representing different allele frequency and P-value correlations in the GWAS result file.
Statistics
A list representing summary statistics from the GWAS result file.
Successful_run
A logical value indicating whether the run was successful or not.
starttime
The time that file inspection started.
endtime
The time that file inspection ended.
This class is embedded in the Inspector class and should not be instantiated separately.
studyList
A list of GWAS study result files. Each member of this list is of class Study.
studyCount
A numeric value indicating how many items of class Study are included.
Checks if required and optional packages are installed on the system. Although the optional packages do not contribute to the QC itself, having them available will allow for Excel and HTML formatted log files, which are easier to read and interpret.
system_check()
System information and required functionalities for the QC algorithm are checked and reported as a data frame.