Reads multiple GWAS files, applies QC filters, aligns alleles, and outputs a merged tab-delimited file for multivariate GWAS.
Usage
sumstats(
files,
ref,
trait.names = NULL,
se.logit,
OLS = NULL,
linprob = NULL,
N = NULL,
betas = NULL,
info.filter = 0.6,
maf.filter = 0.01,
keep.indel = FALSE,
parallel = TRUE,
cores = NULL,
ambig = FALSE,
direct.filter = FALSE,
out = "merged_sumstats.tsv"
)
Arguments
- files
Character vector of GWAS file paths
- ref
Path to reference panel file (e.g., w_hm3.snplist)
- trait.names
Character vector of trait names
- se.logit
Logical vector indicating which traits have logistic SEs
- OLS
Logical vector indicating which traits are from OLS regression
- linprob
Logical vector indicating which traits were analyzed with a linear probability model
- N
Numeric vector or list of sample size overrides
- betas
Named list of beta column overrides per trait (default NULL = auto-detect)
- info.filter
INFO score filter (default 0.6)
- maf.filter
MAF filter (default 0.01)
- keep.indel
Keep indels (default FALSE)
- parallel
Use a parallel rayon worker pool to read the reference and GWAS files (default
TRUE). Each input file is decompressed and parsed on its own worker thread. Set to FALSE to force single-threaded execution.
- cores
Integer cap on the rayon pool size. When NULL (the default), rayon honours RAYON_NUM_THREADS if set; otherwise it uses the number of logical cores reported by the OS. Because the reads are parallelized across files, values above length(files) + 1 don't help. On many-core machines (32+) or when the underlying BLAS is multithreaded, set this explicitly to avoid oversubscribing CPUs with nested BLAS threads.
- ambig
Keep ambiguous SNPs (default FALSE)
- direct.filter
Apply MAF filter directly to GWAS file frequencies (default FALSE)
- out
Output file path for the merged sumstats TSV (default "merged_sumstats.tsv")
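
The guidance on cores above can be turned into a small helper. The following is a minimal sketch, not part of the package: pick_cores is a hypothetical name, the file paths and trait names are placeholders, and the only rule it encodes is the one stated above, that values beyond length(files) + 1 bring no benefit.

```r
# Hypothetical helper (not provided by the package): choose a value for `cores`.
pick_cores <- function(n_files, available = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one thread.
  if (is.na(available)) available <- 1L
  # Reads are parallelized across files, so length(files) + 1 is the useful ceiling.
  as.integer(max(1L, min(n_files + 1L, available)))
}

# Placeholder input paths, for illustration only.
files <- c("trait1.txt.gz", "trait2.txt.gz", "trait3.txt.gz")

# Pretend the machine reports 32 logical cores.
n_cores <- pick_cores(length(files), available = 32L)
print(n_cores)

# The result could then be passed on, for example:
# sumstats(files = files, ref = "w_hm3.snplist",
#          trait.names = c("T1", "T2", "T3"),
#          se.logit = c(FALSE, FALSE, FALSE),
#          cores = n_cores)
```

Capping explicitly like this is mainly useful on large shared machines, where leaving cores as NULL would let the pool grow to every logical core the OS reports.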