Rust-accelerated drop-in replacement for GenomicSEM. Same functions, same arguments, 2–100x faster.

Documentation

gsemr mirrors R GenomicSEM’s API, so the original project’s tutorials apply directly. For conceptual guides (model specification, common-factor GWAS, stratified enrichment, T-SEM), see the GenomicSEM wiki.

The gsemr documentation site documents what is specific to the Rust port:

  • Function reference — every exported function.
  • Compatibility — which R GenomicSEM parameters are implemented vs. accepted as stubs.
  • Architecture — why it’s faster and where its algorithms / numerical outputs differ from R GenomicSEM.

Install

# Requires Rust toolchain (https://rustup.rs)
remotes::install_github("PoHsuanLai/gsem", subdir = "bindings/r")
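
Because the install compiles Rust code, it can help to confirm the toolchain is visible from R first. The following is a minimal sketch in base R, assuming the build looks for `cargo` on the PATH (an assumption, not something gsemr documents here). `install_gsemr` is a hypothetical helper name, not part of the package.

```r
# Assumption: the build needs `cargo` on the PATH. This is a convenience
# check written for illustration, not part of gsemr.
has_cargo <- nzchar(Sys.which("cargo"))

if (!has_cargo) {
  message("cargo not found; install the Rust toolchain from https://rustup.rs")
}

# Hypothetical helper: only attempts the install when cargo is available.
install_gsemr <- function() {
  if (!has_cargo) stop("Rust toolchain not found")
  remotes::install_github("PoHsuanLai/gsem", subdir = "bindings/r")
}
```

Calling `install_gsemr()` then runs the same `install_github` command as above, or stops early with a clear message.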

Usage

library(gsemr)

# LDSC
covstruc <- ldsc(
  traits = c("trait1.sumstats.gz", "trait2.sumstats.gz", "trait3.sumstats.gz"),
  sample.prev = c(NA, NA, NA),
  population.prev = c(NA, NA, NA),
  ld = "eur_w_ld_chr/",
  wld = "eur_w_ld_chr/",
  trait.names = c("V1", "V2", "V3")
)

# Common factor
cf <- commonfactor(covstruc, estimation = "DWLS")

# User-specified model
um <- usermodel(covstruc, model = "F1 =~ NA*V1 + V2 + V3\nF1 ~~ 1*F1\nV1 ~~ V1\nV2 ~~ V2\nV3 ~~ V3")

# Munge
munge(files = c("gwas1.txt", "gwas2.txt"), hm3 = "w_hm3.snplist", trait.names = c("T1", "T2"))

# Merge sumstats
sumstats(files = c("T1.sumstats.gz", "T2.sumstats.gz"), ref = "eur_w_ld_chr/", trait.names = c("V1", "V2"))

# GWAS
commonfactorGWAS(covstruc, SNPs = "merged_sumstats.tsv")
userGWAS(covstruc, SNPs = "merged_sumstats.tsv", model = "F1 =~ NA*V1 + V2\nF1 ~ SNP")

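# Other
# (loadings_matrix, traits, sample.prev, and population.prev are placeholders
# for user-supplied objects)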
paLDSC(covstruc, r = 500)
rgmodel(covstruc)
write.model(loadings_matrix, covstruc$S)
hdl(traits, sample.prev, population.prev, LD.path = "hdl_panels/")
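
The model strings above pack several lines into one string with `\n`. A more readable way to write the same thing, sketched here in base R only, is to pass one model line per argument to `paste()`. The result is the same string that is passed to `usermodel` above.

```r
# Build the usermodel specification one line at a time.
model <- paste(
  "F1 =~ NA*V1 + V2 + V3",  # factor loadings, first loading freed
  "F1 ~~ 1*F1",             # factor variance fixed to 1
  "V1 ~~ V1",               # residual variances
  "V2 ~~ V2",
  "V3 ~~ V3",
  sep = "\n"
)

# Check that this is exactly the single-line string used earlier.
same <- identical(
  model,
  "F1 =~ NA*V1 + V2 + V3\nF1 ~~ 1*F1\nV1 ~~ V1\nV2 ~~ V2\nV3 ~~ V3"
)
```

The resulting `model` can be passed as `usermodel(covstruc, model = model)`.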

Functions

All 18 R GenomicSEM functions are implemented: ldsc, commonfactor, usermodel, munge, sumstats, commonfactorGWAS, userGWAS, paLDSC, write.model, rgmodel, hdl, s_ldsc, enrich, simLDSC, multiSNP, multiGene, summaryGLS, convert_hdl_panels.

Note on commonfactorGWAS: gsemr’s commonfactorGWAS matches R GenomicSEM::userGWAS on the equivalent single-factor model, but does not numerically match R GenomicSEM::commonfactorGWAS, which uses a different internal parameterization. See the Architecture article for the full rationale. A one-time runtime warning is emitted on first use and can be suppressed via options(gsemr.commonfactorGWAS.quiet = TRUE).
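
To silence the warning only around specific calls, the option can be set and then restored. This is a small sketch using base R's `options()`. The option name comes from the note above, and the restore pattern is ordinary R practice rather than anything gsemr-specific.

```r
# Silence the one-time commonfactorGWAS warning, keeping the previous
# value so it can be restored afterwards.
old_opts <- options(gsemr.commonfactorGWAS.quiet = TRUE)
quiet_now <- getOption("gsemr.commonfactorGWAS.quiet")

# ... call commonfactorGWAS() here ...

# Restore whatever was set before (unset if it was never set).
options(old_opts)
quiet_after <- getOption("gsemr.commonfactorGWAS.quiet")
```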

Performance

Benchmarked against R GenomicSEM (3 traits, ~1.29M SNPs, N=50,000):

Function       R GenomicSEM   gsemr   Speedup
ldsc           8.8s           3.2s    2.8x
commonfactor   115ms          11ms    10x
usermodel      91ms           11ms    8x
rgmodel        127ms          1ms     124x

Results match R GenomicSEM to within numerical tolerance (S matrix max diff: 1.7e-5).
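
A comparison of this kind can be sketched in base R as the largest element-wise absolute difference between two S matrices. The matrices below are hypothetical values made up for illustration, not benchmark output. In practice they would be the `S` element of the `ldsc` result from each package, and the 1e-4 tolerance is likewise an arbitrary choice.

```r
# Hypothetical 3x3 genetic covariance matrices standing in for covstruc$S
# from R GenomicSEM (S_ref) and gsemr (S_new). Values are illustrative only.
S_ref <- matrix(c(0.20, 0.05, 0.03,
                  0.05, 0.30, 0.04,
                  0.03, 0.04, 0.25), nrow = 3, byrow = TRUE)

# Perturb each element by at most 1e-5 to mimic small numerical differences.
S_new <- S_ref + 1e-5 * matrix(c( 1.0, -1.0,  0.5,
                                 -1.0,  1.0,  0.0,
                                  0.5,  0.0, -1.0), nrow = 3, byrow = TRUE)

# Largest element-wise absolute difference between the two matrices.
max_diff <- max(abs(S_ref - S_new))

# Arbitrary tolerance chosen for this sketch.
agrees <- max_diff < 1e-4
```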

Citation

Grotzinger, A. D., Rhemtulla, M., de Vlaming, R., Ritchie, S. J., Mallard, T. T., Hill, W. D., Ip, H. F., Marioni, R. E., McIntosh, A. M., Deary, I. J., Koellinger, P. D., Harden, K. P., Nivard, M. G., & Tucker-Drob, E. M. (2019). Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nature Human Behaviour, 3(5), 513–525. https://doi.org/10.1038/s41562-019-0566-x

License

GPL-3.0