This document tracks which R GenomicSEM parameters are accepted by the Rust port, and which are accepted as stubs.
It applies to all three frontends:
- gsemr: R package (library(gsemr))
- genomicsem: Python package (import genomicsem)
- gsem: CLI (gsem userGWAS --...)
Parameter names match R GenomicSEM exactly in the R binding. The
Python binding uses snake_case equivalents (fix.measurement
→ fix_measurement, sample.prev →
sample_prev, etc.). The CLI uses --kebab-case
flags with the same meanings.
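The naming convention above can be sketched as a pair of string transforms. The helper names `to_snake` and `to_cli_flag` are hypothetical and are not part of gsemr, genomicsem, or the gsem CLI; the sketch also assumes all-lowercase names such as the two examples quoted above, and does not cover how mixed-case names are spelled in each frontend.

```python
# Hypothetical helpers for illustration only; not part of any gsem frontend.

def to_snake(r_name: str) -> str:
    """R-style dotted name -> snake_case (dots become underscores)."""
    return r_name.replace(".", "_")


def to_cli_flag(r_name: str) -> str:
    """R-style dotted name -> --kebab-case flag (dots and underscores become hyphens)."""
    return "--" + r_name.replace(".", "-").replace("_", "-")


# The two example names quoted in the text above.
for name in ("fix.measurement", "sample.prev"):
    print(f"{name:16} {to_snake(name):16} {to_cli_flag(name)}")
```

When in doubt about a specific flag or argument name, the help output of the frontend itself is the authority rather than this mapping.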
For the algorithmic and numerical differences
between the Rust port and R GenomicSEM — why
commonfactorGWAS uses a different parameterization, why SEs
differ, why optimizer behavior differs on Heywood cases — see ARCHITECTURE.md.
Behavioral notes (summary)
For the bulk of the pipeline (LDSC, S/V/I matrices, single-factor and
2-factor SEM, usermodel, userGWAS per-SNP
point estimates) the Rust port produces results identical to R
GenomicSEM within numerical tolerance (~1e-5 for S, ~1e-8 for V, ~1e-4
for SEM point estimates).
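A minimal sketch of how such a parity check might look, assuming the quoted tolerances are absolute element-wise bounds (the text does not say whether they are absolute or relative). The names `TOLERANCES`, `max_abs_diff`, and `within_tolerance` are hypothetical, and the numbers are made up for illustration.

```python
# Tolerances quoted above; treating them as absolute bounds is an assumption.
TOLERANCES = {"S": 1e-5, "V": 1e-8, "sem": 1e-4}


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def within_tolerance(kind, rust_values, r_values):
    """True if two flattened results agree within the tolerance for `kind`."""
    return max_abs_diff(rust_values, r_values) <= TOLERANCES[kind]


# Made-up numbers, purely for illustration.
s_rust = [0.2500004, 0.1200001, 0.3100000]
s_r = [0.2500000, 0.1200000, 0.3100002]
print(within_tolerance("S", s_rust, s_r))
```

The same pair of vectors would fail the much tighter V tolerance, which is why each quantity is checked against its own bound.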
The known places where outputs differ from R:
- commonfactorGWAS per-SNP signs and magnitudes do not match R’s commonfactorGWAS by default; they match R’s userGWAS. See ARCHITECTURE.md §3.3.
- Standard errors are sandwich (robust) SEs, not lavaan’s information-matrix SEs. See ARCHITECTURE.md §3.2.
- SEM optimizer is L-BFGS, not nlminb; it converges to the same minimum on well-conditioned problems but can diverge on Heywood cases. See ARCHITECTURE.md §3.1.
- Heywood cases are allowed by default in both packages; see ARCHITECTURE.md §3.5.
Fully implemented parameters
All core parameters for every function work identically to R GenomicSEM, including:
- ldsc: traits, sample.prev, population.prev, ld, wld, trait.names, n.blocks, chr, stand, select, chisq.max, sep_weights, ldsc.log, parallel, cores
- commonfactor: covstruc, estimation
- usermodel: covstruc, estimation, model, std.lv, fix_resid, imp_cov, Q_Factor, toler, CFIcalc
- munge: files, hm3, trait.names, N, info.filter, maf.filter, column.names, overwrite, log.name
- sumstats: files, ref, trait.names, se.logit, OLS, linprob, N, info.filter, maf.filter, keep.indel, out, ambig, betas, direct.filter
- userGWAS: covstruc, SNPs, model, estimation, GC, sub, SNPSE, smooth_check, std.lv, fix_measurement, Q_SNP, printwarn, TWAS, parallel, cores
- commonfactorGWAS: covstruc, SNPs, estimation, GC, SNPSE, smooth_check, TWAS, identification, parallel, cores
- paLDSC: covstruc, r, p, diag, save.pdf, fa, fm, nfactors, parallel, cores
- write.model: Loadings, S_LD, cutoff, fix_resid, bifactor, mustload, common
- rgmodel: LDSCoutput, model, std.lv, estimation, sub
- hdl: traits, sample.prev, population.prev, LD.path, Nref, method
- s_ldsc: traits, sample.prev, population.prev, ld, wld, frq, trait.names, n.blocks, exclude_cont, ldsc.log
- enrich: s_baseline, s_annot, v_annot, model, params, fix, std.lv, toler, fixparam, tau, rm_flank
- simLDSC: covmat, N, ld, rPheno, int, N_overlap
- multiSNP: covstruc, model, beta, se, var_snp, ld_matrix, snp_names, SNPSE
- multiGene: covstruc, model, beta, se, var_gene, ld_matrix, gene_names, GeneSE, Genelist
- summaryGLS: covstruc, results
- read_fusion: files, trait.names, binary, N, perm. Reads raw FUSION TWAS .dat output into the merged TWAS format that multiGene/userGWAS(TWAS=TRUE) consume (CLI: gsem read-fusion).
- subSV: subset vech(S) and the V block by 1-based vech positions, for TYPES/S_Stand (with-diagonal) and R (off-diagonal) numbering; implemented in gsem_matrix::vech::subset_sv. (R’s matrix-input path has an undefined-RMATRIX bug; the Rust port matches the bug-free LDSC_OBJECT path.)
- summaryGLS bands (numeric core): GLS fit + confidence-band data (predictor grid, fitted line, ±BAND_SIZE·SE envelope), with INTERCEPT/QUAD/CONTROLVARS/INTERVALS; implemented in gsem::stats::gls::summary_gls_bands. The ggplot rendering is not ported (Rust draws no plots); R additionally has a band-loop bug on the Y/V_Y input path that the port does not reproduce.
- cores/parallel: per-call thread budget for ldsc, userGWAS, commonfactorGWAS, and paLDSC. Each call builds its own local rayon pool, so concurrent calls do not share thread state and parallel=FALSE is fully scoped. Currently a no-op for munge, sumstats, and simLDSC (their underlying implementations are serial).
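The 1-based vech numbering mentioned for subSV above can be illustrated with a short sketch. It assumes column-major lower-triangle ordering for both the with-diagonal and the off-diagonal numbering; the function names are hypothetical and this is not the code of gsem_matrix::vech::subset_sv.

```python
def vech_pairs(k, with_diagonal=True):
    """1-based (row, col) pairs of the lower triangle in column-major order."""
    first = 0 if with_diagonal else 1  # skip the diagonal entry when excluded
    return [(i, j) for j in range(1, k + 1) for i in range(j + first, k + 1)]


def subset_by_position(vec, cov, positions):
    """Pick 1-based `positions` from a vech-style vector and the matching
    rows/columns of its sampling covariance matrix (a list of lists)."""
    idx = [p - 1 for p in positions]  # convert to 0-based indices
    sub_vec = [vec[i] for i in idx]
    sub_cov = [[cov[i][j] for j in idx] for i in idx]
    return sub_vec, sub_cov


# With-diagonal numbering for a 3x3 matrix: positions 1..6.
print(vech_pairs(3))
# Off-diagonal numbering for the same matrix: positions 1..3.
print(vech_pairs(3, with_diagonal=False))
```

Under this ordering, positions 1, 4, and 6 of a 3-trait vech(S) are the diagonal elements, so passing `[1, 4, 6]` to `subset_by_position` would keep the three variances and the corresponding 3x3 block of V.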
Option-level R-equivalence coverage
Behaviour-changing options are validated against the real R package
one option at a time (fixtures in tests/fixtures/,
regenerated by tests/generate_*.R):
- ldsc: stand=TRUE (S_Stand/V_Stand), select="ODD", chisq.max, and liability-scale (sample/population.prev) each have a dedicated fixture + parity test (ldsc_stand, ldsc_select_odd, ldsc_chisqmax, ldsc_liability). stand/select are binding-layer features, tested via the testthat parity suite; the rest are tested in r_validation_ldsc.rs.
- sumstats: all four standardization modes (OLS/linprob/se.logit/none) plus ambig=TRUE are numerically validated against R (sumstats_synth, sumstats_ambig). keep.indel and direct.filter are implemented but cannot be exercised on the synthetic data (it contains no indels, and its allele frequencies are already in-range), so they are covered by the in-crate unit tests of their predicates rather than an end-to-end R fixture.
- userGWAS: estimation="ML", GC={conserv,none}, Q_SNP, and std.lv each have a real-package fixture (gwas_options.json). fix_measurement=FALSE is intentionally untested: R’s free-measurement fit is computationally singular on the test subset (documented in r_validation_gwas_options.rs).
Functions not ported
These R GenomicSEM functions have no Rust
implementation and are not exposed by the CLI or the R/Python
bindings. They are outside the drop-in surface; call the original R
GenomicSEM package if you need them.
| R function | Status / reason |
|---|---|
| addSNPs | Deprecated in R GenomicSEM (superseded by userGWAS/commonfactorGWAS). |
| addGenes | Deprecated in R GenomicSEM (superseded by the TWAS path in userGWAS/multiGene). |
| indexS | Tiny vech-index lookup helper; the same column-major lower-triangle indexing is available via gsem_matrix::vech. Not exposed standalone. |
| localSRMD | Local structural-residual diagnostic; not ported. |
| qtrait | Quantitative-trait simulation helper; not ported (also removed from current upstream GenomicSEM). |
The remaining exceptions are the deprecated/diagnostic helpers above;
the behaviour-bearing TWAS on-ramp read_fusion
is ported (see "Fully implemented parameters" above).
Not yet implemented
These parameters are accepted but have no effect. An informational message is printed when they are used.
MPI
- userGWAS(MPI=TRUE) and commonfactorGWAS(MPI=TRUE): MPI distributed computing. Not applicable to the Rust backend, which uses shared-memory parallelism via rayon. For distributed workloads, split the sumstats file and run independent gsem CLI processes on each chunk, then concatenate the TSV outputs.
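The split-and-concatenate workaround described above might be sketched as follows. This assumes the sumstats file and each per-chunk output are tab-separated text with a single header line; the helper names are hypothetical, and the gsem invocation for each chunk is deliberately left out because its flags are not documented on this page.

```python
def split_with_header(lines, n_chunks):
    """Split header + data rows into up to n_chunks pieces, each keeping the header."""
    header, rows = lines[0], lines[1:]
    if not rows:
        return [[header]]
    size = -(-len(rows) // n_chunks)  # ceiling division
    return [[header] + rows[i:i + size] for i in range(0, len(rows), size)]


def concat_outputs(outputs):
    """Concatenate per-chunk outputs, keeping only the first header line."""
    merged = list(outputs[0])
    for out in outputs[1:]:
        merged.extend(out[1:])  # drop the repeated header
    return merged


# Made-up rows, purely for illustration.
lines = ["SNP\tbeta", "rs1\t0.1", "rs2\t0.2", "rs3\t0.3"]
chunks = split_with_header(lines, 2)
print(len(chunks))
print(concat_outputs(chunks) == lines)
```

Splitting by contiguous row ranges keeps the concatenated output in the same SNP order as the input, which makes it easier to compare against a single-process run.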
This page is generated from API_COMPAT.md
in the repository — edit
it there.