Command-line interface

Association analysis

Models the association between phenotypes and genotypes, accepting additional covariates and parameters to account for population structure and relatedness between samples. Users can choose between single-trait and multi-trait models, and between simple linear and linear mixed model set-ups.

usage: runAssociation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
                      [-g FILE_GENOTYPES] [--geno_delim GENOTYPES_DELIM] -o
                      OUTDIR (-st | -mt) (-lm | -lmm) [-n NAME] [-v]
                      [-k FILE_RELATEDNESS]
                      [--kinship_delim RELATEDNESS_DELIM] [-cg FILE_CG]
                      [--cg_delim CG_DELIM] [-cn FILE_CN]
                      [--cn_delim CN_DELIM] [-pcs FILE_PCS]
                      [--pcs_delim PCS_DELIM] [-c FILE_COVARIATES]
                      [--covariate_delim COVARIATE_DELIM]
                      [-adjustP {bonferroni,effective,None}]
                      [-nrpermutations NRPERMUTATIONS] [-fdr FDR] [-seed SEED]
                      [-tr {scale,gaussian}] [-reg] [-traitset TRAITSTRING]
                      [-nrpcs NRPCS]
                      [--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
                      [--plot] [-colourS COLOURS] [-colourNS COLOURNS]
                      [-alphaS ALPHAS] [-alphaNS ALPHANS] [-thr THR]
                      [--version]

Basic required arguments

-p, --file_pheno
 Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None
--pheno_delim Delimiter of phenotype file. Default: “,”
-g, --file_geno
 Genotype file: either [S x N] .csv file (first column: SNP IDs, first row: sample IDs) or PLINK-formatted genotypes (.bim/.fam/.bed). Default: None
--geno_delim Delimiter of genotype file (if not in plink format). Default: “,”
-o, --outdir Path [string] of output directory; user needs writing permission. Default: None
-st, --singletrait
 Set flag to conduct a single-trait association analysis. Default: False
-mt, --multitrait
 Set flag to conduct a multi-trait association analysis. Default: False
-lm, --lm Set flag to use a simple linear model for the association analysis
-lmm, --lmm Set flag to use a linear mixed model for the association analysis
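
The file layouts described above can be illustrated with a minimal Python sketch. It writes a toy phenotype file (samples in rows, phenotype IDs in the first row) and a toy genotype file (SNPs in rows, sample IDs in the first row), then assembles an argument list for a single-trait, simple linear model run. All IDs, values, file names, the corner-cell labels and the 0/1/2 genotype coding are assumptions made for illustration; only the flags are taken from the usage text.

```python
import csv
import os
import tempfile

def write_matrix(path, corner, col_ids, row_ids, rows, delim=","):
    """Write a delimited matrix with an ID header row and an ID first column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delim)
        writer.writerow([corner] + list(col_ids))
        for row_id, values in zip(row_ids, rows):
            writer.writerow([row_id] + list(values))

samples = ["ID1", "ID2", "ID3"]       # N = 3 samples (made-up IDs)
traits = ["trait1", "trait2"]         # P = 2 phenotypes (made-up IDs)
snps = ["rs1", "rs2", "rs3", "rs4"]   # S = 4 SNPs (made-up IDs)

pheno = [[0.1, 1.2], [0.4, 0.9], [-0.3, 1.5]]        # one row per sample
geno = [[0, 1, 2], [1, 1, 0], [2, 0, 0], [0, 0, 1]]  # one row per SNP (assumed coding)

workdir = tempfile.mkdtemp()
pheno_path = os.path.join(workdir, "pheno.csv")
geno_path = os.path.join(workdir, "geno.csv")
write_matrix(pheno_path, "sample", traits, samples, pheno)
write_matrix(geno_path, "snp", samples, snps, geno)

# Argument list only; the command is printed, not executed.
command = ["runAssociation", "-p", pheno_path, "-g", geno_path,
           "-o", workdir, "-st", "-lm"]
print(" ".join(command))
```

The phenotype file written here has N+1 rows and P+1 columns, matching the [(N+1) x (P+1)] layout given for -p.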

Output arguments

-n, --name Name (used for output file naming). Default:
-v, --verbose [bool]: Should analysis progress be displayed? Default: False

Optional files

-k, --file_kinship
 Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lm. Default: None
--kinship_delim
 Delimiter of kinship file. Default: “,”
-cg, --file_cg Required for large phenotype sizes when --lmm/-lm; computed via runLiMMBo; specifies file name for genetic trait covariance matrix (rows: traits, columns: traits). Default: None
--cg_delim Delimiter of Cg file. Default: “,”
-cn, --file_cn Required for large phenotype sizes when --lmm/-lm; computed via runLiMMBo; specifies file name for non-genetic trait covariance matrix (rows: traits, columns: traits). Default: None
--cn_delim Delimiter of Cn file. Default: “,”
-pcs, --file_pcs
 Path to [N x PCs] file of principal components from genotypes to be included as covariates (first column: sample IDs, first row: PC IDs). Default: None
--pcs_delim Delimiter of PCs file. Default: “,”
-c, --file_cov Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None
--covariate_delim
 Delimiter of covariates file. Default: “,”

Optional association parameters

-adjustP, --adjustP
 Possible choices: bonferroni, effective, None

 Method to adjust single-trait p-values for multiple hypothesis testing when running multiple single-trait GWAS: bonferroni, or effective number of tests (Galwey, 2009, http://onlinelibrary.wiley.com/doi/10.1002/gepi.20408/abstract). Default: None

-nrpermutations, --nrpermutations
 Number of permutations for computing empirical p-values; 1/nrpermutations is the smallest p-value, and hence the most stringent significance level, that can be tested. Default: None
-fdr, --fdr FDR threshold for computing empirical FDR. Default: None
-seed, --seed Seed [int] to initiate random number generation for permutations. Default: 256
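
As a rough illustration of two of the parameters above, the following Python sketch shows a generic Bonferroni adjustment and the resolution limit implied by the number of permutations. It is a textbook version written for this page, not the tool's own code; the function names and the example p-values are hypothetical.

```python
def bonferroni(pvalues, n_tests):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    return [min(1.0, p * n_tests) for p in pvalues]

def smallest_empirical_p(nrpermutations):
    """Smallest empirical p-value that nrpermutations permutations can resolve."""
    return 1.0 / nrpermutations

# Hypothetical p-values of one SNP tested against four traits separately.
raw = [0.001, 0.02, 0.3, 0.5]
adjusted = bonferroni(raw, n_tests=len(raw))
print(adjusted)

# With 1000 permutations, no empirical p-value below 0.001 can be reported.
print(smallest_empirical_p(1000))
```

The "effective" choice replaces the raw number of tests with an estimate of the effective number of independent tests; that estimate is not sketched here.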

Optional data processing parameters

-tr, --transform_method
 Possible choices: scale, gaussian

 Choose type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale”

-reg, --reg_covariates
 [bool]: should covariates be regressed out? Default: False
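
The two transformation choices can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's implementation: the sample standard deviation is used for scaling, and the rank-based inverse normal transformation uses an offset of 0.5 and does not treat ties specially. Function names and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def scale(values):
    """Mean-centre and divide by the sample standard deviation (assumed variant)."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]

def rank_inverse_normal(values, offset=0.5):
    """Map ranks to standard-normal quantiles; ties are not handled specially."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    for rank, index in enumerate(order, start=1):
        # (rank - offset) / n lies strictly between 0 and 1 for offset = 0.5
        transformed[index] = NormalDist().inv_cdf((rank - offset) / n)
    return transformed

trait = [2.3, 0.4, 5.1, 3.3, 1.8]   # made-up values for one trait
print(scale(trait))
print(rank_inverse_normal(trait))
```

Scaling keeps the shape of the trait distribution, whereas the rank-based transformation forces an approximately normal shape and keeps only the ordering of the samples.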

Optional subsetting options

-traitset, --traitset
 Comma- (for list of traits) or hyphen- (for trait range) or comma- and hyphen-separated list [string] of traits (trait columns) to choose. Default: None (= all traits)
-nrpcs, --nrpcs
 Number of first PCs to choose. Default: 10
--file_samplelist
 Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None
--samplelist Comma-separated list [string] of sample IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None
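
To make the trait string syntax for -traitset concrete, here is a small Python sketch of how a string mixing commas and hyphens could be expanded. It assumes that ranges are inclusive and that the numbers are used exactly as given; whether the tool counts trait columns from 0 or 1 is not stated above, so the function (a hypothetical helper, not part of the package) makes no claim about that.

```python
def parse_traitset(traitstring):
    """Expand a string such as '1,3-5,8' into a sorted list of trait numbers."""
    selected = set()
    for part in traitstring.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Hyphen: inclusive range (assumption)
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid trait range: {part}")
            selected.update(range(start, end + 1))
        else:
            # Plain number: single trait
            selected.add(int(part))
    return sorted(selected)

print(parse_traitset("1,3-5,8"))
```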

Plot arguments

Arguments for depicting GWAS results as a Manhattan plot

--plot Set flag if results of association analysis should be depicted as Manhattan and quantile-quantile plots
-colourS, --colourS
 Colour of significant points in Manhattan plot
-colourNS, --colourNS
 Colour of non-significant points in Manhattan plot
-alphaS, --alphaS
 Transparency of significant points in Manhattan plot
-alphaNS, --alphaNS
 Transparency of non-significant points in Manhattan plot
-thr, --threshold
 Significance threshold; when --fdr is specified, the empirical FDR is used as threshold

Version

--version show program’s version number and exit

Variance decomposition

Estimates the genetic and non-genetic trait covariance matrix parameters of a linear mixed model with random genetic and non-genetic effects via a bootstrapping-based approach.

usage: runVarianceEstimation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
                             [-k FILE_RELATEDNESS]
                             [--kinship_delim RELATEDNESS_DELIM] -o OUTDIR
                             [-c FILE_COVARIATES]
                             [--covariate_delim COVARIATE_DELIM] [-seed SEED]
                             -sp S [-r RUNS] [-t]
                             [--minCooccurrence MINCOOCCURRENCE]
                             [-i ITERATIONS] [-cpus CPUS]
                             [-tr {scale,gaussian}] [-reg]
                             [-traitset TRAITSTRING]
                             [--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
                             [-dontSaveIntermediate] [-v] [--version]

Basic required arguments

-p, --file_pheno
 Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None
--pheno_delim Delimiter of phenotype file. Default: “,”
-k, --file_kinship
 Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lm. Default: None
--kinship_delim
 Delimiter of kinship file. Default: “,”
-o, --outdir Path [string] of output directory; user needs writing permission. Default: None

Optional files

-c, --file_cov Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None
--covariate_delim
 Delimiter of covariates file. Default: “,”

Bootstrapping parameters

-seed, --seed Seed [int] used to generate bootstrap matrix. Default: 234
-sp, --smallp Size [int] of phenotype subsamples used for variance decomposition. Default: None
-r, --runs Total number [int] of bootstrap runs. Default: None
-t, --timing [bool]: Should variance decomposition be timed? Default: False
--minCooccurrence
 Minimum count [int] of the pairwise sampling of any given trait pair. Default: 3
-i, --iterations
 Number [int] of iterations for variance decomposition attempts. Default: 10
-cpus, --cpus Number [int] of available CPUs for parallelisation of variance decomposition steps. Default: None
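
One way to read the interplay of -sp, -r and --minCooccurrence is sketched below in Python: trait subsets of size sp are drawn at random until every pair of traits has been sampled together at least the minimum number of times. This is an assumption about the general idea, written for illustration only; the function name, its max_runs guard and the sampling scheme are hypothetical and may differ from what the tool does.

```python
import random
from itertools import combinations

def draw_trait_subsets(n_traits, smallp, min_cooccurrence, seed=234, max_runs=100000):
    """Draw random trait subsets until every trait pair has co-occurred often enough."""
    rng = random.Random(seed)
    counts = {pair: 0 for pair in combinations(range(n_traits), 2)}
    subsets = []
    while min(counts.values()) < min_cooccurrence:
        if len(subsets) >= max_runs:
            raise RuntimeError("co-occurrence target not reached")
        # One bootstrap run: smallp distinct traits out of n_traits
        subset = sorted(rng.sample(range(n_traits), smallp))
        subsets.append(subset)
        for pair in combinations(subset, 2):
            counts[pair] += 1
    return subsets, counts

subsets, counts = draw_trait_subsets(n_traits=10, smallp=4, min_cooccurrence=3)
print(len(subsets), min(counts.values()))
```

In this reading, a smaller subsample size needs more runs before all trait pairs are covered, which is why the number of runs and the minimum co-occurrence are linked.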

Optional data processing parameters

-tr, --transform_method
 Possible choices: scale, gaussian

 Choose type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale”

-reg, --reg_covariates
 [bool]: should covariates be regressed out? Default: False

Optional subsetting options

-traitset, --traitset
 Comma- (for list of traits) or hyphen- (for trait range) or comma- and hyphen-separated list [string] of traits (trait columns) to choose. Default: None (= all traits)
--file_samplelist
 Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None
--samplelist Comma-separated list [string] of sample IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None

Output arguments

-dontSaveIntermediate, --dontSaveIntermediate
 Set to suppress saving intermediate variance components. Default: True
-v, --verbose [bool]: Should analysis step description be printed? Default: False

Version

--version show program’s version number and exit