Command-line interface
Association analysis
Models the association between phenotypes and genotypes, accepting additional covariates and parameters to account for population structure and relatedness between samples. Users can choose between single-trait and multi-trait models, and between a simple linear model and a linear mixed model.
usage: runAssociation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
[-g FILE_GENOTYPES] [--geno_delim GENOTYPES_DELIM] -o
OUTDIR (-st | -mt) (-lm | -lmm) [-n NAME] [-v]
[-k FILE_RELATEDNESS]
[--kinship_delim RELATEDNESS_DELIM] [-cg FILE_CG]
[--cg_delim CG_DELIM] [-cn FILE_CN]
[--cn_delim CN_DELIM] [-pcs FILE_PCS]
[--pcs_delim PCS_DELIM] [-c FILE_COVARIATES]
[--covariate_delim COVARIATE_DELIM]
[-adjustP {bonferroni,effective,None}]
[-nrpermutations NRPERMUTATIONS] [-fdr FDR] [-seed SEED]
[-tr {scale,gaussian}] [-reg] [-traitset TRAITSTRING]
[-nrpcs NRPCS]
[--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
[--plot] [-colourS COLOURS] [-colourNS COLOURNS]
[-alphaS ALPHAS] [-alphaNS ALPHANS] [-thr THR]
[--version]
Basic required arguments
| -p, --file_pheno | Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None |
| --pheno_delim | Delimiter of phenotype file. Default: “,” |
| -g, --file_geno | Genotype file: either [S x N] .csv file (first column: SNP IDs, first row: sample IDs) or PLINK-formatted genotypes (.bim/.fam/.bed). Default: None |
| --geno_delim | Delimiter of genotype file (if not in PLINK format). Default: “,” |
| -o, --outdir | Path [string] of output directory; user needs write permission. Default: None |
| -st, --singletrait | Set flag to conduct a single-trait association analysis. Default: False |
| -mt, --multitrait | Set flag to conduct a multi-trait association analysis. Default: False |
| -lm, --lm | Set flag to use a simple linear model for the association analysis |
| -lmm, --lmm | Set flag to use a linear mixed model for the association analysis |
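The file layouts described above can be illustrated with a small sketch that writes a phenotype table ([N] samples in rows, [P] phenotypes in columns) and a genotype table ([S] SNPs in rows, [N] samples in columns) in the default comma-delimited form. The IDs, values, the 0/1/2 genotype coding and the empty top-left cell are assumptions made for illustration; the helper `write_matrix` is hypothetical and not part of the tool.

```python
import csv
import io

def write_matrix(handle, row_ids, col_ids, values, corner="", delim=","):
    """Write a labelled matrix: a header row of column IDs, then one row per row ID."""
    writer = csv.writer(handle, delimiter=delim, lineterminator="\n")
    writer.writerow([corner] + list(col_ids))
    for row_id, row in zip(row_ids, values):
        writer.writerow([row_id] + list(row))

samples = ["ID1", "ID2", "ID3"]      # N = 3 samples (hypothetical IDs)
traits = ["trait1", "trait2"]        # P = 2 phenotypes (hypothetical IDs)
snps = ["rs1", "rs2", "rs3", "rs4"]  # S = 4 SNPs (hypothetical IDs)

pheno = [[0.1, 1.2], [0.4, 0.9], [-0.3, 1.5]]        # N rows x P columns
geno = [[0, 1, 2], [1, 1, 0], [2, 0, 0], [0, 0, 1]]  # S rows x N columns (assumed 0/1/2 coding)

# In-memory buffers stand in for the files passed to -p and -g.
pheno_buf, geno_buf = io.StringIO(), io.StringIO()
write_matrix(pheno_buf, samples, traits, pheno)
write_matrix(geno_buf, snps, samples, geno)

print(pheno_buf.getvalue())
print(geno_buf.getvalue())
```

Passing `delim="\t"` to the helper would produce a tab-delimited file, which would then need the matching `--pheno_delim` or `--geno_delim` setting.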
Output arguments
| -n, --name | Name (used for output file naming). Default: |
| -v, --verbose | [bool]: should analysis progress be displayed. Default: False |
Optional files
| -k, --file_kinship | Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lmm. Default: None |
| --kinship_delim | Delimiter of kinship file. Default: “,” |
| -cg, --file_cg | File name of the genetic trait covariance matrix (rows: traits, columns: traits), computed via runLiMMBo. Required for large phenotype sizes when --lmm/-lmm. Default: None |
| --cg_delim | Delimiter of Cg file. Default: “,” |
| -cn, --file_cn | File name of the non-genetic trait covariance matrix (rows: traits, columns: traits), computed via runLiMMBo. Required for large phenotype sizes when --lmm/-lmm. Default: None |
| --cn_delim | Delimiter of Cn file. Default: “,” |
| -pcs, --file_pcs | Path to [N x PCs] file of principal components from genotypes to be included as covariates (first column: sample IDs, first row: PC IDs). Default: None |
| --pcs_delim | Delimiter of PCs file. Default: “,” |
| -c, --file_cov | Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None |
| --covariate_delim | Delimiter of covariates file. Default: “,” |
Optional association parameters
| -adjustP, --adjustP | Possible choices: bonferroni, effective, None. Method to adjust single-trait p-values for multiple hypothesis testing when running multiple single-trait GWAS: bonferroni or effective number of tests (Galwey, 2009; http://onlinelibrary.wiley.com/doi/10.1002/gepi.20408/abstract). Default: None |
| -nrpermutations, --nrpermutations | Number of permutations for computing empirical p-values; 1/nrpermutations is the smallest empirical p-value that can be resolved. Default: None |
| -fdr, --fdr | FDR threshold for computing empirical FDR. Default: None |
| -seed, --seed | Seed [int] to initiate random number generation for permutations. Default: 256 |
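As a rough illustration of two of the parameters above, the sketch below applies a textbook Bonferroni correction and computes the resolution limit implied by a given number of permutations. It is not the tool's implementation: the function names are hypothetical, and taking the number of tests to be the number of single-trait analyses is an assumption.

```python
def bonferroni(pvalues, n_tests):
    """Textbook Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    return [min(p * n_tests, 1.0) for p in pvalues]

def smallest_empirical_p(nrpermutations):
    """Resolution of permutation-based p-values: nothing below 1/nrpermutations is measurable."""
    if nrpermutations < 1:
        raise ValueError("nrpermutations must be a positive integer")
    return 1.0 / nrpermutations

# Assumption: four single-trait GWAS were run, so n_tests = 4.
adjusted = bonferroni([0.01, 0.5, 0.2], n_tests=4)
print(adjusted)
print(smallest_empirical_p(1000))
```

In practice this means that a target significance level below 1/nrpermutations cannot be reached, however strong the association.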
Optional data processing parameters
| -tr, --transform_method | Possible choices: scale, gaussian. Type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale” |
| -reg, --reg_covariates | [bool]: should covariates be regressed out? Default: False |
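The two transformation choices can be sketched for a single trait as follows. This is an illustration under stated assumptions, not the tool's code: `scale` uses the population standard deviation, and `inverse_normalise` uses a rank-based inverse normal transform with quantiles rank/(n+1) and ties broken by input order. The tool may use a different sd estimator or rank offset.

```python
from statistics import NormalDist, fmean, pstdev

def scale(values):
    """Mean-center and divide by the standard deviation (population sd assumed)."""
    mu = fmean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]

def inverse_normalise(values):
    """Rank-based inverse normal transform: map ranks to standard normal quantiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank  # ties are broken by input order in this sketch
    nd = NormalDist()
    return [nd.inv_cdf(r / (n + 1)) for r in ranks]

trait = [3.0, 1.0, 2.0, 10.0]  # hypothetical phenotype values with one outlier
print(scale(trait))
print(inverse_normalise(trait))
```

The rank-based transform depends only on the ordering of the values, so the outlier in the example has no more influence than any other largest value would.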
Optional subsetting options
| -traitset, --traitset | Comma-separated list of traits, hyphen-separated trait range, or a combination of both [string], selecting the traits (trait columns) to analyse. Default: None (= all traits) |
| -nrpcs, --nrpcs | Number of leading PCs to choose. Default: 10 |
| --file_samplelist | Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None |
| --samplelist | Comma-separated list [string] of sample IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None |
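To make the trait selection syntax concrete, here is a minimal parser for a string that mixes commas and hyphens. It assumes that traits are given as integer column indices and that ranges include both end points; neither the index base nor the inclusivity is stated above, so check them against the tool before relying on this. The function name is hypothetical.

```python
def parse_traitset(traitstring):
    """Expand a string such as '1,3-5,8' into a list of integer trait indices."""
    indices = []
    for part in traitstring.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-"))
            # Assumption: ranges are inclusive on both ends.
            indices.extend(range(start, end + 1))
        else:
            indices.append(int(part))
    return indices

print(parse_traitset("1,3-5,8"))
```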
Plot arguments
Arguments for depicting GWAS results as a Manhattan plot
| --plot | Set flag if results of association analysis should be depicted as Manhattan and quantile-quantile plots |
| -colourS, --colourS | Colour of significant points in Manhattan plot |
| -colourNS, --colourNS | Colour of non-significant points in Manhattan plot |
| -alphaS, --alphaS | Transparency of significant points in Manhattan plot |
| -alphaNS, --alphaNS | Transparency of non-significant points in Manhattan plot |
| -thr, --threshold | Significance threshold; when --fdr is specified, the empirical FDR is used as threshold |
Version
| --version | show program’s version number and exit |
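The arguments above can be combined into a single call. The sketch below assembles a runAssociation argument list and enforces the two mutually exclusive choices shown in the usage line, (-st | -mt) and (-lm | -lmm). The helper `association_cmd` and all file names are hypothetical, and requiring a kinship file for -lmm reflects one reading of the -k description rather than confirmed behaviour. The command is only printed, not executed.

```python
import shlex

def association_cmd(pheno, geno, outdir, trait_mode, model, kinship=None, extra=()):
    """Assemble a runAssociation argument list from the documented flags."""
    if trait_mode not in ("-st", "-mt"):
        raise ValueError("trait_mode must be -st or -mt")
    if model not in ("-lm", "-lmm"):
        raise ValueError("model must be -lm or -lmm")
    if model == "-lmm" and kinship is None:
        # Assumption: the kinship requirement applies to the linear mixed model.
        raise ValueError("-lmm needs a kinship file (-k)")
    cmd = ["runAssociation", "-p", pheno, "-g", geno, "-o", outdir, trait_mode, model]
    if kinship is not None:
        cmd += ["-k", kinship]
    cmd += list(extra)
    return cmd

# Hypothetical file names; single-trait LMM with Bonferroni adjustment and plots.
cmd = association_cmd("pheno.csv", "geno.csv", "results", "-st", "-lmm",
                      kinship="kinship.csv",
                      extra=["-adjustP", "bonferroni", "--plot"])
print(shlex.join(cmd))
```

Building the call as a list keeps each flag and its value separate, which avoids quoting problems if the list is later handed to `subprocess.run`.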
Variance decomposition
Estimates the genetic and non-genetic trait covariance matrix parameters of a linear mixed model with random genetic and non-genetic effects via a bootstrapping-based approach.
usage: runVarianceEstimation [-h] [-p FILE_PHENO] [--pheno_delim PHENO_DELIM]
[-k FILE_RELATEDNESS]
[--kinship_delim RELATEDNESS_DELIM] -o OUTDIR
[-c FILE_COVARIATES]
[--covariate_delim COVARIATE_DELIM] [-seed SEED]
-sp S [-r RUNS] [-t]
[--minCooccurrence MINCOOCCURRENCE]
[-i ITERATIONS] [-cpus CPUS]
[-tr {scale,gaussian}] [-reg]
[-traitset TRAITSTRING]
[--file_samplelist FILE_SAMPLELIST | --samplelist SAMPLELIST]
[-dontSaveIntermediate] [-v] [--version]
Basic required arguments
| -p, --file_pheno | Path [string] to [(N+1) x (P+1)] .csv file of [P] phenotypes with [N] samples (first column: sample IDs, first row: phenotype IDs). Default: None |
| --pheno_delim | Delimiter of phenotype file. Default: “,” |
| -k, --file_kinship | Path [string] to [N x (N+1)] file of kinship/relatedness matrix with [N] samples (first row: sample IDs). Required when --lmm/-lmm. Default: None |
| --kinship_delim | Delimiter of kinship file. Default: “,” |
| -o, --outdir | Path [string] of output directory; user needs write permission. Default: None |
Optional files
| -c, --file_cov | Path [string] to [(N+1) x C] file of covariates matrix with [N] samples and [C] covariates (first column: sample IDs, first row: covariate IDs). Default: None |
| --covariate_delim | Delimiter of covariates file. Default: “,” |
Bootstrapping parameters
| -seed, --seed | Seed [int] used to generate the bootstrap matrix. Default: 234 |
| -sp, --smallp | Size [int] of phenotype subsamples used for variance decomposition. Default: None |
| -r, --runs | Total number [int] of bootstrap runs. Default: None |
| -t, --timing | [bool]: should variance decomposition be timed. Default: False |
| --minCooccurrence | Minimum count [int] of the pairwise sampling of any given trait pair. Default: 3 |
| -i, --iterations | Number [int] of iterations for variance decomposition attempts. Default: 10 |
| -cpus, --cpus | Number [int] of available CPUs for parallelisation of variance decomposition steps. Default: None |
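The relationship between the subsample size and the minimum pairwise count can be illustrated with a toy sampler: it keeps drawing random trait subsets of size s until every pair of traits has been drawn together at least the required number of times. This is only a sketch of the idea, not the tool's sampling scheme; the function name is hypothetical, and how the -r option interacts with the co-occurrence requirement is not covered here.

```python
import itertools
import random

def draw_trait_subsets(n_traits, s, min_cooccurrence=3, seed=234):
    """Draw size-s trait subsets until each trait pair co-occurs often enough."""
    if not 2 <= s <= n_traits:
        raise ValueError("need 2 <= s <= n_traits")
    rng = random.Random(seed)
    counts = {pair: 0 for pair in itertools.combinations(range(n_traits), 2)}
    subsets = []
    while min(counts.values()) < min_cooccurrence:
        subset = sorted(rng.sample(range(n_traits), s))
        subsets.append(subset)
        # Count every pair of traits that appears together in this subset.
        for pair in itertools.combinations(subset, 2):
            counts[pair] += 1
    return subsets, counts

subsets, counts = draw_trait_subsets(n_traits=8, s=4)
print(len(subsets), "subsets drawn; minimum pair count:", min(counts.values()))
```

Smaller subsamples relative to the total number of traits need more draws before every pair has been seen often enough, which is the trade-off between the size of each decomposition and the number of bootstrap runs.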
Optional data processing parameters
| -tr, --transform_method | Possible choices: scale, gaussian. Type [string] of data preprocessing: scale (mean center, divide by sd) or gaussian (inverse normalise). Default: “scale” |
| -reg, --reg_covariates | [bool]: should covariates be regressed out? Default: False |
Optional subsetting options
| -traitset, --traitset | Comma-separated list of traits, hyphen-separated trait range, or a combination of both [string], selecting the traits (trait columns) to analyse. Default: None (= all traits) |
| --file_samplelist | Path [string] to file with samplelist for sample selection, with one sample ID per line. Default: None |
| --samplelist | Comma-separated list [string] of sample IDs to restrict analysis to, e.g. ID1,ID2,ID5,ID9,ID10. Default: None |
Output arguments
| -dontSaveIntermediate, --dontSaveIntermediate | Set to suppress saving intermediate variance components. Default: True |
| -v, --verbose | [bool]: should analysis step description be printed. Default: False |
Version
| --version | show program’s version number and exit |
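A possible two-step workflow, sketched with the flags documented on this page: first estimate the trait covariance matrices, then pass them to a multi-trait linear mixed model via -cg and -cn. The -cg/-cn descriptions say the matrices are computed via runLiMMBo; this sketch assumes that refers to the variance decomposition step described here. All file names, the subsample size of 10 and the Cg/Cn paths are placeholders, since the names of the generated output files are not given above. The commands are only printed, not executed.

```python
import shlex

outdir = "results"  # hypothetical output directory

# Step 1: variance decomposition; -sp is required according to the usage line.
variance_cmd = [
    "runVarianceEstimation",
    "-p", "pheno.csv", "-k", "kinship.csv", "-o", outdir,
    "-sp", "10",       # hypothetical subsample size
    "-seed", "234",    # documented default, written out explicitly
]

# Placeholders: replace with the covariance files actually produced in step 1.
cg_file = "PATH/TO/Cg.csv"
cn_file = "PATH/TO/Cn.csv"

# Step 2: multi-trait LMM association using the estimated covariance matrices.
association_cmd = [
    "runAssociation",
    "-p", "pheno.csv", "-g", "geno.csv", "-o", outdir,
    "-mt", "-lmm", "-k", "kinship.csv",
    "-cg", cg_file, "-cn", cn_file,
]

for cmd in (variance_cmd, association_cmd):
    print(shlex.join(cmd))
```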