Variance decomposition¶
LiMMBo¶
-
class
limmbo.core.vdbootstrap.
LiMMBo
(datainput, S, timing=False, iterations=10, verbose=False)[source]¶ -
combineBootstrap
(results)[source]¶ Combine the [S x S] subset covariance matrices to find the overall [P x P] covariance matrices Cg and Cn.
Parameters: results (list) – results of runBootstrapVarianceDecomposition() Returns: dictionary containing: - Cg_fit (numpy.array): [P x P] genetic covariance matrix via fitting
- Cn_fit (numpy.array): [P x P] noise covariance matrix via fitting
- Cg_average (numpy.array): [P x P] genetic covariance matrix via simple average
- Cn_average (numpy.array): [P x P] noise covariance matrix via simple average
- Cg_all_bs (numpy.array): [runs x S x S] genetic covariance matrices of runs phenotype subsets of size S
- Cn_all_bs (numpy.array): [runs x S x S] noise covariance matrices of runs phenotype subsets of size S
- proc_time_ind_bs (list): individual run times for all variance decomposition runs of [S x S] Cg and Cn
- proc_time_sum_ind_bs (list): sum of individual run times for all variance decomposition runs of [S x S] Cg and Cn
- proc_time_combine_bs (list): run time for finding [P x P] trait covariance estimates from fitting [S x S] bootstrap covariance estimates
- nr_of_bs (int): number of bootstrap runs
- nr_of_successful_bs (int): total number of successful bootstrapping runs i.e. variance decomposition converged
- results_fit_Cg (): results parameters of the bfgs-fit of the genetic covariance matrices (via scipy.optimize.fmin_l_bfgs_g)
- results_fit_Cn (): results parameters of the bfgs-fit of the noise covariance matrices (via scipy.optimize.fmin_l_bfgs_g)
Return type: (dictionary)
-
runBootstrapCovarianceEstimation
(seed, cpus, minCooccurrence=3, n=None)[source]¶ Distribute variance decomposition of subset matrices via pp
Parameters: - seed (int) – seed to initialise random number generator for bootstrapping
- minCooccurrence (int) – minimum number a trait pair should be sampled; once reached for all trait pairs, sampling is stopped if n is None; default=3
- n (int) – if not None, sets the total number of permutations, otherwise n determined by minCooccurrence; default: None
- cpus (int) – number of cpus available for covariance estimation
Returns: list containing variance components for all bootstrap runs
Return type: (list)
-
saveVarianceComponents
(resultsCombineBootstrap, output, intermediate=True)[source]¶ Save variance components as comma-separated files or python objects (via Cpickle).
Parameters: - resultsCombineBootstrap (dictionary) –
- Cg_fit (numpy.array): [P x P] genetic covariance matrix via fitting
- Cn_fit (numpy.array): [P x P] noise covariance matrix via fitting
- Cg_average (numpy.array): [P x P] genetic covariance matrix via simple average
- Cn_average (numpy.array): [P x P] noise covariance matrix via simple average
- Cg_all_bs (numpy.array): [runs x S x S] genetic covariance matrices of runs phenotype subsets of size S
- Cn_all_bs (numpy.array): [runs x S x S] noise covariance matrices of runs phenotype subsets of size S
- proc_time_ind_bs (list): individual run times for all variance decomposition runs of [S x S] Cg and Cn
- proc_time_sum_ind_bs (list): sum of individual run times for all variance decomposition runs of [S x S] Cg and Cn
- proc_time_combine_bs (list): run time for finding [P x P] trait covariance estimates from fitting [S x S] bootstrap covariance estimates
- nr_of_bs (int): number of bootstrap runs
- nr_of_successful_bs (int): total number of successful bootstrapping runs i.e. variance decomposition converged
- results_fit_Cg (): results parameters of the bfgs-fit of the genetic covariance matrices (via scipy.optimize.fmin_l_bfgs_g)
- results_fit_Cn (): results parameters of the bfgs-fit of the noise covariance matrices (via scipy.optimize.fmin_l_bfgs_g)
- output (string) – path/to/directory where variance components will be saved; needs writing permission
- intermediate (bool) – if set to True, intermediate variance components (average covariance matrices, bootstrap matrices and results parameter of BFGS fit) are saved
Returns: None
- resultsCombineBootstrap (dictionary) –
-
Standard REML¶
-
limmbo.core.vdsimple.
vd_reml
(datainput, iterations=10, verbose=True)[source]¶ Compute variance decomposition of phenotypes into genetic and noise covariance via standard REML: approach implemented in LIMIX
Parameters: - datainput (
InputData
) – object with ID-matched [N x P] phenotypes with [N] individuals and [P] phenotypes and [N x N] relatedness estimates - output (string, optional) – output directory with user-writing permissions; needed if caching is True
- cache (bool, optional) – should results be cached
- verbose (bool, optional) – should messages be printed to stdout
Returns: tuple containing:
- Cg (numpy.array): [P x P] genetic variance component
- Cn (numpy.array): [P x P] noise variance component
- process_time (double): cpu time of variance decomposition
Return type: (tuple)
Examples:
>>> import numpy >>> from numpy.random import RandomState >>> from numpy.linalg import cholesky as chol >>> from limmbo.core.vdsimple import vd_reml >>> from limmbo.io.input import InputData >>> random = RandomState(10) >>> N = 100 >>> S = 1000 >>> P = 5 >>> snps = (random.rand(N, S) < 0.2).astype(float) >>> kinship = numpy.dot(snps, snps.T) / float(10) >>> y = random.randn(N, P) >>> pheno = numpy.dot(chol(kinship), y) >>> pheno_ID = [ 'PID{}'.format(x+1) for x in range(P)] >>> samples = [ 'SID{}'.format(x+1) for x in range(N)] >>> datainput = InputData() >>> datainput.addPhenotypes(phenotypes = pheno, ... phenotype_ID = pheno_ID, ... pheno_samples = samples) >>> datainput.addRelatedness(relatedness = kinship, ... relatedness_samples = samples) >>> Cg, Cn, ptime = vd_reml(datainput, verbose=False) >>> Cg.shape (5, 5)
- datainput (