MVN density computation and random number generation

Let \(x\in\mathbb{R}^{M}\) be a realization of the random variable \(X\sim\mathbf{MVN}\!\left(\mu,\Sigma\right)\), where \(\mu\in\mathbb{R}^{M}\) is the mean vector, \(\Sigma\in\mathbb{R}^{M\times M}\) is a positive-definite covariance matrix, and \(\Sigma^{-1}\in\mathbb{R}^{M\times M}\) is the corresponding positive-definite precision matrix.

The log probability density of \(x\) is

\[\begin{aligned} \log f(x)&=-\frac{1}{2}\left(M \log (2\pi) + \log|\Sigma| +z^\top z\right),\quad\text{where}~z^\top z=\left(x-\mu\right)^\top\Sigma^{-1}\left(x-\mu\right) \end{aligned}\]

The two computationally intensive steps in evaluating \(\log f(x)\) are computing \(\log|\Sigma|\) and \(z^\top z\); the goal is to perform both without explicitly inverting \(\Sigma\) or repeating mathematical operations. How one performs these steps efficiently in practice depends on whether the covariance matrix \(\Sigma\) or the precision matrix \(\Sigma^{-1}\) is available. In both cases, we start by finding a lower triangular matrix root: \(\Sigma=LL^\top\) or \(\Sigma^{-1}=\Lambda\Lambda^\top\). Since \(\Sigma\) and \(\Sigma^{-1}\) are positive definite, we use the Cholesky decomposition, which is the unique lower triangular root with strictly positive diagonal elements.
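
As a concrete illustration, the factorization step might be sketched in Python with NumPy as follows; the names `Sigma`, `Q`, `L`, and `Lam`, and the construction of the example matrix, are assumptions made for this sketch rather than part of the algorithm above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a small positive-definite covariance matrix for illustration.
M = 4
A = rng.standard_normal((M, M))
Sigma = A @ A.T + M * np.eye(M)

# Covariance case: lower triangular factor with Sigma = L @ L.T.
L = np.linalg.cholesky(Sigma)

# Precision case: factor Sigma^{-1} = Lam @ Lam.T. The explicit inverse
# here only sets up the example; in practice the precision matrix would
# be given directly.
Q = np.linalg.inv(Sigma)
Lam = np.linalg.cholesky(Q)
```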

With the Cholesky decomposition in hand, we compute the log determinant of \(\Sigma\) by summing the logarithms of the diagonal elements of the relevant factor. \[\begin{aligned} \label{eq:logDet} \log|\Sigma|= \begin{cases} \phantom{-}2\sum_{m=1}^M\log L_{mm}&\text{ when $\Sigma$ is given}\\ -2\sum_{m=1}^M\log \Lambda_{mm}&\text{ when $\Sigma^{-1}$ is given} \end{cases}\end{aligned}\]
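
Continuing the sketch above, the log determinant follows directly from the diagonal of whichever factor is available:

```python
# Covariance given: log|Sigma| = 2 * sum(log diag(L)).
logdet_Sigma = 2.0 * np.sum(np.log(np.diag(L)))

# Precision given: log|Sigma| = -2 * sum(log diag(Lam)).
logdet_from_prec = -2.0 * np.sum(np.log(np.diag(Lam)))

# The two routes agree up to floating-point error.
assert np.isclose(logdet_Sigma, logdet_from_prec)
```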

Having already computed the triangular matrix roots also speeds up the computation of \(z^\top z\). If \(\Sigma^{-1}\) is given, \(z=\Lambda^\top(x-\mu)\) can be computed efficiently as the product of an upper triangular matrix and a vector. When \(\Sigma\) is given, we find \(z\) by solving the lower triangular system \(Lz=x-\mu\). The subsequent \(z^\top z\) computation is trivially fast.
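
Putting these pieces together, the full log-density evaluation might look like the following sketch, continuing the names from above and using SciPy's `solve_triangular` as one possible triangular solver:

```python
from scipy.linalg import solve_triangular

mu = np.zeros(M)
x = rng.standard_normal(M)

# Covariance given: solve the lower triangular system L z = x - mu.
z = solve_triangular(L, x - mu, lower=True)

# Precision given: z is a triangular matrix-vector product.
z_prec = Lam.T @ (x - mu)

# z^T z is the same either way (up to rounding); the log density follows.
quad = z @ z
logf = -0.5 * (M * np.log(2.0 * np.pi) + logdet_Sigma + quad)
```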

The algorithm for simulating \(X\sim\mathbf{MVN}\!\left(\mu,\Sigma\right)\) also depends on whether \(\Sigma\) or \(\Sigma^{-1}\) is given. As above, we start by computing the Cholesky decomposition of the given covariance or precision matrix. Define a random variable \(Z\sim\mathbf{MVN}\!\left(0,I_M\right)\), and generate a realization \(z\) as a vector of \(M\) samples from a standard normal distribution. If \(\Sigma\) is given, then evaluate \(x=Lz+\mu\). If \(\Sigma^{-1}\) is given, then solve for \(x\) in the triangular linear system \(\Lambda^\top\left(x-\mu\right)=z\), i.e., \(x=\Lambda^{-\top}z+\mu\). The resulting \(x\) is a sample from \(\mathbf{MVN}\!\left(\mu,\Sigma\right)\). We confirm the mean and covariance of \(X\) as follows: \[\begin{aligned} \mathbf{E}\!\left(X\right)&=\mathbf{E}\!\left(LZ+\mu\right)=\mathbf{E}\!\left(\Lambda^{-\top} Z+\mu\right)=\mu\\ \mathbf{cov}\!\left(X\right)&= \mathbf{cov}\!\left(LZ+\mu\right)=\mathbf{E}\!\left(LZZ^\top L^\top\right)=LL^\top=\Sigma\\ \mathbf{cov}\!\left(X\right)&=\mathbf{cov}\!\left(\Lambda^{-\top}Z+\mu\right)=\mathbf{E}\!\left(\Lambda^{-\top}ZZ^\top\Lambda^{-1}\right) =\Lambda^{-\top}\Lambda^{-1}=(\Lambda\Lambda^\top)^{-1}=\Sigma \end{aligned}\]
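
A corresponding sketch of the simulation step, again reusing the factors computed above:

```python
# One standard normal vector drives either construction.
z = rng.standard_normal(M)

# Covariance given: x = L z + mu.
x_cov = L @ z + mu

# Precision given: solve the upper triangular system Lam.T (x - mu) = z.
x_prec = mu + solve_triangular(Lam.T, z, lower=False)

# x_cov and x_prec are different realizations (L != Lam^{-T} in general),
# but both are draws from MVN(mu, Sigma).
```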

These algorithms apply whether the covariance/precision matrix is sparse or dense. When the matrix is dense, the computational complexity is \(\mathcal{O}\!\left(M^3\right)\) for the Cholesky decomposition, and \(\mathcal{O}\!\left(M^2\right)\) for either solving the triangular linear system or multiplying a triangular matrix by a vector (Golub and Van Loan 1996). Thus, the computational cost grows cubically with \(M\) when the decomposition must be computed, and quadratically once the decomposition is available. Additionally, the storage requirement for \(\Sigma\) (or \(\Sigma^{-1}\)) grows quadratically with \(M\).

References

Golub, Gene H., and Charles F. Van Loan. 1996. Matrix Computations. 3rd ed. Johns Hopkins University Press.