Sunday, November 20, 2011

Matrix Performance in R

I've been working on an example of the new Graph Template Language from SAS.  As I don't have direct access to SAS 9.2, I've been developing via email with a friend that does.

In the meantime, I thought I would start to investigate some of the performance properties of R.  I work in the financial risk industry and I often find myself worrying about performance of linear algebra systems.  Often I find myself setting up a simulation environment and 2 things dominate the performance -- the Cholesky root of the covariance matrix and simple matrix multiplication.

The Cholesky root (L) of a matrix (∑) is a decomposition scheme where

∑ = L * L`

We can think of this as a loose analog to the square root of a matrix.  This is useful for simulation purposes because if

u ~ N(0,I(n))

where I(n) is the identity matrix of size n x n and ~ N( ) represents a multivariate normal distribution

If  ∑ = L * L` then
L * u ~ N( 0, ∑ )

So how fast can R calculate the Cholesky decomposition of ∑?  

First let's create a correlation matrix of size 4000.
n = 4000
corr = matrix(.9,n,n)
for(i in 1:n){

corr[i,i] = 1;

This creates a 4000x4000 correlation matrix with .9 correlation between all members.  So let's calculate the Cholesky root using the chol() function and see how long it takes.  Let's also see what gets returned.
start = Sys.time()
root = chol(corr)
end = Sys.time()
end - start
That gives us the return values
Time difference of 12.383 secs

       [,1]            [,2]            [,3]            [,4]               [,5]
[1,]    1 0.9000000 0.9000000 0.9000000 0.90000000
[2,]    0 0.4358899 0.2064742 0.2064742 0.20647416
[3,]    0 0.0000000 0.3838859 0.1233919 0.12339191
[4,]    0 0.0000000 0.0000000 0.3635146 0.08842247
[5,]    0 0.0000000 0.0000000 0.0000000 0.35259655
Unfortunately, this is the L` matrix, and we need the L matrix.  So we can transpose the result with the t() function.
start = Sys.time()
root = t(chol(corr))
end = Sys.time()
end - start
Time difference of 12.42 secs
So the transpose function is quick.  How many gigaflops is that?  Let's create a function for that.
gflops = function(ops,time){

fps = ops/time

 It takes approximately N^3 / 3 operations to calculate a Cholesky.  So...
[1] 1.71766 
I'm running on a fairly new Intel CORE i5 laptop.  That's pretty good. Note that we have to convert the time difference to a numeric value.

So for matrix multiplication, we will deal with square matrices.  For that reason, if A and B are square (n x n) matrices, then A*B has 2*n^3 - n^2 operations.  Let's create 2 random 4000 x 4000 matrices, multiply them together and see what the time and gflops are.
a = matrix(rnorm(n^2),n,n)
b = matrix(rnorm(n^2),n,n)
start = Sys.time()
xx = a %*% b
end = Sys.time()
end - start
gflops(2*4000^3 - 4000^2,as.numeric(end-start))
Time difference of 1.034383 mins

[1] 123.7298

That seems a tad high.  What does as.numeric() return if the time difference is in minutes?
as.numeric(end - start)
[1] 1.034383 
That's annoying.  If the time difference is > 1 minute, then you get the time in minutes, not seconds.  Really annoying.  Dividing 123.7298 / 60 = 2.0622.

Overall not too bad.  There is a package called Matrix that extends the base matrix functionality.  I haven't looked into that yet.  Next post I will explore that package and see if we can find ways to speed up these benchmarks.

No comments:

Post a Comment