Adventures in Statistical Computing: Portfolio Optimization in R, Part 1

I briefly mentioned in my last post; that I was fooling around with portfolio optimization in R. This post will the first in a series on the topic of portfolio optimization.

Please note, nothing I am about to say should be taken as advice for investing. These results are based on prior observed returns and the future rarely mimics the past. These techniques can give helpful insight on how you can better allocate a portfolio. It should not be used as the sole investment decision. Speak with a qualified professional if you are looking for advice.

While looking at the dividend adjusted returns on 3 government bond ETFs (TLT, IEF, and SHY), I noticed that the middle maturity bonds (IEF) had a better risk return profile than the long bonds (TLT). I showed the results in a tabular format. In this post we will revisit that analysis and show our results graphically. Here goes.

First, grab the return series for the ETFs using the function we previously created. Calculate the annualized return, standard deviation, and Sharpe Ratio

#Get Return Series for Short Medium and Long Term Gov Bonds

from = "2001-01-01"

to = "2011-12-16"

tlt = importSeries("tlt",from,to)

shy = importSeries("shy",from,to)

ief = importSeries("ief",from,to)

merged = merge(tlt,shy)

merged = merge(merged,ief)

vars = c("tlt.Return","shy.Return","ief.Return")

#Calculate Annualized Returns

t = table.AnnualizedReturns(merged[,vars],Rf=mean(merged[,"shy.Return"],na.rm=TRUE))

The results are what we saw before.

	tlt.Return	shy.Return	ief.Return
Annualized Return	0.0772	0.0283	0.0645
Annualized Std Dev	0.1404	0.0173	0.0740
Annualized Sharpe (Rf=2.81%)	0.3378	-0.0086	0.4729

If you listen to enough ~~Entertainment~~ Investment TV, you will eventually hear the term "barbell strategy." This refers to a portfolio allocation scheme where you buy the ends of a spectrum. All of the weight is on the ends and you can visualize it as a barbell. In a portfolio of government bonds, this would mean buying the long and short maturities and not holding the middle. So what kind of risk return profile would you see if you employed this strategy?

First, we will define risk as the portfolio variance. There all sorts of reasons not to use variance, but it is the oldest from back in the 50s when this type of analysis was brand new. We will define return as the expected return. In the table above, the Annualized Return is the expected return holding the asset for 1 year, and the square of the Std Dev is the risk.

If the portfolio is going to hold long and short term bonds, then we need to calculate the expected return and risk. Return is easy, it is the weighted average of the two returns, where the weights are the % of capital invested in each asset.

R_P = W_TLT * R_TLT + W_SHY* R_SHY

Where: W_TLT + W_SHY = 1

Now obviously the two assets are correlated (before Markowitz's doctoral dissertation in 1952, investment managers didn't understand correlation and implicitly assumed it was 1 -- Markowitz received the Nobel Prize for that bit of insight). If the returns are normally distributed then the variance of the portfolio will be

Vp = W_TLT²*σ²_TLT + W_SHY²* σ²_SHY + W_TLT * W_SHY * σ_TLT * σ_SHY *Corr_TLT,SHY

Where: W_TLT+ W_SHY= 1

With these two bits of knowledge, we can vary the weights and build the risk return profile for our barbell strategy.

#Check the correlations

corr = cor(merged[,vars],use='complete.obs')

c = corr['tlt.Return','shy.Return']

#Assuming a barbell strategy of long and short holdings, what is the risk return

#profile

ws = NULL

wt = NULL

mu = NULL

sigma = NULL

#50 observations

n=50

#Loop through the weights of the barbell

for (i in 0:n){

wsi = i/n;

wti = 1-wsi;

mui = wsi * rSHY + wti * rTLT

sigmai = wsi*wsi*sSHY*sSHY + wti*wti*sTLT*sTLT + wsi*wti*sSHY*sTLT*c

ws = c(ws,wsi)

wt = c(wt,wti)

mu = c(mu,mui)

sigma = c(sigma,sigmai)

}

#Risk Return Profile Data Frame

rrProfile = data.frame(ws=ws,wt=wt,mu=mu,sigma=sigma)

#Plot the profile

plot(rrProfile$sigma,

rrProfile$mu,

xlim=c(0,.022),

ylim=c(0,.08),

ylab="Expected Yearly Return",

xlab="Expected Yearly Variance",

main="Efficient Frontier for Government Bond Portfolios")

I’ll hold off on the graph and we will look at it once complete.

Notice that the equation above is quadratic. We can fit a parabola to the points we just created. Note, while it is customary to put the risk on the X axis, we fit variance (risk) as the dependant variable.

#Fit a quadratic function to the profile

fit = lm(rrProfile$sigma ~ rrProfile$mu + I(rrProfile$mu^2))

Next we generate the fitted line for plotting.

#get the coefficients

coe = fit$coefficients

#Get predicted risk values for each return

muf = NULL

sfit = NULL

for (i in seq(0,.08,by=.001)){

muf = c(muf,i)

s = coe[1] + coe[2]*i + coe[3]*i^2

sfit = c(sfit,s)

}

#plot the predicted frontier

lines(sfit,muf,col='red')

Based on the tabular data, we KNOW the IEF asset offers a better Sharpe Ratio (a risk adjusted return) than the TLT. If we want to consider all 3 assets in the portfolio, we can no longer just perturb weights. To do so would give us a cloud – what we really want is the outer surface of that cloud. The frontier where you maximize expected return for a given level of risk. Or (and this is the same thing), minimize risk for a given expected return.

The “portfolio.optim” function from the tseries package does the latter. We just need to feed in expected returns, and it will spit back out the optimal portfolio weights. We will vary the return from just over the minimum expected return (i.e. 100% in SHY) and just under the maximum (i.e. 100% in TLT). Note, “portfolio.optim” uses daily returns, so the code will have to handle that. We assume 255 trading days in a year.

#now let's add the 3rd asset. Unless we want to do a grid search, we need

#to optimize the portfolio, minimizing the risk for each level of return

#portfolio.optim cannot have NA values in the time series, filter them out

m2 = removeNA(merged[,vars])

wSHY = NULL

wIEF = NULL

wTLT = NULL

er = NULL

eStd = NULL

#loop through finding the optimum portfolio for return levels between

#the minimum (rSHY) and the max(rTLT)

#portfolio.optim uses daily returns, so we have to adjust accordingly

for (i in seq((rSHY+.001),(rTLT-.001),length.out=100)){

pm = 1+i

pm = log(pm)/255

opt = portfolio.optim(m2,pm=pm)

er = c(er,exp(pm*255)-1)

eStd = c(eStd,opt$ps*sqrt(255))

wTLT = c(wTLT,opt$pw[1])

wSHY = c(wSHY,opt$pw[2])

wIEF = c(wIEF,opt$pw[3])

}

#Plot the efficient frontier for the 3 assets.

lines(eStd^2,er,col='blue')

legend(.014,0.015,c("'Barbell' Strategy","All Assets"),

col=c("red","blue"),

lty=c(1,1))

solution = data.frame(wTLT,wSHY,wIEF,er,eStd)

Here is the graph.

The blue line of the efficient frontier for the 3 asset portfolio dominates that of the barbell strategy. That is, for each level of risk, the expected return is higher. Graphically, this shows that the addition of IEF into the portfolio made it better. Further, for what is nearly the maximum return we saw for the barbell strategy, the 3 asset portfolio can achieve it with half the risk.

Full code is available here.

Adventures in Statistical Computing

Saturday, December 17, 2011

Portfolio Optimization in R, Part 1

1 comment: