Sunday, December 25, 2011

Portfolio Optimization in R, a previous error.

I realized that I made a mistake in the calculation of the market weight portfolio from the previous post.  I hold constant the portfolio weights through time. These should be adjusted after each day as prices change.  The market portfolio requires no re-balancing.  What I have is, in essence, a re-balancing back to the original weights at the end of each day.

The equal weight portfolio is OK.  As we assume no transaction costs, it can be re-balanced daily.  In fact, the constant weight is the key assumption of this portfolio, so it SHOULD be re-balanced daily.

The optimal portfolio model also has the same issue.  We are implicitly re-balancing each day.  Our assumption in the optimization was holding for the entire period.

I will make the corrections and post results and updated code.  Being the holiday season, it might take a few days longer than normal.

Merry Christmas!

Friday, December 23, 2011

Portfolio Optimization in R, Part 4

This post will conclude the portfolio optimization series.  In this post, we will construct a trading strategy based on portfolio optimization and test the results against the CAPM market portfolio as well as another strategy.

It is worth reiterating:
Nothing I am about to say should be taken as advice for investing.  These results are based on prior observed returns and the future rarely mimics the past.  These techniques can give helpful insight on how you can better allocate a portfolio.  It should not be used as the sole investment decision.  Speak with a qualified professional if you are looking for advice.

Building on the work of Markowitz, Treynor, Sharpe, et. al. developed the Capital Asset Pricing Model -- CAPM.  They shared the Nobel Prize with Markowitz in 1990 for this work.  CAPM is a general equilibrium model.  The model assumes that market prices reflect all available information and give the "fair" value of a security.  Based on that, the market portfolio can be shown to be market capitalization weighted portfolio.  Market capitalization (or market cap) is defined as the share price times the number of shares outstanding -- the total value of the equity of a company.  A company's weight is its market cap divided by the total market cap of all securities.

Capitalization weighting has become the standard for indexes and index funds.  The S&P500 being what most people consider the standard "market portfolio."  We will test our portfolio optimization strategy against a capitalization weighted strategy.

Now there are problems with CAPM.  There are lots of ways to poke holes in it.  One way is to say that prices are not exactly fair value, but instead mean revert to fair value.  In this case, when a price is above fair value, the market cap weighted portfolio will overweight the overpriced security.  When it mean reverts, the cap weight portfolio will under perform because of the overweighting.  

This position was famously put forward by Robert Arnott.  I highly recommend his book,The Fundamental Index: A Better Way to Invest.  He argues that any portfolio strategy that breaks the correlation with price will outperform the capitalization index over time.  He created a new index which he puts forward in the book.  Another would be to simply equal weight each security (S&P actually publishes this index).  Because of that, we will also test our strategy against an equal weight strategy.

Here is our portfolio optimization strategy:

  1. At the beginning of each quarter, take the previous quarterly returns and calculate the market portfolio.
  2. Use this portfolio for the current quarter.
  3. At the start of the next quarter, go back to #1.
  4. Require at least 3 stocks in our portfolio.
  5. No shorting.
  6. Use 2% as the risk free rate.
  7. For the first quarter and if the optimization fails, use an equal weight portfolio.
This strategy will tend to outperform when prices are trending quarter over quarter.  I.E. if last quarter's returns and volatility accurately predict this quarter's values.  We are not taking trading costs into account.  The 2% rate is static, and we should technically use the 3 month treasury rate for each quarter beginning.  That's why this is just an illustration.

To start, we need to modify the marketPortfolio() function we previously created.  You can find it here.  The new function signature is
marketPortfolio = function(merged, rf, returnNames, weightNames, graph=FALSEpoints=500maxWeight=.334Debug=FALSE)
The improvements we have made are:
  1. Add a Debug option to print outputs if we request them
  2. Make the function fault tolerant.  Sometimes a feasible solution is not possible and the function needs to detect this and handle it accordingly.
  3. Cap the max return we search over to 50%.  This is a sanity check as large values can cause other weird behavior.
  4. Likewise, put a floor on the min return at .005%.
  5. If the max return is <0, then simply find the minimum variance portfolio that achieves that value.
  6. Add a maxWeight option that lets us cap the weight on any 1 security.
The universe of stocks we will consider is 28 of the Dow components (for some reason Yahoo! Finance only gave me 28 instead of 30).  I pre-downloaded the returns and stored them in a .rda file.  You can get it here.  I calculated the starting market cap weights of each stock and stored them in a .csv file.  Get it here.

#read the saved returns
returns = results
stocks = colnames(returns)

#get the capitalization weights
stockWgt = read.table("D:\\Workspace\\PortfolioOptimization\\stocks.csv",

#calculate the CapWeight portfolio returns
results = as.matrix(returns) %*% stockWgt
colnames(results) = c("CapWeight Portfolio")

results = as.timeSeries(results)

#calculate the EqualWeight portfolio Returns
ret = t(as.matrix(rep(1/length(stocks),length(stocks))))
resEqual = as.matrix(returns) %*% t(ret)
colnames(resEqual) = "EqualWeight Portfolio"

#combing the results into 1 timeSeries object
results = cbind(results,resEqual)

#grab the dates of the returns
dates = time(returns)
dates = dates@Data

#get the Quarters for the dates
qtrs = quarters(dates)
qtrs = cbind(dates,qtrs)

keep = NULL
lastQtr = "n"

#go through the dates and quaters and keep only the date with
#the first day of the quarter
for (i in seq(1,nrow(qtrs))){
      if (qtrs[i,2] == lastQtr){
            if (i == nrow(qtrs)){
                  keep = c(keep,1)
            } else {
                  keep = c(keep,0)
      }else {
            keep = c(keep,1)
      lastQtr = qtrs[i,2]

qtrs = cbind(qtrs,keep)

#get the indices of the first days of the quarter
indx = which(qtrs[,3] == 1)

#for the first quarter, use the equal weights
res = as.matrix(returns[indx[1]:(indx[2]-1),]) %*% t(ret)

#loop throug the quarters, calculate the market portfolio
#based on the previous quarter's data and then calculate
#the returns for the current quarter
for (i in seq(2,length(indx)-1)){
      print("Running ")
      #get subset of last quarter stock returns
      subset = returns[indx[i-1]:(indx[i]-1),]
      s = start(subset)
      e = end(subset)
      print("Fitting for:")
      #calculate market portfolio
      mp = marketPortfolio(subset,.02,stocks,stocks,
      #if optimization failed, use equal weights, else get
      #the weights from the market portfolio
      if (is.null(mp)){
            ret = t(as.matrix(rep(1/length(stocks),length(stocks))))
      } else {
            ret = as.matrix(mp[,stocks])
      #subset for the current quarter
      subRes = returns[indx[i]:(indx[i+1]-1),]
      s = start(subRes)
      e = end(subRes)
      print("Calculating Returns for:")
      #calculate current quarter returns for market
      #portfolio and add to return series
      subRes = as.matrix(subRes) %*% t(ret)
      res = rbind(res,subRes)

#loop does not calculate return for the last day in the series
subRes = returns[nrow(returns),]
subRes = as.matrix(subRes) %*% t(ret)
res = rbind(res,subRes)

#add the porfoliot optimized strategy to the others
colnames(res) = "Portfolio Optimization"
res = as.timeSeries(res)

results = cbind(results,res)

#calculate annualized return statistics

#calculate and chart the correlations

#calculate and chart the cumlative returns
            main="Total Returns CapWeight vs PortOpt",

Our Annualized Return Table:

CapWeight Portfolio
EqualWeight Portfolio
Portfolio Optimization
Annualized Return
Annualized Std Dev
Annualized Sharpe (Rf=0%)

Our portfolio optimization portfolio outperforms the Cap Weight, but under performs the Equal Weight.  Not surprising if you believe Mr. Arnott as we did not break the correlation with price.

Here is the chart of correlations:

We've created a portfolio that is correlated with the "market" cap weight portfolio, but not nearly so as the equal weight portfolio.  The degree of correlation of the equal weight and the cap weight is interesting.  

Here is the time series of returns charted:

Interestingly, our portfolio in green has trended sharply up since the market bottom in March of 2009.  It strongly outperformed the Cap Weight portfolio.

Wednesday, December 21, 2011

Portfolio Optimization in R, Part 3

In the last post, we discussed the issue of finding the market portfolio by fitting a curve to the found efficient frontier.  Because of kinks in the frontier, we could not guarantee that the fitted curve would be concave in the area of the market portfolio.  A different method was needed.

The method I am using, in this post, is to draw a line between the risk free asset and each point along the frontier.  From that line, calculate the difference between the line and the frontier.  The capital market line should have all frontiers values less than or equal to it.
CMLi <= EFi
The granularity with which we find the frontier means that we might not find the exact market portfolio.  To get around this, I relax the above constraint to find the
Portfolioj that max Count(CMLi < EFi)
I put together the following R function to do this.  Note, I have switched to using Standard Deviation as the risk measure.  This is the more conventional choice.

marketPortfolio = function(merged,rf,returnNames, weightNames,graph=FALSE){
      #create an empty data frame for the portfolio weights
      weights = data.frame(t(rep(NA,length(weightNames))))
      colnames(weights) = weightNames
      weights = weights[-1,]

      #Calculate Annualized Returns
      t = table.AnnualizedReturns(merged[,returnNames])
      #Range to optimize over
      maxRet = max(t['Annualized Return',]) - .005
      minRet = min(t['Annualized Return',]) + .005
      #portfolio.optim cannot have NA values in the time series, filter
      #them out
      m2 = removeNA(merged[,returnNames])
      er = NULL
      eStd = NULL
      #loop through finding the optimum portfolio for return 
      #levels between the range found above
      #portfolio.optim uses daily returns, so we have to 
      #adjust accordingly
      for (i in seq(minRet,maxRet,length.out=500)){
            pm = 1+i
            pm = log(pm)/255
            opt = portfolio.optim(m2,pm=pm)
            er = c(er,exp(pm*255)-1)
            eStd = c(eStd,opt$ps*sqrt(255))
            w = t(opt$pw)
            colnames(w) = weightNames
            weights = rbind(weights,w)
      solution = weights
      solution$er = er
      solution$eStd = eStd
      #find the index values for the minimum Std and the max Er
      minIdx = which(solution$eStd == min(solution$eStd))
      maxIdx = which(solution$er == max(solution$er))
      #subset the results
      subset = solution[minIdx:maxIdx,c("er","eStd")]
      subset$nAbove = NA
      #for each value in the subset, count the number of points
      #that lay below a line drawn through the point and the
      #RF asset
      for (i in seq(1,maxIdx-minIdx+1)){
            toFit = data.frame(er=rf,eStd=0)
            toFit = rbind(toFit,subset[i,c("er","eStd")])
            fit = lm(toFit$er ~ toFit$eStd)
            poly = polynomial(coef = fit$coefficients)
            toPred = subset
            colnames(toPred) = c("actEr","eStd")
            toPred$er = predict(poly,toPred[,"eStd"])
            toPred$diff = toPred$er - toPred$actEr
            subset[i,"nAbove"] = nrow(toPred[which(toPred$diff > 0),])
      #get the point of tangency -- where the number of points
      #below the line is maximized
      max = max(subset$nAbove)
      er = subset[which(subset$nAbove == max),"er"]
      eStd = subset[which(subset$nAbove == max),"eStd"]
      #index of the market portfolio
      idx = which(solution$er == er & solution$eStd == eStd)
      #Draw the line if requested
      if (graph){
            maxStd = max(solution$eStd) + .02
            maxRetg = max(solution$er) + .02
                        ylab="Expected Yearly Return",
                        xlab="Expected Yearly Std Dev",
                        main="Efficient Frontier",
            abline(v=c(0), col="black", lty="dotted")
            abline(h=c(0), col ="black", lty="dotted")
            toFit = data.frame(er=rf,eStd=0)
            toFit = rbind(toFit,solution[idx,c("er","eStd")])
            fit = lm(toFit$er ~ toFit$eStd)
      #Return the market portfolio weights and eStd and eR
      out = solution[idx,]
      return (out)

Let's test this out using a portfolio of Exxon Mobile (XOM), IBM (IBM), and the mid range Government Bond ETF (IEF).  This assumes you have the importSeries() function defined.


from = "2003-01-01"
to = "2011-12-16"
xom = importSeries("xom",from,to)
ibm = importSeries("ibm",from,to)
ief = importSeries("ief",from,to)

merged = merge(xom,ibm)
merged = merge(merged,ief)

vars = c("xom.Return","ibm.Return","ief.Return")
vars2 = c("xom","ibm","ief")

mp = marketPortfolio(merged,.02,vars,vars2,graph=TRUE)

The output in the log is:

        xom    ibm    ief      er    eStd
448 0.09395 0.1378 0.7682 0.07762 0.05996

The created graph is

This portfolio gives us an example of the optimization finding the lower edge of the frontier.  This is not an anomaly. This is why we subset the results to only include the parts from the apex of the frontier (min(StdDev)) and the maximum return.  Because we find the frontier from the min return to the max return, we guarantee order is preserved and only the upper part of the frontier is considered.

A more exact method would be to find the region of the frontier that contains the market portfolio and to either
  1. Grid search to find the market portfolio
  2. Fit a curve to that region and use the method discussed in the previous post.

If there is demand, I can create either of the above methods.  For demonstration purposes, what we have here should suffice.