Friday, October 21, 2011

Learning R: Project 1, Part 1

I've decided to take the leap and learn R.  This is going to be painful.

I decided that doing mini projects will be easier than reading a book cover to cover, forgetting most of it, and deciding to just watch Fringe.

So here we go:

Project 1:
Take data from Yahoo! Finance and calculate the Beta of a stock (let's choose Google since they provide the platform for this blog).

Because I know SAS, I'm going to give the SAS version of the project first.  Then I'm going to go off, tinker, and hopefully get something decent in R with matching results.

Currently I only have access to SAS 9.1.3.  I would love to show the new Graph Template Language stuff, but until I upgrade, we're all stuck with GPlot.

Tasks:

  1. Download data directly from Yahoo! Finance for Google and SPY (proxy for the S&P500).
  2.  Calculate the daily and cumulative log returns of both stocks.
  3. Plot the cumulative returns to get a feel of co-movement.
  4. Calculate the correlation between the daily returns.  Plot the daily returns and %-tile ellipses.
  5. Run a simple linear regression Google = A + B*SPY + e.  Plot summary statistics on things like residuals


SAS Code:

/*http://ichart.finance.yahoo.com/table.csv?s=GOOG&a=00&b=1&c=2010&d=09&e=22&f=2011&g=d&ignore=.csv;*/

%macro download(symbol,start,end);
/* Simple macro to read historic stock data from Yahoo! */

/*Data _null_ Step for string and date parsing.  Build the URL string*/
data _null_;
format url $200.;
url="http://ichart.finance.yahoo.com/table.csv?s=";

start = "&start"d;
end = "&end"d;

/*Starting date parameters*/
a = put(month(start)-1,z2.);
b = put(day(start),2.);
c = put(year(start),4.);

/*Ending date parameters*/
d = put(month(end)-1,z2.);
e = put(day(end),2.);
f = put(year(end),4.);

/*Build the url*/
url = catt(url, upcase("&symbol"),
           '&a=',a,
           '&b=',b,
           '&c=',c,
           '&d=',d,
           '&e=',e,
           '&f=',f,
           '&g=d&ignore=.csv');

/*stick the url into a macro variable*/
call symput('url',compress(url));
run;

/*Fileref to point to the URL*/
filename in url "&url";

/*Use Proc Import to read the CSV
  Could use a data step, but I'm lazy*/

proc import file=in out=&symbol dbms=csv replace;
run;

filename in clear;


/*Yahoo! gives us data in the opposite order we need*/
proc sort data=&symbol;
by date;
run;

/*Pesky holidays mess up our weekday time series
 -- This step is not really necessary for this example
    but it makes using the SAS/ETS procedures later easier
 -- Take the default spline interpolation  */
proc expand data=&symbol out=&symbol to=weekday;
id date;
run;

/*Calculate the log returns daily and cumulative*/
data &symbol(drop=first);
set &symbol;
retain first;

if _n_ = 1 then
    first = adj_close;

&symbol = log(adj_close) - log(lag(adj_close));
&symbol._c = log(adj_close) - log(first);
run;

%mend;

/*Download Google*/
%download(goog, 01JAN2010, 22OCT2011);

/*Download SPY */
%download(spy,  01JAN2010, 22OCT2011);

/*Merge the two data sets*/
proc sql noprint;
create table merged as
select a.date,
       a.spy_c label="Cumulative SPY Return",
       a.spy,
       b.goog_c label="Cumulative Google Return",
       b.goog
    from spy as a,
         goog as b
    where a.date = b.date;

quit;

/*Output to an HTML file for publishing*/
ods html body="proj1.html" 
         path="c:\temp"
         gpath="c:\temp\images" (url="images/")
         ;

/*Turn on output graphs for all our procedures*/
ods graphics on;

/*Plot the close of each stock
  -- Thanks Enterprise Guide, without you I would have to learn SAS/Graph*/

/* Set the SAS/Graph options */
SYMBOL1
    INTERPOL=JOIN
    HEIGHT=10pt
    VALUE=NONE
    LINE=1
    WIDTH=2
    CV = RED
;

SYMBOL2
    INTERPOL=JOIN
    HEIGHT=10pt
    VALUE=NONE
    LINE=1
    WIDTH=2
    CV = BLUE
;

Legend1
    FRAME
    ;

Axis1
    STYLE=1
    WIDTH=1
    MINOR=NONE
;

Axis2
    STYLE=1
    WIDTH=1
    MINOR=NONE
;

TITLE;
TITLE1 "Google and SPY Cumulative Returns";

PROC GPLOT DATA = merged;
PLOT spy_c * Date goog_c * Date  /
 OVERLAY
    VAXIS=AXIS1
    HAXIS=AXIS2
FRAME    LEGEND=LEGEND1
;

RUN; QUIT;

goptions reset=all;

TITLE1 "Google and SPY Return Correlation";

/*Let's get the correlation and a scatter plot of the returns*/
proc corr data=merged plots=scatter(alpha=.1 .05 .01);
var spy goog;
run;

TITLE1 "Google and SPY Return Regression";

/*Finally let's do the regression and get the Beta*/
proc reg data=merged;
modelgoog = spy;
run;
quit;

ods html close;

Output is available here.

Off I go to figure out how to do this in R.  Pointers, questions and comments are welcome.

No comments:

Post a Comment