How to Download Portfolios and Factors from Fama and French Directly into R

This is some code to download and read data from the homepage of Kenneth R. French.

First, visit the data library of Kenneth R. French, decide which data sets you want to work with, and copy their links. Then set your working directory, download the relevant files with download.file(), and unzip them with unzip() to get the .txt files. You can do all of this in R once you know the link to the data set. In the example below, I downloaded monthly values for the three-factor model and the returns of portfolios formed on firm size.
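The download and unzip steps described above might look like the following minimal sketch. The base URL and the _TXT.zip naming pattern are assumptions, so replace them with the links you actually copied from the data library. french_zip_url() and fetch_french() are hypothetical helper names introduced here for illustration.

```r
# Assumed base URL of the data library's download folder; verify it against
# the links you copied from the site.
french_base_url <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/"

# Hypothetical helper: build the link to the zipped .txt version of a data set.
french_zip_url <- function(name, base = french_base_url) {
  paste0(base, name, "_TXT.zip")
}

# Hypothetical helper: download one zip archive and extract it into `dir`.
# Returns the paths of the extracted files.
fetch_french <- function(name, dir = getwd()) {
  zip_path <- file.path(dir, paste0(name, "_TXT.zip"))
  download.file(french_zip_url(name), destfile = zip_path, mode = "wb")
  unzip(zip_path, exdir = dir)
}
```

If the assumed links are valid, calling fetch_french("F-F_Research_Data_Factors") and fetch_french("Portfolios_Formed_on_ME") should leave the two .txt files used in this post in the working directory.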



Read the data. Note that we have to use read.delim() with sep = "", since the observations are not separated by a delimiter character but by whitespace. Additionally, since the first lines of each file contain a description, we have to skip the first four rows in the factor file and the first thirteen in the portfolio file. You will have to inspect the .txt files in an ordinary text editor to see how many rows you can skip. After those 4 (13) rows we tell R to read only 1067 further rows, since the table of monthly observations ends there. (This number has to be updated from time to time as the sample grows, so add the appropriate number of rows to the value of nrows to get the most recent data. I used head() to see where the sample starts and tail() to check that I had really read the right number of rows. Try it yourself: increase nrows beyond 1067 and look at the tail of the data to see where the section with annual data begins.)
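Instead of counting rows by hand, the values for skip and nrows could also be estimated from the file itself. The following is only a sketch under the assumption that every monthly row starts with a six-digit YYYYMM stamp and that the first uninterrupted run of such rows is the monthly table. find_monthly_block() is a hypothetical helper, and the toy input is made up.

```r
# Hypothetical helper: locate the first block of monthly rows in a file that
# has already been read with readLines(). A monthly row is assumed to start
# with a six-digit YYYYMM stamp followed by whitespace.
find_monthly_block <- function(lines) {
  is_monthly <- grepl("^[[:space:]]*[0-9]{6}[[:space:]]", lines)
  if (!any(is_monthly)) stop("no rows starting with a YYYYMM stamp found")
  first <- which(is_monthly)[1]
  # Length of the first uninterrupted run of monthly rows
  run <- rle(is_monthly[first:length(lines)])$lengths[1]
  list(skip = first - 1L, nrows = run)
}

# Toy input that mimics a description, a monthly table and an annual table
toy <- c("Description line 1",
         "Description line 2",
         "",
         "          Mkt-RF     SMB     HML      RF",
         "192607    1.00    2.00    3.00    0.10",
         "192608    1.50    2.50    3.50    0.20",
         "",
         " Annual Factors: January-December",
         "  1927   10.00   20.00   30.00    1.00")

block <- find_monthly_block(toy)
block$skip   # number of leading lines before the first monthly row
block$nrows  # number of rows in the first monthly block
```

With a real file, find_monthly_block(readLines('F-F_Research_Data_Factors.txt')) would return candidate values for skip and nrows. It is still worth confirming them with head() and tail().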

After reading the data, we make sure that both samples cover the same periods by checking the number of observations in each set (just look at the upper-right pane, e.g. in RStudio, where the samples are listed) and whether the variable t starts and ends with the same values in both (use head() and tail() for this).
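The same check can also be done in code rather than by eye. The following sketch uses a hypothetical helper, check_aligned(), and toy data frames that stand in for the two samples. It assumes both data frames still carry their date column t.

```r
# Hypothetical helper: stop with an error unless two data frames have the same
# number of rows and identical values in their date column `t`.
check_aligned <- function(a, b) {
  stopifnot(nrow(a) == nrow(b), all(a$t == b$t))
  invisible(TRUE)
}

# Toy data frames standing in for the factor and portfolio samples
toy_factors   <- data.frame(t = c(192607, 192608), rf = c(0.10, 0.20))
toy_portfolio <- data.frame(t = c(192607, 192608), Lo.30 = c(1.0, 1.5))

check_aligned(toy_factors, toy_portfolio)  # passes silently
```

Once both files have been read, this would be called as check_aligned(fffactors, portfolio), before the t column is dropped from portfolio.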

fffactors <- read.delim('F-F_Research_Data_Factors.txt',
                        col.names = c('t', 'mkt.rf', 'smb', 'hml', 'rf'),
                        sep = "",
                        nrows = 1067,
                        header = FALSE,
                        skip = 4,
                        stringsAsFactors = FALSE)



portfolio <- read.delim('Portfolios_Formed_on_ME.txt',
                        col.names = c("t", "smaller.0", "Lo.30", "Med.40", "Hi.30",
                                      "Lo.20", "Qnt.2", "Qnt.3", "Qnt.4", "Hi.20",
                                      "Lo.10", "Dec.2", "Dec.3", "Dec.4", "Dec.5",
                                      "Dec.6", "Dec.7", "Dec.8", "Dec.9", "Hi.10"),
                        sep = "",
                        nrows = 1067,
                        header = FALSE,
                        skip = 13,
                        stringsAsFactors = FALSE)


portfolio <- portfolio[, -1] # Drop the date column t so that only returns remain

Next, we calculate the excess returns of the portfolios by subtracting the risk-free rate from the returns. Combining the resulting data frame with the factor sample gives the final data set. We should also add time values so that we can use R's time series commands. For this purpose I use the known dates of the first and last periods of the sample to generate a monthly sequence of dates, which I add to the sample.

portfolio.rf <- portfolio - fffactors$rf # Excess returns
sample <- cbind(portfolio.rf, fffactors) # Combine samples

dates <- seq(as.Date("1926-07-01"), as.Date("2015-05-01"), by = "month") # Generate time stamp
sample <- cbind(dates, sample) # Combine time values with the sample

View(sample) # Take a look at the sample
save(sample, file = "3factors_size_portfolios.RData") # Store sample for later use
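As an alternative to hard-coding the first and last dates, the time stamps could be derived from the t column itself, so that they stay correct when nrows changes. This is a sketch that assumes t holds stamps of the form YYYYMM. yyyymm_to_date() is a hypothetical helper name.

```r
# Hypothetical helper: convert YYYYMM stamps (numeric or character) to the
# first day of the corresponding month.
yyyymm_to_date <- function(t) {
  as.Date(paste0(t, "01"), format = "%Y%m%d")
}

# Toy stamps in the assumed YYYYMM form of the t column
toy_t <- c(192607, 192608, 201505)
toy_dates <- yyyymm_to_date(toy_t)
toy_dates
```

On the real data this would replace the seq() call with dates <- yyyymm_to_date(fffactors$t).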
