Further Issues in Using OLS with Time Series Data (Chapter 11)

Again, before downloading the data sets used in this chapter, make sure you are in the right working directory and load the “foreign” package, which provides read.dta().

library(foreign)  # provides read.dta() for Stata files

download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/nyse.dta','nyse.dta',mode='wb')
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/phillips.dta','phillips.dta',mode='wb')
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/fertil3.dta','fertil3.dta',mode='wb')
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/earns.dta','earns.dta',mode='wb')

Example 11.4
For this regression the lagged values of return are already contained in the data set. Thus, we do not have to calculate them ourselves and can run the regression in the usual manner.

nyse <- read.dta('nyse.dta')

lm.11.4 <- lm(return ~ return_1, data=nyse)
summary(lm.11.4)

Equation 11.17
To estimate this model we have to calculate the lagged values ourselves. First, I created a vector of the return values from the nyse data set. A first lag pairs each observation with the value from the previous period, so the series has to be shifted down by one position: I put an NA at the front with c() and dropped the last value with head(return, -1). This keeps the length of the series unchanged, so that R can estimate the model. For the second lag I proceeded similarly, putting two NAs at the front and dropping the last two values. Note that omitting the first value with [-1] and appending an NA at the end would create a lead (the return of the following period), not a lag. The estimation works as usual; lm() drops the rows that contain an NA.

return <- nyse$return
return1 <- c(NA, head(return, -1))      # return in t-1
return2 <- c(NA, NA, head(return, -2))  # return in t-2

# Sanity check: return1 should match the return_1 column shipped with the data
all.equal(return1, nyse$return_1)

lm.e11.17 <- lm(return ~ return1 + return2)
summary(lm.e11.17)
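
Whether a shifted series is a lag or a lead is easy to mix up, so here is a quick check on a toy vector. The values are made up for illustration and have nothing to do with the nyse data; the point is only to see in which direction each construction moves the series.

```r
# Toy series with hypothetical values
x <- c(10, 20, 30, 40, 50)

# Lag: each row holds the value of the previous period
x_lag1 <- c(NA, head(x, -1))      # NA 10 20 30 40
x_lag2 <- c(NA, NA, head(x, -2))  # NA NA 10 20 30

# Lead: dropping the first element pulls the following period's value up
x_lead1 <- c(x[-1], NA)           # 20 30 40 50 NA

cbind(x, x_lag1, x_lag2, x_lead1)
```

In the row where x is 30, x_lag1 is 20 (the earlier value) and x_lead1 is 40 (the later value).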

Example 11.5
The estimation works as usual. The difference in the inflation rate is calculated within the lm() command.

phillips <- read.dta('phillips.dta')

lm.11.15 <- lm((inf-inf_1) ~ unem, data=phillips)
summary(lm.11.15)

# Natural rate of unemployment
lm.11.15$coeff[1]/-lm.11.15$coeff[2]
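
Why the intercept divided by the negative slope gives the natural rate: if the change in inflation is beta1 * (unem - mu0) + error, then the intercept equals -beta1 * mu0, so mu0 = intercept / (-beta1). The following sketch checks this on simulated data. All numbers in it (mu0, beta1, the sample size, the noise level) are assumptions chosen for illustration and are not taken from the phillips data set.

```r
set.seed(1)

n     <- 200
mu0   <- 5.5    # hypothetical "true" natural rate
beta1 <- -0.5   # hypothetical slope

unem <- runif(n, min = 3, max = 9)
dinf <- beta1 * (unem - mu0) + rnorm(n, sd = 0.5)  # change in inflation

fit <- lm(dinf ~ unem)
b   <- coef(fit)

# Intercept divided by the negative slope recovers mu0 up to sampling error
natural_rate <- unname(b[1] / -b[2])
natural_rate
```

The estimate should land close to the value of mu0 that was used to generate the data.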

Example 11.6

fertil3 <- read.dta('fertil3.dta')

with(fertil3,cor(gfr,gfr_1,use='pairwise.complete.obs'))
with(fertil3,cor(pe,pe_1,use='pairwise.complete.obs'))

lm.11.16.1 <-lm(cgfr ~ cpe, data=fertil3)
summary(lm.11.16.1)

lm.11.16.2 <- lm(cgfr ~ cpe + cpe_1 + cpe_2, data=fertil3)
summary(lm.11.16.2)

# Joint significance of cpe and cpe_1
lm.11.16.2res <- lm(cgfr ~ cpe_2, data=fertil3)
summary(lm.11.16.2res)
anova(lm.11.16.2,lm.11.16.2res)
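
What anova() does with the two models can be reproduced by hand from the residual sums of squares. Below is a minimal sketch on simulated data; the variables x1, x2, x3 and y are hypothetical stand-ins, not columns of fertil3. Both models must be estimated on the same observations for the comparison to be valid.

```r
set.seed(42)

n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)  # x3 has no true effect

unrestricted <- lm(y ~ x1 + x2 + x3)
restricted   <- lm(y ~ x3)  # imposes H0: coefficients on x1 and x2 are zero

ssr_ur <- sum(resid(unrestricted)^2)
ssr_r  <- sum(resid(restricted)^2)
q      <- 2                           # number of restrictions
df_ur  <- df.residual(unrestricted)   # n minus number of coefficients

# F statistic and p-value computed by hand
F_manual <- ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
p_manual <- pf(F_manual, q, df_ur, lower.tail = FALSE)

# The same statistic as reported by anova()
F_anova <- anova(restricted, unrestricted)$F[2]
all.equal(F_manual, F_anova)
```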

Example 11.7

earns <- read.dta('earns.dta')

lm.11.17.1 <- lm(lhrwage ~ loutphr + t, data=earns)
summary(lm.11.17.1)

# Detrend the variables
resy <- lm(lhrwage ~ t, data=earns)$resid
resx <- lm(loutphr ~ t, data=earns)$resid
summary(lm(resy ~ -1 + resx))

cor(resy,c(resy[-1],NA),use='pairwise.complete.obs')
cor(resx,c(resx[-1],NA),use='pairwise.complete.obs')
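
The regression on the detrended residuals gives the same slope as the regression that includes the trend directly. Here is a small sketch with simulated data that checks this; the series x and y and all parameter values are made up for illustration and are not the earns variables.

```r
set.seed(123)

n     <- 60
trend <- 1:n
x     <- 0.05 * trend + rnorm(n, sd = 0.2)
y     <- 1 + 0.8 * x + 0.02 * trend + rnorm(n, sd = 0.2)

# Slope on x when the trend is included as a regressor
b_trend <- coef(lm(y ~ x + trend))["x"]

# Slope from regressing detrended y on detrended x (no intercept)
resy   <- resid(lm(y ~ trend))
resx   <- resid(lm(x ~ trend))
b_detr <- coef(lm(resy ~ -1 + resx))["resx"]

all.equal(unname(b_trend), unname(b_detr))
```

The coefficients agree, but the standard errors differ slightly, because the residual regression counts n - 1 degrees of freedom instead of n - 3.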

The command diff() calculates the differences between the elements of a vector. By default it takes the first difference, i.e. the difference between consecutive observations. For further methods to detrend data see this post.

lm.11.17.2 <- lm(diff(lhrwage) ~ diff(loutphr), data=earns)
summary(lm.11.17.2)
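
To see what diff() returns, here is a short example on a toy vector (the values are made up). Note that the result is one element shorter than the input for a first difference, which is why both sides of the formula above are differenced with the same call.

```r
x <- c(3, 5, 9, 10)

d1     <- diff(x)                   # first difference:    2  4  1
d_lag2 <- diff(x, lag = 2)          # x[t] - x[t-2]:       6  5
d2     <- diff(x, differences = 2)  # difference of d1:    2 -3

length(d1) == length(x) - 1
```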

Example 11.8

fertil3 <- read.dta('fertil3.dta')

lm.11.18 <- lm(cgfr ~ cpe + cpe_1 + cpe_2 + cgfr_1, data=fertil3)
summary(lm.11.18)
# A significant coefficient on cgfr_1 suggests serial correlation in the errors of the model without cgfr_1
