How to Load Data, Packages and Get Help

The script window contains a list of commands that you want R to execute. Click into the window and type in

3+5

and hit Ctrl+Enter. R does the calculation and displays the result in the console below. Alternatively, you can mark the whole line and click “Run” in the upper bar of the script window, or mark the line and hit Ctrl+Enter; the result is the same. I personally prefer the keyboard shortcut to execute commands in R.

But since we as econometricians are not interested in elementary school level math, but in analysing data, we want to know how we can make R use these data. This is where the “Help” tab becomes relevant. Click on it, type “read.csv” into the search bar and hit enter. This is what help looks like in R.

Alternatively, you could have entered

?read.csv

into the script. The “?” sign tells R to open the help page of a specific function. If you use “??”, R does not look for a single function but searches the installed help pages for the term, for example in their names and titles. Try

??read

and see what happens. There are many more suggestions.
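The two shortcuts above also have spelled-out equivalents, which can be handy inside scripts. The following is a small sketch of some related base R helpers; the variable name hits is only chosen for illustration.

```r
# Open the help page of a function (same as ?read.csv)
help("read.csv")

# Search the installed help pages for a term (same as ??read)
help.search("read")

# List objects on the search path whose names contain "read"
hits <- apropos("read")
print(hits)

# Show only the arguments of a function
args(read.csv)
```

apropos() is useful when you remember only part of a function name, while args() gives a quick reminder of the arguments without opening the full help page.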

But let’s get back to importing data. Depending on the file format of your sample, you have to use the right command in R to read it. Although .csv files are very common, we will focus on .dta files, since this is the usual file format of the data in Wooldridge (2013). (This is the format used by the very popular software “Stata”.)

There is a read.csv() function. So shouldn’t there be a read.dta() function too? Well, yes, but it is a bit more complicated. In the lower right window there is a tab called “Packages”. Click on it and you will see a list of (maybe strange-sounding) titles. They refer to so-called packages, which can be understood as collections of functions that work in R. There are very many of them. To keep R “light”, not all of them come with the initial installation file that you downloaded, but you can add them whenever you need them.

A further feature of packages is that you have to activate (load) them before you can use their functions. For our purposes we need the package foreign. It allows R to read .dta files so that we can work with them. Since it should already be installed, search for it in the list and check the box beside it. You will get a message in the console after that.

Alternatively, you could type

library(foreign)

into the script and execute the command. This is actually the way you should do it, because scripts are meant to allow other people to see what you did.
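If you are not sure whether a package is already installed, the script can check this itself. The lines below are a minimal sketch using the foreign package from the text; the install step only runs when the package is missing.

```r
# Check whether 'foreign' is installed; install it only if it is missing
if (!requireNamespace("foreign", quietly = TRUE)) {
  install.packages("foreign")
}

# Load the package so that its functions (e.g. read.dta) are available
library(foreign)

# List the packages that are currently attached
print(search())
```

Installing has to be done only once per computer, whereas library() has to be called in every new R session.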

Once you have done this, go to this page and download the file ceosal1.dta. (You must mark the second point so that you download the dta- and not the des-file.) Save the file in a convenient folder where you can easily find it again, ideally the folder where you keep everything you need for your project.

In order to make things easier we have to switch the working directory. This is done by clicking “Session – Set Working Directory – Choose Directory…” You should choose a folder that you have created for a project, e.g. “example_01”. This is (should be) also the folder into which you downloaded the ceosal1.dta file.

Again, there is a script version that you should use instead of clicking your way through. Under Windows it could look like this:

setwd('C:/Users/Name/R_Projects/example_01')

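The function setwd() tells R where to set the new working directory. Note the quotation marks and the forward slashes! A Windows path copied from the Explorer uses backslashes, which R treats as escape characters inside a string.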
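The working directory can also be checked and created from the script. The sketch below uses a temporary folder as the base so that it runs on any machine; base_dir and project_dir are illustrative names, and you would replace tempdir() with your own project location.

```r
# Show the current working directory
print(getwd())

# Base folder for the example: a temporary directory (an assumption for
# illustration; use e.g. 'C:/Users/Name/R_Projects' on your own machine)
base_dir <- tempdir()

# Build the project path from its parts instead of typing the separators
project_dir <- file.path(base_dir, "example_01")

# Create the folder if it does not exist yet
if (!dir.exists(project_dir)) {
  dir.create(project_dir, recursive = TRUE)
}

# Switch to the folder and list its contents
setwd(project_dir)
print(list.files())
```

Building the path with file.path() avoids the backslash problem altogether, because R inserts the separators itself.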

After downloading the data file and switching the working directory to the folder that contains it, enter

ceosal1 <- read.dta('ceosal1.dta')

That’s it. You just imported your first dataset into R. In the upper right window a line with “ceosal1” and “209 obs. of 12 variables” should appear. The ceosal1 <- part might confuse you. It is the way R assigns names to objects so that you can refer to them later. “ceosal1” is nothing but the name that is then shown in the upper right window. <- is the assignment operator: it tells R to save what follows under the name in front of it. Alternatively, you could use a simple equality sign (“=”), but the “arrow” is the more common convention in R code.
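To try out assignment and a first look at an imported object without the downloaded file, the following sketch writes a built-in data frame to a temporary .dta file and reads it back. The data set mtcars, the file name demo.dta and the object name demo are stand-ins chosen for illustration, not the Wooldridge data.

```r
library(foreign)

# Stand-in data: write a built-in data frame to a temporary .dta file
# (mtcars and the temporary path are illustrative assumptions)
dta_path <- file.path(tempdir(), "demo.dta")
write.dta(mtcars, dta_path)

# Read the file and store the result under a name of our choice
demo <- read.dta(dta_path)

print(dim(demo))      # number of observations and variables
print(names(demo))    # variable names
str(demo)             # type and first values of each variable
print(head(demo))     # first six rows
print(summary(demo$mpg))
```

The same inspection functions work on ceosal1 once it has been imported, e.g. str(ceosal1) or head(ceosal1).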

A last small addition: If you already know the link to your data set, you can download it into your working directory from within R. For this purpose you can use the download.file() function. All you have to type in is the URL of the file as a string and the name under which it should be stored in your directory. It proved useful to specify the mode of the file, at least with the data for Wooldridge (2013), since otherwise the error message “a binary read error occurred” might show up. This can be prevented by declaring the downloaded file a binary file, which is achieved by adding the option mode='wb' to the download.file() call.

download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/ceosal1.dta','ceosal1.dta',mode='wb')
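If you run the script repeatedly, you may not want to download the file every time. The sketch below wraps download.file() in a small helper that skips the download when the file already exists; download_if_missing is a hypothetical name, not a function that ships with R.

```r
# Hypothetical helper: download a file only if it is not there yet
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("File already present, skipping download: ", destfile)
    return(invisible(FALSE))
  }
  # mode = 'wb' writes the file in binary mode
  status <- download.file(url, destfile, mode = "wb")
  if (status != 0) {
    stop("Download failed with status ", status)
  }
  invisible(TRUE)
}

# Example call (commented out because it needs an internet connection):
# download_if_missing('http://fmwww.bc.edu/ec-p/data/wooldridge/ceosal1.dta',
#                     'ceosal1.dta')
```

The helper returns TRUE when it downloaded the file and FALSE when it found an existing copy, so a script can react to either case.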

Summing up, your script should now look like this:

setwd('C:/Users/Name/R_Projects/example_01')
library(foreign)
download.file('http://fmwww.bc.edu/ec-p/data/wooldridge/ceosal1.dta','ceosal1.dta',mode='wb')
ceosal1 <- read.dta('ceosal1.dta')

The commands above set your working directory, load the required package, download the data set into your working directory and read the data into R. Now you are fully prepared to start your analysis.

Still motivated and want to go to the next step? Click here.
