How to import excel data into R using column name and row name

I am a R novice and was wondering how to import excel data into R using row names and column names. Specifically i require a subset of the data in a number of worksheet within one excel file. Can i use row names and column names to identify and extract certain cells of data to R ?

Worksheet 1
----------
* X Y Z 
A 1 2 2
B 1 1 1
C 1 3 4
D 4 2 2
E 2 2 2 
----------
Worksheet 2
----------
*  X Y1 Z1 
A 1  2  2
B 1  2  3
C 1  3  4
D 4  1  1
E 2  1  1 

For example in the above spreadsheet how could i extract the data (2,2,2,2) using the row and column names (D,Y) (D,Z) (E,Y) (E,Z) in worksheet 1

how could i extract the data (1,1,1,1) using the row and column names (D,Y1) (D,Z1) (E,Y1) (E,Z1) in worksheet 2 ?

Thanks for any help provided

Barry

Answers


@Andrie mentionned the XLConnect package, it's a very useful package for I/O between R and Excel with the possibility to select region in Excel sheet.

I created an Excel file like yours in my Dropbox public folder, you can download the example.xls file here.

require(XLConnect)

## A5:C5 correspond to (D,Y) (D,Z) (E,Y) (E,Z)  in your example
selectworksheet1 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                               sheet = "Worksheet1", 
                               region = "A5:C5", header = FALSE)

selectworksheet1
##  Col0 Col1 Col2
## 1    2    2    2


## B4:C5 correspond to (D,Y1) (D,Z1) (E,Y1) (E,Z1) in the second example
selectworksheet2 <- readWorksheetFromFile("/home/ahmadou/Dropbox/Public/example.xls",
                         sheet = "Worksheet2", 
                         region = "B4:C5", header = FALSE)

selectworksheet2
##   Col0 Col1
## 1    1    1
## 2    1    1

unlist(selectworksheet2)
## Col01 Col02 Col11 Col12 
##    1     1     1     1 

There are several packages which provide functions to import Excel data to R; see the R data import/export documentation.

I've found the xlsx package to be useful (it will read both .xls and .xlsx files). I don't believe that it will accept row/column names as input, but it will accept their numerical value (row 1, column 4 for example). In your case, something like this should work, assuming that X, Y and Z correspond to columns 1-3:

library(xlsx)
# first example subset; call it ss1
# assume first row is not a header; otherwise requires header = T
ss1 <- read.xlsx("myfile.xlsx", sheetIndex = 1, rowIndex = 4:5, colIndex = 2:3)

# second example subset; call it ss2
# just the same except worksheet index = 2
ss2 <- read.xlsx("myfile.xlsx", sheetIndex = 2, rowIndex = 4:5, colIndex = 2:3)

However, you will need to experiment with your own file until things work as expected. You can also specify sheetName but I find sheetIndex normally works more reliably, once you figure out the correct index for each sheet. And take care if the first row is a header.

Having said all that: my preferred option would be to export the sheet to a text format such as CSV, use shell tools (cut, head, tail etc.) to get the required rows/columns and import that to R.


Need Your Help

Xcode only lets me run for an iOS device (no simulator)

xcode simulator

I just cloned a project into Xcode and am trying to run it in the simulator. However, my only option is iOS Device. This is unique to this project only and I don't know how to fix it. Anyone else run