Scraping a series of URLs in R

I'm trying to batch download a series of URLs. So far my code is:

link <- paste("http://portal.hud.gov/hudportal/documents/huddoc?id=RAD_PHAApp_", state, ".xls", sep = "")
state <- c('al','tx')
download.file(link, paste(destfile = 'Y:\\PBlack\\RAD\\', state, '.xls', sep = ""), mode = 'wb')

The idea here is that I could add more abbreviations to the state vector, and each file would be downloaded and named after its state.

R returns the following when I run the code.

Warning messages:
1: In download.file(link, paste(destfile = "Y:\\PBlack\\RAD\\", state,  :
 only first element of 'url' argument used
2: In download.file(link, paste(destfile = "Y:\\PBlack\\RAD\\", state,  :
 only first element of 'destfile' argument used

Answers


As TARehman points out, you need to have separate calls to download.file for each file. It might be more intuitive for you to do this with a for loop.

Also, using paste0 saves you from writing sep = "" every time.

states <- c('al', 'tx')

for(state in states) {
  link <- paste0("http://portal.hud.gov/hudportal/documents/huddoc?id=RAD_PHAApp_", state, ".xls")
  download.file(link,paste0("~/Desktop/",state,".xls"),mode='wb')
}

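mapply would also work here, though the run time is dominated by the downloads themselves, so any speed difference between the two is likely to be negligible.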
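
One thing the loop above does not handle is a failed download: if download.file() raises an error for one state, the loop stops and the remaining files are skipped. A minimal sketch of one way around that, assuming you want to carry on and report failures afterwards, is shown here. The helper names build_url and download_state are made up for illustration, and tempdir() stands in for whatever destination folder you actually use.

```r
# Hypothetical helper: build the URL for one state abbreviation.
build_url <- function(state) {
  paste0("http://portal.hud.gov/hudportal/documents/huddoc?id=RAD_PHAApp_",
         state, ".xls")
}

# Hypothetical helper: download one file into dest_dir.
# Returns TRUE if download.file() reports success (status 0), FALSE otherwise.
download_state <- function(state, dest_dir = tempdir()) {
  dest <- file.path(dest_dir, paste0(state, ".xls"))
  tryCatch({
    status <- download.file(build_url(state), dest, mode = "wb", quiet = TRUE)
    status == 0
  }, error = function(e) {
    # Report the problem and keep going instead of aborting the batch.
    message("Download failed for ", state, ": ", conditionMessage(e))
    FALSE
  })
}

# Example call (commented out because it needs network access):
# results <- vapply(c("al", "tx"), download_state, logical(1))
```

The commented call would return a named logical vector, one entry per state, so you can see which downloads need to be retried.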


You need to call download.file() once per file. As the warnings show, the call is using only the first element of the url and destfile vectors, so it downloads a single file rather than a vector of files.

Like so:

states <- c('al','tx')
links <- paste("http://portal.hud.gov/hudportal/documents/huddoc?id=RAD_PHAApp_", states, ".xls", sep = "")

func.download_files <- function(link, state) {
    download.file(link, paste("~/Desktop/", state, ".xls", sep = ""), mode = 'wb')
}

mapply(FUN = func.download_files,link=links,state=states)
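
As an alternative to looping, download.file() documents that its "libcurl" method accepts character vectors for url and destfile, as long as both have the same length. A rough sketch under that assumption follows. It only applies if your R build has libcurl support, which capabilities("libcurl") reports. The wrapper name download_all is hypothetical, and tempdir() is a placeholder for the real destination folder.

```r
states <- c("al", "tx")
links <- paste0("http://portal.hud.gov/hudportal/documents/huddoc?id=RAD_PHAApp_",
                states, ".xls")
# Placeholder destination: one path per state under the session temp directory.
dests <- file.path(tempdir(), paste0(states, ".xls"))

# Hypothetical wrapper: pass all URLs to a single download.file() call.
download_all <- function(urls, destfiles) {
  stopifnot(length(urls) == length(destfiles))
  if (!capabilities("libcurl")) {
    stop("This R build does not have libcurl support")
  }
  download.file(urls, destfiles, method = "libcurl", mode = "wb")
}

# Example call (commented out because it needs network access):
# download_all(links, dests)
```

If libcurl is not available, the mapply approach above remains the safer choice.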
