How to download multiple URLs in R?

I need to extract several text files that are organized in annual folders at the following url: http://www.rsssfbrasil.com/tablesae/

How do I download several of them at once? For example, if I want the 2003, 2004, and 2005 files, how can I avoid writing the same code three times?

Thank you!

  • Hello, at the address given in the question I did not find any folders or directories, just several .htm files. Usually you download the files only once and then work on them locally. What exactly do you want to do?

1 answer



There are several ways to do this; here is one that is not very elegant:

library(XML)

base <- 'http://www.rsssfbrasil.com/tablesae/'

# download the index page and parse it
download.file(base, destfile = 'test.html')
page <- htmlTreeParse('test.html', useInternalNodes = TRUE)

# collect the href attribute of every <a> tag on the page
links <- xpathSApply(page, "//a", xmlGetAttr, name = 'href')

# keep only the links that point to .htm files
links <- links[grep('htm', links)]

# download each file, keeping its original name
for (link in links) {
  download.file(paste0(base, link), destfile = link)
}
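Since the question asks for specific years (2003, 2004, and 2005), you can filter the link list before the download loop. A minimal sketch; `filter_by_year` is a hypothetical helper, and it assumes the file names on the page contain the four-digit year:

```r
# hypothetical helper: keep only the links whose names mention one of the years
filter_by_year <- function(links, years) {
  pattern <- paste(years, collapse = '|')   # e.g. "2003|2004|2005"
  links[grepl(pattern, links)]
}

# example with made-up file names
links <- c('sae2003.htm', 'sae2004.htm', 'sae2005.htm', 'sae2010.htm')
filter_by_year(links, 2003:2005)
# returns "sae2003.htm" "sae2004.htm" "sae2005.htm"
```

You would then run the same `download.file` loop over the filtered vector instead of the full link list.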
