Extract a table from a website in RStudio

1 answer


For this I usually use the XML package. Its readHTMLTable function lets you pick which table on the page you want through the which argument. This page has several tables. The third contains nothing of interest, so I extracted tables 1, 2, and 4.

library(XML)

URL <- "http://globoesporte.globo.com/futebol/brasileirao-serie-a/"

tabela1 <- readHTMLTable(URL, which = 1)
tabela1

tabela2 <- readHTMLTable(URL, which = 2)
tabela2

tabela4 <- readHTMLTable(URL, which = 4)
tabela4
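
To work out which index to pass to which, it can help to read every table first and inspect the result. The following is a minimal sketch, assuming a small made-up HTML string in place of the live page so that it runs without network access; the table contents and the names html, doc, and tabelas are illustrative only.

```r
library(XML)

# Hypothetical page with two tables, standing in for the real site
html <- "<html><body>
<table><tr><th>Time</th><th>P</th></tr><tr><td>A</td><td>10</td></tr></table>
<table><tr><th>Rodada</th></tr><tr><td>1</td></tr><tr><td>2</td></tr></table>
</body></html>"

# Parse the string as HTML (asText = TRUE means it is content, not a file name)
doc <- htmlParse(html, asText = TRUE)

# Without `which`, readHTMLTable returns a list with one element per table
tabelas <- readHTMLTable(doc, header = TRUE)
length(tabelas)

# Row and column counts of each table help decide which ones are worth keeping
lapply(tabelas, dim)
```

The same call on the URL above should return one list element per table on the page, which can then be indexed like any other list, for example tabelas[[4]].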

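Note that readHTMLTable also accepts some of the arguments used by base R's read.table. The stringsAsFactors argument in particular can be important, since it controls whether text columns come back as factors or as character vectors.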
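
As an illustration of the stringsAsFactors point, here is a small sketch, again assuming a made-up inline table rather than the real page (the html, doc, and tab names and the cell values are hypothetical). Cells are read as text, so numeric columns usually still need an explicit conversion.

```r
library(XML)

# Hypothetical single-table page used only for this example
html <- "<table>
<tr><th>Team</th><th>Points</th></tr>
<tr><td>A</td><td>10</td></tr>
<tr><td>B</td><td>7</td></tr>
</table>"

doc <- htmlParse(html, asText = TRUE)

# Keep text columns as character vectors instead of factors
tab <- readHTMLTable(doc, which = 1, header = TRUE, stringsAsFactors = FALSE)
str(tab)

# The second column arrives as text; convert it to numbers explicitly
tab[[2]] <- as.numeric(tab[[2]])
str(tab)
```

Converting from character is safer than converting from factor: as.numeric on a factor returns the internal level codes rather than the displayed values.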
