Extract a table from a website into RStudio


Asked 7 years, 9 months ago


Hello, I want to take the Brasileirão (Brazilian championship) table, for example from the site "http://globoesporte.globo.com/futebol/brasileirao-serie-a/", and load it into a dataset in RStudio, so that whenever the table is updated as games are played, the dataset in RStudio is updated as well. Can anyone help me?

1 answer


by Rui Barradas • **15,422** points · Answer 1 · 2017-10-10T18:54:33+00:00

For that I usually use the XML package. It lets you specify which table on the web page you are interested in. In this case the page has several. The third has nothing of interest, so I extracted tables 1, 2, and 4.

library(XML)

URL <- "http://globoesporte.globo.com/futebol/brasileirao-serie-a/"

# 'which' selects a table by its position on the page
tabela1 <- readHTMLTable(URL, which = 1)
tabela1

tabela2 <- readHTMLTable(URL, which = 2)
tabela2

# Table 3 is skipped because it has nothing of interest
tabela4 <- readHTMLTable(URL, which = 4)
tabela4

Note that you can use arguments of the base R function read.table; in particular, the argument stringsAsFactors can be important.
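
As a rough sketch of how this could be wrapped up for repeated use: the helper below, read_tables, is a hypothetical name that is not part of the XML package or of the answer above. It parses the page once, reads the requested tables with stringsAsFactors = FALSE so that text columns stay as character vectors, and returns them as a list. The small HTML string is made-up data used only so the example runs without a network connection. It assumes that readHTMLTable passes stringsAsFactors through when building the data frame.

```r
library(XML)

# Hypothetical helper (an assumption, not from the original answer):
# parses the page once and returns the requested tables as a list of
# data frames. Extra arguments (e.g. asText = TRUE) go to htmlParse().
read_tables <- function(source, indices, ...) {
  doc <- htmlParse(source, ...)
  lapply(indices, function(i) {
    # stringsAsFactors = FALSE keeps text columns as character vectors
    readHTMLTable(doc, which = i, stringsAsFactors = FALSE)
  })
}

# Offline demonstration on a made-up two-row table
html <- "<html><body><table>
  <tr><th>Team</th><th>Points</th></tr>
  <tr><td>A</td><td>10</td></tr>
  <tr><td>B</td><td>8</td></tr>
</table></body></html>"

demo <- read_tables(html, indices = 1, asText = TRUE)
str(demo[[1]])

# Against the live page, each call downloads the page again, so re-running
# this line after a round of games should pick up the updated tables:
# tabelas <- read_tables(URL, indices = c(1, 2, 4))
```

Nothing here refreshes automatically. The data in RStudio only changes when the function is called again, so one option is simply to re-run that last line whenever fresh numbers are needed.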