Download a table from a website in data.frame format in R

Asked 7 years, 2 months ago

Viewed 410 times

Good evening,

I need to download the table titled "Fundamental indicators" from the following link:

http://www.fundamentus.com.br/detalhes.php?papel=PETR4

I intend to download this table for every stock on the exchange, using a loop over a vector that contains all the ticker codes.

However, I don’t know how to read the table into R as a data frame.

Thanks in advance

1 answer

by Guilherme Parreira • **2,060** points · Answer 1 · 2019-05-01T21:16:35+00:00

Answering this involves web scraping and regular expressions. I am no expert in the former, though I do use the latter, and I believe I can help.

To download the table, you can use the function htmltab::htmltab:

library(htmltab)
a <- htmltab("http://www.fundamentus.com.br/detalhes.php?papel=PETR4", which = 3)

The argument which = 3 selects which table on the page to download, in this case the third. From there, you can write whatever loop you need and do some basic editing so the table ends up in the shape you want:

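# Columns 3-4 and 5-6 each hold one label/value pair; stack them into two columns
b <- a[, c(3,4)]
a <- rbind(b, a[, c(5,6)])
names(a) <- c("Indicadores", "Valor")
# Remove the literal "?" characters from the indicator labels
a$Indicadores <- gsub("\\?", "", a$Indicadores)
# Drop rows containing NA and reset the row names
a <- na.omit(a)
row.names(a) <- NULL
a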
       Indicadores Valor
1              P/L 13,72
2             P/VP  1,28
3           P/EBIT  3,58
4              PSR  1,01
5         P/Ativos  0,41
6      P/Cap. Giro  7,60
7  P/Ativ Circ Liq -0,82
8       Div. Yield  3,3%
9        EV / EBIT  6,30
10     Giro Ativos  0,41
11  Cres. Rec (5a) -0,5%
12             LPA  1,98
13             VPA 21,25
14     Marg. Bruta 35,6%
15      Marg. EBIT 28,2%
16   Marg. Líquida  7,6%
17    EBIT / Ativo 11,5%
18            ROIC 12,7%
19             ROE  9,3%
20   Liquidez Corr  1,48
21  Div Br/ Patrim  1,18
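
Since the question asks for a loop over a vector of ticker codes, here is a minimal sketch of how the steps above might be wrapped in functions and applied to several tickers. The function names (tidy_indicators, get_indicators, get_all_indicators), the extra Papel column, and the fetch argument are illustrative assumptions and not part of htmltab. It also assumes every ticker page has the same layout as PETR4, so that which = 3 and columns 3 to 6 still apply. Unlike the code above, the column names are set before rbind so that the two halves always line up.

```r
# Tidy step from above, wrapped in a function (hypothetical name).
# `raw` is the data frame returned by htmltab(url, which = 3).
tidy_indicators <- function(raw) {
  first  <- raw[, c(3, 4)]
  second <- raw[, c(5, 6)]
  # Give both halves the same names so rbind can match the columns
  names(first) <- names(second) <- c("Indicadores", "Valor")
  out <- rbind(first, second)
  out$Indicadores <- gsub("\\?", "", out$Indicadores)
  out <- na.omit(out)
  row.names(out) <- NULL
  out
}

# Download and tidy the table for one ticker.
# `fetch` defaults to htmltab::htmltab; it is a parameter only so that
# the download step can be replaced, for example when testing offline.
get_indicators <- function(ticker, fetch = htmltab::htmltab) {
  url <- paste0("http://www.fundamentus.com.br/detalhes.php?papel=", ticker)
  out <- tidy_indicators(fetch(url, which = 3))
  out$Papel <- ticker  # record which stock each row belongs to
  out
}

# Loop over a vector of tickers; a ticker that fails is reported and skipped.
get_all_indicators <- function(tickers, fetch = htmltab::htmltab) {
  results <- lapply(tickers, function(tk) {
    tryCatch(
      get_indicators(tk, fetch = fetch),
      error = function(e) {
        message("Skipping ", tk, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  results <- Filter(Negate(is.null), results)
  do.call(rbind, results)
}

# Example call (needs network access and the htmltab package):
# todos <- get_all_indicators(c("PETR4", "VALE3"))
```

The result is one long data frame with a Papel column identifying the stock. When looping over many tickers, it may be polite to add a short Sys.sleep() between requests.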

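Finally, you would still need to clean up the Valor column: some indicators are percentages (with a % sign) and others are plain numbers, and all of them use a comma as the decimal separator, so the column is stored as text.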
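
The clean-up described above could look roughly like the sketch below. The helper name parse_valor is hypothetical. It assumes that "." only ever appears as a thousands separator and "," as the decimal separator, as is usual for Brazilian number formatting. Percentages are returned as fractions, so "3,3%" becomes 0.033.

```r
# Hypothetical helper: convert strings such as "13,72" or "3,3%" to numbers.
parse_valor <- function(x) {
  is_pct <- grepl("%", x, fixed = TRUE)           # remember which entries are percentages
  clean  <- gsub("%", "", x, fixed = TRUE)        # drop the percent sign
  clean  <- gsub(".", "", clean, fixed = TRUE)    # drop thousands separators (assumed to be ".")
  clean  <- gsub(",", ".", clean, fixed = TRUE)   # decimal comma -> decimal point
  num    <- as.numeric(trimws(clean))
  ifelse(is_pct, num / 100, num)                  # percentages become fractions
}

# Example with values taken from the table above
valores <- c("13,72", "3,3%", "-0,5%", "-0,82")
parse_valor(valores)
```

It would be applied to the table with a$Valor <- parse_valor(a$Valor). If you need to know later which indicators were originally percentages, store grepl("%", a$Valor, fixed = TRUE) in a separate column before converting.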