This answer involves web scraping and regular expressions. I am no expert in the former, though I do use the latter, and I believe I can help.
To download the data, you can use the function htmltab::htmltab:
library(htmltab)
a <- htmltab("http://www.fundamentus.com.br/detalhes.php?papel=PETR4", which = 3)
The argument which = 3 selects which table on the page to download, in this case the 3rd. From there, you can wrap the call in whatever loops you need and do some basic editing to get the table into the shape you want:
b <- a[, c(3,4)]
a <- rbind(b, a[, c(5,6)])
names(a) <- c("Indicadores", "Valor")
a$Indicadores <- gsub("\\?", "", a$Indicadores)
a <- na.omit(a)
row.names(a) <- NULL
a
Indicadores Valor
1 P/L 13,72
2 P/VP 1,28
3 P/EBIT 3,58
4 PSR 1,01
5 P/Ativos 0,41
6 P/Cap. Giro 7,60
7 P/Ativ Circ Liq -0,82
8 Div. Yield 3,3%
9 EV / EBIT 6,30
10 Giro Ativos 0,41
11 Cres. Rec (5a) -0,5%
12 LPA 1,98
13 VPA 21,25
14 Marg. Bruta 35,6%
15 Marg. EBIT 28,2%
16 Marg. Líquida 7,6%
17 EBIT / Ativo 11,5%
18 ROIC 12,7%
19 ROE 9,3%
20 Liquidez Corr 1,48
21 Div Br/ Patrim 1,18
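If you want the same table for several tickers, the loop mentioned earlier could be sketched roughly as follows. This is only an illustration: build_url and fetch_indicators are hypothetical helper names, VALE3 is an assumed second ticker, and the sketch assumes every detail page has the same layout as the PETR4 one (so that which = 3 still points at the right table).

```r
# Hypothetical helper: build the detail-page URL for one ticker.
build_url <- function(ticker) {
  paste0("http://www.fundamentus.com.br/detalhes.php?papel=", ticker)
}

# Hypothetical helper: download the 3rd table of the page for one ticker.
# Assumes the htmltab package is installed; it is only needed when called.
fetch_indicators <- function(ticker) {
  htmltab::htmltab(build_url(ticker), which = 3)
}

tickers <- c("PETR4", "VALE3")  # VALE3 is an assumption, used as an example

# Needs network access, so it is left commented out here:
# tables <- lapply(tickers, fetch_indicators)
# names(tables) <- tickers
```

Each element of tables would then go through the same editing steps shown above.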
Finally, you would still need to fix the Valor column, because some indicators are given as percentages (with a % sign) and others apparently are not.
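One possible way to do that cleanup is sketched below. parse_valor is a hypothetical helper, not part of htmltab. It assumes the values use the Brazilian number format (comma as the decimal mark, period as the thousands separator) and that you want percentages stored as fractions; adjust it if you prefer to keep 3,3% as 3.3.

```r
# Hypothetical helper: convert strings such as "13,72" or "3,3%" to numbers.
parse_valor <- function(x) {
  x <- trimws(x)
  is_pct <- grepl("%", x, fixed = TRUE)
  num <- gsub("%", "", x, fixed = TRUE)     # drop the percent sign
  num <- gsub(".", "", num, fixed = TRUE)   # drop thousands separators (assumed)
  num <- gsub(",", ".", num, fixed = TRUE)  # decimal comma -> decimal point
  num <- as.numeric(num)
  # Store percentages as fractions, e.g. "3,3%" becomes 0.033.
  ifelse(is_pct, num / 100, num)
}

valores <- c("13,72", "-0,82", "3,3%", "-0,5%")
parse_valor(valores)
```

With the table above, this would be applied as a$Valor <- parse_valor(a$Valor).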
Excellent, congratulations, very helpful, thank you
– Henrique Faria de Oliveira
This is no longer working for me. Now, when I run htmltab, it gives the error: No encoding supplied: defaulting to UTF-8. Error: Couldn't find the table. Try passing (a different) information to the which argument. In addition: Warning message: XML content does not seem to be XML: ' Does anyone know how to get around this? Thanks,
– Rafael