I am trying to scrape the Web of Science website, but I am having trouble scraping links from it.
My intention is to scrape the article titles and the links that lead to each article's page within the Web of Science, so that I can then scrape other data such as the abstract and keywords. Finally, I want to loop through the result pages and collect this information up to the last page of the search.
I started with the following code:
library(rvest)
library(dplyr)

# Search results page (session-specific URL, requires an active session/SID)
link <- paste0("https://apps.webofknowledge.com/Search.do?",
               "product=WOS&SID=5Bzr6AeuFKanEWXAFWh&search_mode=GeneralSearch&",
               "prID=45752f34-12a3-474b-8fcf-6b21a2196ed7")

page <- read_html(link)

# Article titles
titulo_artigo <- page %>%
  html_nodes(".snowplow-full-record value") %>%
  html_text()

# Links to each article's full record
links_dos_artigos <- page %>%
  html_nodes(".snowplow-full-record value") %>%
  html_attr("href")
However, links_dos_artigos returns only NA values, not the links I need.
I’d appreciate it if someone could help.
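Note that html_attr("href") returns NA for every matched node that lacks an href attribute, which suggests the selector is picking up the inner <value> elements holding the title text rather than the enclosing <a> tags. Below is a minimal sketch of that idea; it assumes the links live on <a class="snowplow-full-record"> elements and that the full-record URLs come back relative, neither of which can be verified here because the page sits behind a login.

# Assumed fix: select the <a> elements themselves, which carry the href,
# instead of their <value> children, which only hold the title text.
titulo_artigo <- page %>%
  html_nodes("a.snowplow-full-record value") %>%
  html_text()

links_dos_artigos <- page %>%
  html_nodes("a.snowplow-full-record") %>%
  html_attr("href")

# If the hrefs are relative (e.g. "/full_record.do?..."), prepend the host:
links_completos <- paste0("https://apps.webofknowledge.com", links_dos_artigos)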
Welcome to Stack Overflow in Portuguese! Dheynne, the link you posted is for a page behind a login, so you need to pass the session parameters to access it. I don't know if it still works, but have you tried the wosr package? – Daniel Ikenaga
I haven't tested the wosr package yet, so I'll take a look at it. As for the link, it really does have restricted access; here I can reach it through the network of my Federal Educational Institution (free access). I will take a look at the package you mentioned. Thank you very much, Daniel Ikenaga!
– DHEYNNE ALVES
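Since the comments point to wosr, here is a minimal sketch of what that route could look like. It assumes the package's auth(), query_wos() and pull_wos() functions and IP-based institutional access; the query string and the fields inspected at the end are illustrative only.

library(wosr)

# With IP-based institutional access, username/password can stay NULL.
sid <- auth(username = NULL, password = NULL)

# Illustrative query in Web of Science advanced-search syntax.
query <- 'TS = ("web scraping")'

# Check how many records match before downloading.
query_wos(query, sid = sid)

# pull_wos() returns the records as a list of data frames
# (publication, author, keyword, ...), so no HTML scraping is needed.
dados <- pull_wos(query, sid = sid)

head(dados$publication)  # titles, abstracts and other record-level fields
head(dados$keyword)      # author keywords per record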