Good evening. I'm trying to pull data from Google Scholar with RSelenium, but I'm having a hard time getting the information for the journals I'm looking for.
I am running the code below:
# First I build the vector of journals I want to look up
teste <- c("Revista de Direito Administrativo", "ARSP. ARCHIV FUR RECHTS- UND SOZIALPHILOSOPHIE",
           "ANTITRUST BULLETIN")
Right after that, I run the function below:
get_journal <- function(teste) {
  # Open the Google Scholar top-venues page
  remDr$navigate("https://scholar.google.com/citations?view_op=top_venues&hl=pt-BR&vq=en")
  final <- c()
  for (i in 1:length(teste)) {
    # Reload the current page before each search
    remDr$refresh()
    Sys.sleep(1)

    # Type the journal name into the search box
    address_element <- remDr$findElement(using = "class", value = "gs_in_txt")
    address_element$sendKeysToElement(list(teste[i]))

    # Click the search button and wait for the results
    button_element <- remDr$findElement(using = "class", value = "gs_wr")
    button_element$clickElement()
    Sys.sleep(3)

    # Read the text of the first element with class "gsc_mvt_n"
    out <- remDr$findElement(using = "class", value = "gsc_mvt_n")
    output <- out$getElementText()

    final <- c(final, output)
  }
  return(final)
}
vector_out <- get_journal(teste)  
data.frame(teste, purrr::flatten_chr(vector_out)) %>%
  dplyr::mutate(., vector_out = stringr::str_remove_all(vector_out, "\\(|\\)")) %>%
  tidyr::separate(., vector_out, into = c("H5", "MedianaH5"), sep = ",")
But it returns a data frame with NA values (example below):
                                           teste purrr..flatten_chr.vector_out.        H5 MedianaH5
1              Revista de Direito Administrativo                      Índice h5 Índice h5      <NA>
2 ARSP. ARCHIV FUR RECHTS- UND SOZIALPHILOSOPHIE                      Índice h5 Índice h5      <NA>
3                             ANTITRUST BULLETIN                      Índice h5 Índice h5      <NA>
Warning message:
Expected 2 pieces. Missing pieces filled with `NA` in 3 rows [1, 2, 3].
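To narrow it down, here is a minimal sketch of the last step in isolation, using base R only. It assumes the scraped value for every journal is just the text "Índice h5", as the second column of the output above suggests. The names `scraped`, `cleaned`, `pieces` and `n_pieces` are made up for this illustration, and `strsplit()` stands in for `tidyr::separate(sep = ",")`.

```r
# Assumption: this is what get_journal() returned for the three journals,
# based on the printed output (a header label, not the numbers).
scraped <- c("Índice h5", "Índice h5", "Índice h5")

# Same cleanup as in the pipeline: strip any parentheses.
cleaned <- gsub("\\(|\\)", "", scraped)

# Split on the comma, as a stand-in for tidyr::separate(sep = ",").
pieces <- strsplit(cleaned, ",", fixed = TRUE)

# Count how many pieces each row produced.
n_pieces <- lengths(pieces)
print(n_pieces)
```

Each row gives a single piece because the text contains no comma, which would explain the "Expected 2 pieces" warning and the NA in MedianaH5. So the problem seems to be in what the function scrapes rather than in the data frame step.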
Can anyone help?