How do I get R to repeat a request?

I’m trying to extract data from the Vagalume API in R. For this, I used the "vagalumeR" package, and I took an example of code from here: https://brunaw.github.io/vagalume/vagalumeR.html

The problem: when trying to extract the lyrics of a certain artist’s songs I get an error almost always, but not always, which makes me assume the error is the result of instability in the API. Therefore, I would like to know how to have R repeat the request until the lyrics are actually extracted.

library(vagalumeR)
library(plyr)
library(tidyverse)
# key must already hold your Vagalume API key
artist <- "portela"
song <- songNames(artist)
let <- ldply(map(song$song.id[1:62], lyrics, type = "id",
                 key = key), data.frame)

Error in if (cont$Mus[[1]]$lang > 1) { : argument is of length zero

1 answer

Retry until it succeeds

The following solution is neither efficient nor elegant, but it does what you asked.

lyrics2 <- function(x, type, key) {
  # Try to fetch the data
  res <- try(lyrics(x, type = type, key = key))
  if (inherits(res, 'try-error')) { # check whether an error occurred
    cat('An error occurred with the song of id:', as.character(x), '\n')
    Sys.sleep(abs(rnorm(1, 2)))
    return(lyrics2(x, type, key)) # recursive call
  }
  res # return the result if there was no error
}

The idea of lyrics2 is basically to try to call the function and, if an error occurs, have the function call itself (lyrics2) recursively. This way it only stops calling itself once a request completes without error.

The function try() attempts to execute the code inside it without stopping if an error occurs. In that case it returns an object with the class 'try-error' containing the error message.
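A minimal self-contained illustration of that behavior, using log() on a non-numeric value as a stand-in for a failing API call:

```r
# try() captures the error in res instead of stopping execution
res <- try(log("a"), silent = TRUE)
inherits(res, "try-error")  # TRUE: the error was caught, not raised
```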

If an error has occurred, the function Sys.sleep() puts the system to "sleep". The argument passed to it is the absolute value of a random number whose mean is 2, so the function waits about two seconds between attempts. The idea is to be "kind" to the server.

It can happen that the code runs indefinitely (or at least for a very long time) because there is some problem with the API for certain IDs (such as '3ade68b8g2009b0b3').
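One way to avoid that is to cap the number of attempts. The sketch below is a hypothetical variant (a generic retry() helper, not part of vagalumeR) that gives up after max_tries failures; the stub flaky() stands in for a call like lyrics():

```r
# Bounded retry: stop after max_tries attempts instead of recursing forever
retry <- function(f, ..., max_tries = 5) {
  for (i in seq_len(max_tries)) {
    res <- try(f(...), silent = TRUE)
    if (!inherits(res, "try-error")) return(res)
    Sys.sleep(abs(rnorm(1, 2)))  # be kind to the server between attempts
  }
  res  # still a 'try-error' after max_tries failures
}

# Stub standing in for lyrics(): fails twice, then succeeds
counter <- 0
flaky <- function() {
  counter <<- counter + 1
  if (counter < 3) stop("temporary API failure")
  "lyrics text"
}
retry(flaky)  # succeeds on the third attempt
```

With a real call you would write, e.g., retry(lyrics, song$song.id[1], type = "id", key = key), and then check whether the value returned still inherits from 'try-error'.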

let2 <- map(song$song.id, lyrics2, type = "id", key = key)

Eliminate failures

For this reason, another option is to run the code once and discard the failures.

df_erro <- data.frame(
  id = 'erro', name = 'erro', song.id = 'erro',
  song = 'erro', language = NA_integer_, text = 'erro'
)
safe_lyrics <- safely(lyrics, otherwise = df_erro, quiet = FALSE)

The code above uses some concepts from the purrr package (already loaded with the tidyverse).

The first of these is the idea of a safe function. safely() returns the same function passed as its first argument, but with one modification: it now returns a list with two elements. The first, called result, contains the result when everything goes well; the second, called error, contains the error message. The second argument (otherwise) is the result to return in the error cases; it is useful because it lets you join the results later.
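A self-contained illustration of safely() with a toy function (half() is hypothetical, standing in for lyrics()):

```r
library(purrr)

half <- function(x) {
  if (!is.numeric(x)) stop("not a number")
  x / 2
}
safe_half <- safely(half, otherwise = NA_real_)

safe_half(10)   # list(result = 5, error = NULL)
safe_half("a")  # list(result = NA, error = <the condition object>)
```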

The second is the use of map_df('result') to map over all the lists returned by the first map, extract the result element from each, and bind them into a single data.frame.

let3 <- map(song$song.id, safe_lyrics, type = "id", key = key) %>% 
  map_df('result')

The result above is a data.frame with 62 observations, one for each Portela song. Errors come back with the content of df_erro and can be located with which(let3$id == 'erro').
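Assuming let3 has that shape, the failed rows can then be separated out. The toy data.frame below stands in for the real result:

```r
library(dplyr)

# Toy stand-in for let3: the 'erro' row represents a failed request
let3 <- data.frame(
  id   = c("7", "erro", "9"),
  song = c("A", "erro", "C"),
  text = c("lyric A", "erro", "lyric C"),
  stringsAsFactors = FALSE
)

ok    <- filter(let3, id != "erro")  # keep only the successful requests
fails <- which(let3$id == "erro")    # positions of the failures, to retry later
```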

  • Thank you so much for the help! Still, the problem has not been completely solved, since it is not tied to a specific faulty song. Applying the safe_lyrics function to a group of artists returns a different number of errors on each request. Also, I should mention that while you had problems with Portela’s song 4, I am now having them with song 61, and each day I run this function the problematic song changes. When requesting the lyrics of 658 songs, the API returned only 117 to me, the rest of the table consisting of errors.
