It depends on the format these citations can have. One option would be:
str_extract_all(line, "(?<=@)\\w+")
Returning:
[1] "REF1" "REF2" "REF3" "REF4" "REF5"
This regex uses a lookbehind (the part between (?<= and )), which checks whether something exists immediately before the current position. In this case, the lookbehind contains only the @.
The detail is that the @, because it is inside a lookbehind, is not part of the match, so the regex returns only what comes after it, which in this case is \\w+. The shortcut \w means "letters, digits, or the character _", and the quantifier + means "one or more occurrences".
The other answer suggested using \\w*, but * means "zero or more occurrences", so if there is a lone @ (with no \w character after it), an empty match is returned. See the difference:
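To make the effect of the lookbehind concrete, here is a small sketch comparing the same pattern with and without it. The sample string is made up for illustration, and the code assumes the stringr package is installed.

```r
library(stringr)

# Hypothetical sample text, not taken from the question
line <- "see @REF1 and @REF2"

# Without the lookbehind, the @ is consumed and becomes part of each match
with_at <- str_extract_all(line, "@\\w+")[[1]]

# With the lookbehind, the @ is only checked for, not returned
without_at <- str_extract_all(line, "(?<=@)\\w+")[[1]]

print(with_at)     # "@REF1" "@REF2"
print(without_at)  # "REF1" "REF2"
```

Note that str_extract_all returns a list with one element per input string, which is why the sketch takes [[1]].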
line <- 'Teste @ abc @REF1'
str_extract_all(line, "(?<=@)\\w+")
str_extract_all(line, "(?<=@)\\w*")
The first returns:
[1] "REF1"
And the second returns:
[1] "" "REF1"
If you want, you can be more specific (but then it depends on the exact format of the citation). For example, if the format is always "3 uppercase letters and 1 digit", you can use:
str_extract_all(line, "(?<=@)[A-Z]{3}[0-9]")
Since no more details were given about the format, I will leave only this suggestion, but ideally you should be as specific as possible to avoid false positives.
For example, since \w also matches the character _, the string @___ is considered valid. But of course, if you know that such cases do not occur in your strings, using \w is not much of a problem. It all depends on your data.
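The false positive described above can be reproduced with a short sketch. The input string is hypothetical, and the stricter pattern assumes the "3 uppercase letters and 1 digit" format mentioned earlier.

```r
library(stringr)

# Hypothetical input: one real reference plus underscore-only noise
line <- "valid @REF1, noise @___"

# \w+ accepts underscores, so the noise is extracted too
loose <- str_extract_all(line, "(?<=@)\\w+")[[1]]

# The more specific pattern only accepts 3 uppercase letters and 1 digit
strict <- str_extract_all(line, "(?<=@)[A-Z]{3}[0-9]")[[1]]

print(loose)   # "REF1" "___"
print(strict)  # "REF1"
```

Which of the two is preferable depends on how much you trust the input: the loose pattern is more tolerant of format variations, while the strict one rejects anything unexpected.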
Thank you! To be more specific, I know that all citations are in the format @surname: @Vieira2019. So I know that the last 4 characters will be digits. Could that help? – Willian Vieira
@Willianvieira In that case it could be "(?<=@)[A-Za-z]+[0-9]{4}" (a variable number of letters, ending with exactly 4 digits). – hkotsubo