If you want the letter o to be the second character of the string, you can use the anchor ^, which marks the start of the string:
str_subset(string = x, regex(pattern = '^.o'))
So we have the beginning of the string (^), followed by any character (the dot, which means "any character, except line breaks"), followed by the letter o. The result in this case is:
[1] "horccaeon" "coleon" "volues" "mol" "tom"
Notice that the strings 'nao', 'auio' and 'aqoio' are left out because the letter o is not their second character (the string 'nada' is also not returned, because it does not even contain an o).
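To try this yourself, here is a minimal sketch. The vector x below is an assumption: it is rebuilt from the strings quoted in this answer, and the original x from the question may differ in content and order.

```r
library(stringr)

# Hypothetical input, rebuilt from the strings quoted in this answer
x <- c("horccaeon", "coleon", "nao", "volues", "auio",
       "mol", "tom", "aqoio", "nada")

# Keep only the strings whose second character is "o"
second_o <- str_subset(string = x, pattern = regex("^.o"))
print(second_o)
```

With this assumed x, only the five strings listed in the output above should be kept.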
To check whether the antepenultimate (third-to-last) letter is an o, you can use the anchor $, which marks the end of the string:
str_subset(string = x, regex(pattern = 'o.{2}$'))
Now we have the letter o, followed by exactly two characters (.{2}), followed by the end of the string ($). The result is:
[1] "aqoio"
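As a sanity check, the same result can be obtained without a regex by reading the third-to-last character directly. This is only a sketch: x is a hypothetical vector rebuilt from the strings quoted in this answer, and the names by_regex and by_position are made up for illustration.

```r
library(stringr)

# Hypothetical input, rebuilt from the strings quoted in this answer
x <- c("horccaeon", "coleon", "nao", "volues", "auio",
       "mol", "tom", "aqoio", "nada")

# Regex version: "o", then exactly two characters, then end of string
by_regex <- str_subset(x, regex("o.{2}$"))

# Positional version: negative indices in str_sub() count from the end
by_position <- x[str_sub(x, -3, -3) == "o"]

print(by_regex)
print(by_position)
```

Both approaches should select the same strings. The regex version is easier to adapt when the condition becomes more complex than a single fixed position.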
In general, use ^ if you want to check that the letter o is the nth character counting from the beginning, or $ if you want to check that it is a given number of positions from the end. Examples:
^o - begins with o
^.{3}o - the fourth letter is o (because any 3 characters come before it)
o$ - ends with o
o.{3}$ - the letter o, followed by 3 characters and then the end of the string (o is the fourth character from the end)
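The four patterns above can be compared side by side with str_detect(). The test words are hypothetical and were chosen only so that each pattern matches a different subset.

```r
library(stringr)

# Hypothetical test words, chosen only for illustration
words <- c("ovo", "abcd", "abco", "oabc")

starts_with_o   <- str_detect(words, "^o")      # TRUE for "ovo", "oabc"
fourth_is_o     <- str_detect(words, "^.{3}o")  # TRUE for "abco"
ends_with_o     <- str_detect(words, "o$")      # TRUE for "ovo", "abco"
fourth_from_end <- str_detect(words, "o.{3}$")  # TRUE for "oabc"

print(data.frame(words, starts_with_o, fourth_is_o,
                 ends_with_o, fourth_from_end))
```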
Of course, if you want, you can replace the dot with something more specific, for example [a-z], so that the regex only accepts the letter o when it has letters before or after it. With the dot, any character is accepted, including non-alphanumeric ones.
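A small sketch of the difference between the dot and a character class. The sample strings are hypothetical. Note that [a-z] only covers lowercase ASCII letters, so uppercase or accented letters would need a wider class.

```r
library(stringr)

# Hypothetical samples: a letter, a digit and a symbol before the "o"
samples <- c("tom", "1om", "-om")

# The dot accepts any character in the first position
with_dot <- str_detect(samples, "^.o")

# [a-z] accepts only a lowercase ASCII letter in the first position
with_class <- str_detect(samples, "^[a-z]o")

print(with_dot)    # TRUE TRUE TRUE
print(with_class)  # TRUE FALSE FALSE
```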
As for the regexes in the question: none of them uses the anchors ^ and $, which means the pattern can be found anywhere in the string. So they do not guarantee that the letter o is the second character, the antepenultimate one, or in any other specific position.
.o[^o] is any character, followed by o, followed by a character other than o. That means a string like 'zoo' does not match, because the o must be followed by a character that is not o. It also requires something after the o, which excludes two-letter words such as 'do'.
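A quick check of this behavior, as a sketch with hypothetical words. 'atom' is included to show that, without anchors, the o can sit in any position.

```r
library(stringr)

# Hypothetical words, chosen only for illustration
words <- c("zoo", "do", "tom", "atom")

# any character, then "o", then a character that is not "o"
matches <- str_detect(words, ".o[^o]")
print(setNames(matches, words))
```

'zoo' and 'do' should be rejected, while 'tom' and 'atom' match even though the o is the second character in one and the third in the other.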
.(?=o) is any character immediately followed by an o. In practice it matches any string with at least two characters in which some o has a character before it (but the o will not necessarily be the second character).
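The lookahead (?=o) only checks that an o comes next; it does not consume it. The sketch below, using hypothetical words, shows both which strings match and what the match actually contains.

```r
library(stringr)

# Hypothetical words, chosen only for illustration
words <- c("do", "atom", "o", "nada")

# TRUE when some character is immediately followed by "o"
detected <- str_detect(words, ".(?=o)")

# The match is only the character before the "o", not the "o" itself
extracted <- str_extract(words, ".(?=o)")

print(detected)   # TRUE TRUE FALSE FALSE
print(extracted)  # "d" "t" NA NA
```

A lone 'o' does not match because there is no character before it, and 'atom' matches even though its o is the third character.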
regex only with a lot of training. I know those operators but I couldn’t apply them. Thanks again. – neves
How would I use the regex(pattern = '.(?=o)') if "o" was actually a vector? – RxT