Python: my regex character class is not being negated

7

I'm learning regexes with the book Automate the Boring Stuff with Python. One of the sections of chapter 7 teaches character classes. So far, so good. I have created character classes for vowels (re.compile(r'[aeiouAEIOU]')), for letters and digits in ranges (re.compile(r'[a-zA-Z0-9]'))... All fine.

When I started learning about negative character classes, that is, defining a character class so that it matches, in a text, the characters that are NOT in the set I defined, I started running into trouble.

A negative character class is declared like this: re.compile(r'[ˆaeiouAEIOU]'), with the little hat at the front. But that is not negating the class: it is actually matching the vowels and the little hat, if there is a little hat in the sentence.

See:

# Trying (and succeeding) to detect only VOWELS...
>>> import re
>>> consonantRegex = re.compile(r'[aeiouAEIOU]')
>>> consonantRegex.findall('Robocop eats baby food. BABY FOOD')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

# Trying to detect only CONSONANTS... (note the little hat)
>>> consonantRegex = re.compile(r'[ˆaeiouAEIOU]')
>>> consonantRegex.findall('Robocop eats baby food. BABY FOOD')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'A', 'O', 'O']

# Putting a little hat in the sentence -> little hat detected
>>> consonantRegex.findall('Robocop eats baby food. ˆBABY FOOD.')
['o', 'o', 'o', 'e', 'a', 'a', 'o', 'o', 'ˆ', 'A', 'O', 'O']

Some details:

  1. I'm using Thonny's interactive shell, but I also tested in IDLE itself and got the same result.
  2. I use a Mac. We know the Mac has that annoying feature that gets in the way of programming: the handling of accents. When I type a quotation mark, for example, it waits for a vowel to see whether it can be accented (giving something like ä). So you have to type the quotation mark, type a consonant such as s, which leaves you with a normal "s, and then delete the s to type the vowel you wanted in the first place. (If anyone knows how to turn this off while still being able to type accents, I would appreciate that A LOT too.)

1 answer

8

Your second point answers your own question.

macOS actually works with modifier letters here, which are different characters from the one regex expects. The character your keyboard layout produced is:
(macOS) ˆ (U+02C6) MODIFIER LETTER CIRCUMFLEX ACCENT, whereas the character that negates a character class is:
(Other)  ^ (U+005E) CIRCUMFLEX ACCENT.
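
The two characters look almost identical on screen, so it can help to ask Python which one a string really contains. The following is a minimal sketch using the standard unicodedata module; the variable names are only illustrative and are not part of the original question.

```python
import re
import unicodedata

modifier_hat = "\u02c6"  # the character the macOS layout produced in the question
ascii_caret = "\u005e"   # the character regex treats as negation

# Print the code point and official Unicode name of each character.
for ch in (modifier_hat, ascii_caret):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Inside a character class, U+02C6 is just one more literal character to match.
literal_class = re.compile("[" + modifier_hat + "aeiou]")
matches = literal_class.findall("b" + modifier_hat + "a")
print(matches)
```

The class built with U+02C6 matches the hat itself and the vowel, which is the same behaviour shown in the question.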

To fix this problem on your keyboard, follow these instructions [see the edit below].
If you want to test the regex before making such changes, try copying and pasting the following code:

consonantRegex = re.compile(r'[^aeiouAEIOU]')
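
As a quick check, here is a sketch that builds the same pattern with the caret written as an escape (U+005E), so that no keyboard layout can substitute another character, and runs it on the sentence from the question. The name consonant_regex is only illustrative.

```python
import re

# Build the class with an explicit U+005E so the caret cannot be mistyped.
consonant_regex = re.compile("[" + "\u005e" + "aeiouAEIOU]")

result = consonant_regex.findall("Robocop eats baby food. BABY FOOD")
print(result)
```

Note that a negated class matches everything that is not a vowel, so spaces and the period show up in the result alongside the consonants.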


Edit (28/09/2017):

To help other Stack users searching for this question, here is a translation of the instructions mentioned above:

(en): Go to the keyboard settings (Keyboard Preferences) and add a new keyboard.

Instead of using the USA International Keyboard [1] you should use the USA Keyboard [2].

Instead of

Estadounidense internacional - PC

use


Estadounidense

  1. When you’re adding a new keyboard, select English (or English, if you want to continue with Mac’s native keyboard).
  2. At the end of the list you will see the two keyboards. Use the option that is not the international one.


[ ! ] Another option is to type the ˆ and then, before pressing any other key, press the space bar.
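
If changing the keyboard layout is not practical, a pattern that was already typed with the modifier letter can also be repaired in code before compiling it. This is only a sketch: normalize_caret is a hypothetical helper, not something from the re module, and it assumes every U+02C6 in the pattern was meant to be a plain caret.

```python
import re

def normalize_caret(pattern: str) -> str:
    """Replace U+02C6 (modifier letter circumflex) with U+005E (ASCII caret)."""
    # Assumption: no U+02C6 in the pattern is meant to be matched literally.
    return pattern.replace("\u02c6", "\u005e")

typed_pattern = "[\u02c6aeiouAEIOU]"  # what the macOS layout produced
fixed_regex = re.compile(normalize_caret(typed_pattern))

found = fixed_regex.findall("baby")
print(found)
```

Because the helper replaces every occurrence, it would not be suitable for patterns that need to match the ˆ character itself.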

  • 2

    I realized something like this when I saw the small circumflex.
