For the first case, \w does not work, because this shorthand also matches digits and the _ character, and since you want _ to be one of the separators, it cannot be part of a word.
Assuming that words cannot contain accented characters, one way to define a "word" is [a-zA-Z]+ (the quantifier + means "one or more occurrences").
That matches only a single word. Next we need the "separator + word" sequence, repeated at least 4 times (so that we have at least 5 words separated by the indicated characters).
For the separator, just use [- .,_]. Then put the same definition of "word" after it and make this sequence repeat at least 4 times, that is, ([- .,_][a-zA-Z]+){4,}.
Putting it all together, we get [a-zA-Z]+([- .,_][a-zA-Z]+){4,}.
Note that in this case I do not need to add the requirement of at least 8 characters: if there are at least 5 words (each with at least 1 character) plus the 4 separators, the string already has at least 9 characters.
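As a quick check of the first pattern, here is a minimal sketch comparing it with a \w-based variant. The helper name is_passphrase and the sample strings are made up for illustration, and I use re.fullmatch here instead of wrapping the pattern in ^ and $.

```python
import re

# Pattern from the text: one word, then "separator + word" at least 4 times.
WORDS = re.compile(r'[a-zA-Z]+([- .,_][a-zA-Z]+){4,}')
# Variant using \w, shown only to illustrate why it is too permissive.
LOOSE = re.compile(r'\w+([- .,_]\w+){4,}')

def is_passphrase(s):
    # Hypothetical helper: True if the whole string matches the word pattern.
    return WORDS.fullmatch(s) is not None

print(is_passphrase('a.b.c.d.e'))    # True: 5 words, 4 separators, 9 characters
print(is_passphrase('a.b.c'))        # False: only 3 words
print(is_passphrase('a1.b.c.d.e'))   # False: a digit is not part of a word
print(LOOSE.fullmatch('a1.b.c.d.e') is not None)  # True: \w also accepts digits
```

The shortest string that can match has 5 one-letter words and 4 separators, which is why no separate length check is needed.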
For the second case (checking the required characters), we use lookaheads, which check whether something exists ahead of the current position. For example, to check that there is at least one digit, we use (?=.*[0-9]). The trick is that a lookahead only checks whether something exists; afterwards the regex goes back to where it was and keeps evaluating the rest of the expression. So after checking that there is a digit, it returns and checks the rest of the expression (i.e. whether there is a letter, etc.).
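To see that a lookahead checks without consuming anything, here is a small sketch with a deliberately simplified pattern (not the final password regex): the lookahead requires a digit somewhere ahead, and the rest of the pattern still starts matching at position 0.

```python
import re

# The lookahead requires a digit somewhere ahead but consumes nothing,
# so [a-z]+ still starts matching at the beginning of the string.
pattern = re.compile(r'(?=.*[0-9])[a-z]+')

m = pattern.match('abc1')
print(m.group())            # abc
print(m.start(), m.end())   # 0 3

# No digit anywhere: the lookahead fails, so there is no match at all.
print(pattern.match('abcd'))  # None
```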
For each type of required character we use one lookahead, so the regex ends up as this monstrous thing:
^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%<^&*?])[a-zA-Z0-9!@#$%<^&*?]{8,}$
Each lookahead checks that one character type exists, and then the regex checks that the string has at least 8 of the allowed characters.
I also used the anchors ^ and $, which indicate respectively the beginning and end of the string, so I guarantee that the string contains only what the regex specifies (not one character more, not one less).
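To illustrate why the anchors matter, the sketch below runs the same expression with and without ^ and $ against a string that contains characters outside the allowed set. The variable names and the sample string are only for illustration.

```python
import re

BODY = r'(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%<^&*?])[a-zA-Z0-9!@#$%<^&*?]{8,}'
anchored = re.compile('^' + BODY + '$')
unanchored = re.compile(BODY)

candidate = 'Abc123@!& extra'  # contains spaces, which are not in the allowed set

# Without anchors, matching the prefix 'Abc123@!&' is enough.
print(bool(unanchored.match(candidate)))   # True
# With anchors, the whole string must consist of allowed characters.
print(bool(anchored.match(candidate)))     # False
print(bool(anchored.match('Abc123@!&')))   # True
```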
Since the password can match one pattern or the other, we use | to indicate alternation. The result is not exactly "pretty":
import re
# "strong password" pattern OR "at least 5 words" pattern, anchored as a whole
r = re.compile(r'^(((?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%<^&*?])[a-zA-Z0-9!@#$%<^&*?]{8,})|([a-zA-Z]+([- .,_][a-zA-Z]+){4,}))$')
for senha in ['a.b.c.d.e', 'A.b-c @1_xyz', 'a.b.c', 'Abc123@!&']:
    # senha = password, válida = valid, inválida = invalid
    print(f'{senha} = {"válida" if r.match(senha) else "inválida"}')
The output of the code above is:
a.b.c.d.e = válida
A.b-c @1_xyz = inválida
a.b.c = inválida
Abc123@!& = válida
Of course, you can also use two separate regexes and check whether the password matches either one:
import re

def valida(senha):
    # First regex: required character types; second regex: at least 5 words.
    return re.match(r'^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%<^&*?])[a-zA-Z0-9!@#$%<^&*?]{8,}$', senha) \
        or re.match(r'^[a-zA-Z]+([- .,_][a-zA-Z]+){4,}$', senha)

for senha in ['a.b.c.d.e', 'A.b-c @1_xyz', 'a.b.c', 'Abc123@!&']:
    print(f'{senha} = {"válida" if valida(senha) else "inválida"}')
If you prefer, you can also replace the lookahead conditions with separate regexes:
def valida(senha):
    return (re.search(r'[0-9]', senha) and            # at least one digit
            re.search(r'[a-z]', senha) and            # at least one lowercase letter
            re.search(r'[A-Z]', senha) and            # at least one uppercase letter
            re.search(r'[!@#$%<^&*?]', senha) and     # at least one special character
            re.match(r'^[a-zA-Z0-9!@#$%<^&*?]{8,}$', senha)) \
        or re.match(r'^[a-zA-Z]+([- .,_][a-zA-Z]+){4,}$', senha)
The first regex checks whether there is a digit, the second checks whether there is a lowercase letter, and so on (I used search to check at any position of the string, because match only looks at the beginning). The fifth checks whether the string consists of at least 8 of the allowed characters (here it does not matter whether you use match or search, since the ^ forces the regex to start at the beginning of the string).
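The difference between match and search can be seen in a small sketch (the sample string and the simplified patterns are only for illustration):

```python
import re

s = 'abc1'

# match only tries at position 0, where there is a letter, not a digit.
print(re.match(r'[0-9]', s))            # None
# search scans the string and finds the digit at the end.
print(re.search(r'[0-9]', s).group())   # 1

# With ^ in the pattern, both functions give the same result
# (no re.MULTILINE flag is used here).
p = r'^[a-z0-9]{4,}$'
print(bool(re.match(p, s)), bool(re.search(p, s)))  # True True
```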
"at least characters" - I think the amount was missing, right? : -) Anyway, what is a "word"? If it is
@-123,abc $@,xy
, is it a valid password? Because the "words" would be "@", "123", "abc", "$@" and "xy" (and they are separated by the indicated characters). Is that it? So your example password (test%#) is invalid, right? – hkotsubo
@hkotsubo : just edited -> at least 8 characters
– Paul Sigonoso
@hkotsubo For the first case, an example of a valid password: a-b-c-.d.e
– Paul Sigonoso
a-b-c-.d.e does not seem valid to me, because you said that it must also have at least one capital letter, one digit, and punctuation... – hkotsubo
@hkotsubo these are 2 different types of passwords, i.e. 2 different regexes!
– Paul Sigonoso
But in the first case, do "words" only have letters? Or can they have other things?
– hkotsubo
"5 words, each separated by a hyphen..." and "must have at least 8 characters" are mutually exclusive conditions. The minimum in this case would be 9 characters: 5 glyphs and 4 separators.
– Augusto Vasques
@hkotsubo "word" = 1 or more letters
– Paul Sigonoso