From what I understand, you want to remove the invalid characters as they are typed. For example, if only digits are allowed after the "W" and the user types an "x", the string becomes "Wx"; in this case the "x" must be removed immediately after being typed, and the value of the input goes back to just "W".
Well, doing this with regex will be quite complicated as you have to evaluate all the possibilities for the replacement to be done properly:
- when the first character is typed, you have to check if it is "W"
- when the second character is typed (assuming the first one has already been checked to be "W"), you have to check if it is a digit
- when the third character is typed (assuming the first 2 have already been checked), you have to check if it is another digit
- and so on...
Just so the regex doesn't get too confusing, let's assume the criterion is "W followed by 3 digits, followed by a letter". A possible solution would be:
function formatarRG(input, teclapres) {
  let numero = input.value.trim().toUpperCase();
  let r = /^W(\d(\d(\d[A-Z]?)?)?)?$/;
  // while the value is not in the correct format, keep removing characters from the end
  while (!numero.match(r)) {
    numero = numero.slice(0, -1);
    // if everything has already been removed, leave the loop
    if (numero.length == 0) break;
  }
  input.value = numero;
}
<html>
<label>Número RG:</label>
<input type="text" value="" maxlength="14" size="20" onKeyUp="formatarRG(this, event);" />
</html>
The regex is ^W(\d(\d(\d[A-Z]?)?)?)?$. It uses the markers ^ and $, which indicate the beginning and end of the string, so it is guaranteed that the string can only contain what is specified in the expression. Then we have the W, and then the tricky part, which checks all the possibilities.
Basically, starting from the inside out:
- \d[A-Z]?: a digit optionally followed by a letter (the ? indicates that something is optional)
- \d(\d[A-Z]?)?: a digit optionally followed by "a digit optionally followed by a letter"
- \d(\d(\d[A-Z]?)?)?: a digit optionally followed by the expression above
- finally, the above expression is also optional
Thus, the regex can check whether the string has only the letter "W", or "W" followed by just one digit, or 2 digits, or 3 digits, or 3 digits plus one letter.
Someone might suggest using ^W\d{0,3}[A-Z]?$ (the letter "W", followed by 0 to 3 digits, followed by an optional letter), but this regex does not work, as it also accepts cases such as "W1X" and "W12X" (with 1 or 2 digits before the last letter). Only the regex above ensures that there are 1, 2 or 3 digits, and that the last letter only occurs after the third digit. See the difference here and here.
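To make the difference concrete, here is a small sketch that runs both expressions against a few sample values. The names nested, flat, samples and comparison are only illustrative; they are not part of the original code.

```javascript
// Nested version: the letter is only allowed after the third digit
const nested = /^W(\d(\d(\d[A-Z]?)?)?)?$/;
// Flat version: the letter is allowed after 0, 1, 2 or 3 digits
const flat = /^W\d{0,3}[A-Z]?$/;

const samples = ['W', 'W1', 'W12', 'W123', 'W123A', 'W1X', 'W12X', 'WX'];

// One row per sample, with the result of each regex side by side
const comparison = samples.map((s) => ({
  value: s,
  nested: nested.test(s),
  flat: flat.test(s),
}));

console.table(comparison);
```

The two expressions agree on every sample except "W1X", "W12X" and "WX", which only the flat version accepts.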
Anyway, if something is typed that does not fit, characters are removed from the end until a valid string is reached.
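The trimming behaviour can be tried outside the browser with a sketch like the one below. trimToValidPrefix is a hypothetical helper that repeats the loop from formatarRG without touching the DOM, and prefixRegex is just a local name for the same expression.

```javascript
// Hypothetical helper: same loop as formatarRG, but working on a plain string
function trimToValidPrefix(value, regex) {
  let text = value.trim().toUpperCase();
  // Drop the last character until the text matches or nothing is left
  while (text.length > 0 && !regex.test(text)) {
    text = text.slice(0, -1);
  }
  return text;
}

const prefixRegex = /^W(\d(\d(\d[A-Z]?)?)?)?$/;

console.log(trimToValidPrefix('Wx', prefixRegex));     // "W"
console.log(trimToValidPrefix('w12', prefixRegex));    // "W12"
console.log(trimToValidPrefix('W123AB', prefixRegex)); // "W123A"
console.log(trimToValidPrefix('x', prefixRegex));      // "" (empty string)
```

Note that only the end of the string is trimmed: if an invalid character sits in the middle of a pasted value, everything after it is discarded as well.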
Now try to imagine what the expressions would look like for your criteria. With just 3 digits and a letter the regex was already, in my opinion, confusing and difficult to maintain. For the RG it would be something like ^W(\d(\d(\d(\d(\d(\d(\d[A-Z\d]?)?)?)?)?)?)?)?$ (and I haven't even checked whether it's right). For RNE, it would be something like ^R(N(E([A-Z\d](\d(\d(\d(\d(\d(\d[A-Z\d]?)?)?)?)?)?)?)?)?)?$. Both are difficult to understand and can become maintenance nightmares.
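If you do go down this path, one way to reduce the maintenance burden is to generate the nested expression instead of writing it by hand. The sketch below is only an illustration: buildPrefixRegex, rgPrefix and rnePrefix are hypothetical names, and the generated expressions wrap the last element in its own group, which is equivalent to (but not character-for-character the same as) the hand-written versions above.

```javascript
// Hypothetical helper: builds a "valid prefix" regex from a list of
// per-position patterns, producing the nesting a(b(c)?)? automatically
function buildPrefixRegex(parts) {
  // Start from the last position and wrap outwards
  const body = parts.reduceRight(
    (inner, part) => (inner === '' ? part : `${part}(${inner})?`),
    ''
  );
  return new RegExp(`^${body}$`);
}

// RG: "W", 7 digits, then a letter or digit
const rgPrefix = buildPrefixRegex(['W', ...Array(7).fill('\\d'), '[A-Z\\d]']);

// RNE: "RNE", a letter or digit, 6 digits, then a letter or digit
const rnePrefix = buildPrefixRegex([
  'R', 'N', 'E', '[A-Z\\d]', ...Array(6).fill('\\d'), '[A-Z\\d]',
]);

console.log(rgPrefix.test('W0842'));      // true: valid so far
console.log(rgPrefix.test('W0842241A'));  // true: complete value
console.log(rgPrefix.test('W08A'));       // false: letter too early
console.log(rnePrefix.test('RN'));        // true: valid so far
console.log(rnePrefix.test('RNX'));       // false: third character must be "E"
```

The list of parts documents the format position by position, so changing the number of digits means changing one argument instead of recounting parentheses.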
Use regex, but in a different way
I still find it easier to indicate in your interface what the correct format is and, if the user types something wrong, show an informative message:
const campo = document.querySelector('#rg');
campo.addEventListener('input', () => {
campo.setCustomValidity('');
campo.checkValidity();
});
campo.addEventListener('invalid', () => {
campo.setCustomValidity('The format of the field is blablablaetc (describe the correct format in this message)');
});
/* keep a red border while the field is invalid */
input:invalid {
border: red 1px solid;
}
<form>
<label>Número RG:</label>
<input id="rg" type="text" value="" required
pattern="^(W\d{7}[A-Z\d]|RNE[A-Z\d]\d{6}[A-Z\d])$" size="20" />
<input type="submit" value="ok">
</form>
The attribute pattern has a regex indicating the format the field should have: it can be an RG (starting with "W") or an RNE (the character | indicates alternation: one option or the other). If the entered value is wrong (i.e., it does not match the regex), the CSS rule input:invalid is applied (and then you can style the field the way you think best, for example, to indicate that the format is wrong).
This solution does not prevent the user from entering something invalid, nor does it automatically correct the value. But when the user tries to submit the form, the corresponding message, defined by setCustomValidity, is shown.
Anyway, once you know whether what was typed is right or not, you can choose your own way to inform the user what is wrong and how to fix it.
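As a minimal sketch of that last point, the same expression used in the pattern attribute can be tested directly in JavaScript, which lets you build whatever feedback you prefer. describeValue and finalFormat are hypothetical names, and the message text is just a placeholder.

```javascript
// Same expression used in the pattern attribute of the form
const finalFormat = /^(W\d{7}[A-Z\d]|RNE[A-Z\d]\d{6}[A-Z\d])$/;

// Hypothetical helper: returns 'ok' or a message describing the expected format
function describeValue(value) {
  if (finalFormat.test(value)) return 'ok';
  return 'Expected W + 7 digits + letter/digit, or RNE + letter/digit + 6 digits + letter/digit';
}

console.log(describeValue('W0842241A'));   // "ok"
console.log(describeValue('RNEA123456Z')); // "ok"
console.log(describeValue('W084'));        // prints the format message
```

Unlike the prefix expressions shown earlier, this one only accepts complete values, so it suits a check on submit or on blur better than a check on every keystroke.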
A tip: {1} is redundant and can be removed. \d{1} is the same as \d (in general, (anything){1} is the same as (anything)). In addition, \w already includes digits, so if you have \w you don't need \d. However, \w also matches the character _, so you should actually use [A-Z\d] (a letter from A to Z or a digit from 0 to 9). Another detail is that the replace does not make much sense, because you replace everything you found with the same thing, and in the end the string will be the same as it was before
– hkotsubo
Thank you very much for the guidelines, I'll make the adjustments. The idea is to get back only the characters that are really valid, already discarding whatever is not part of the rule; that is why I used 'replace'. Would you have something to suggest?
– Dennys
But discarding whatever does not belong may be too broad. If "Wxyz.2#@0842241d-)(A" is typed, do you want to delete everything that does not belong and keep only "W0842241A"? It might be easier to say which format is valid and ask the user to type again if the value is not valid, instead of trying to correct what was typed, since there are too many possibilities to handle
– hkotsubo
hkotsubo, not that much!!! You may have noticed that my HTML has onKeyUp="return formatarRG(this, event);", so in the example "W0842241A" I know that the first character typed has to be a W, the second to the eighth have to be digits, and the last (ninth) can be A-Z or 0-9. I didn't want to keep building substrings and taking the char code from the event to discard what is not expected. I thought there would be something cleaner using regex.
– Dennys
You could be clearer in your question, because the body of the question says something like "I need help with 2 regular expressions in Javascript ..." and your code works, but down there in @Felipealmeida's answer you say your question is different. I will flag it as a non-reproducible problem, because to me a question that presents no error, and whose scope changes when a user tries to answer it, makes no sense.
– Augusto Vasques
Augusto, my code doesn't work as it should, that's why I asked for help! You haven't read through the 2 rules I need to implement. I figured someone with extensive knowledge of regex would be able to tell me whether there is a regular expression that removes the characters that don't fit rules 1 and 2 and keeps only the characters that follow the rule, or at least give me proper guidance on using regex for this purpose, because a solution with substrings, ifs and charCode I already have!
– Dennys