Why do character sets with the range A-z match symbols in a regex?


5

Scenario:

const texto = 'ABC [abc] a-c 1234';

console.log(texto.match(/[A-z]/g))

  • Why does the set from A (uppercase) to z (lowercase), that is, /[A-z]/g, also return [ and ]?
  • Shouldn't the result be [A, B, C, a, b, c, a, c]?

2 answers

9


[A-z] matches every ASCII character in the sequence from A (code 65) to z (code 122). If you look at an ASCII table, you will see that there are several other characters between Z and a, including the square brackets [ and ] that you don't want.

Since you want to match only uppercase and lowercase letters, from A to Z and from a to z, the correct approach is to use [a-zA-Z].

const texto = 'ABC [abc] a-c 1234';
console.log(texto.match(/[a-zA-Z]/g))
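
To make the difference concrete, here is a minimal sketch that runs both patterns against the same string and isolates what the wider range picks up. The variable names (wide, lettersOnly, extras) are only illustrative and do not come from the question.

```javascript
const texto = 'ABC [abc] a-c 1234';

// Single range from "A" (65) to "z" (122): also covers the symbols in between.
const wide = texto.match(/[A-z]/g);

// Two separate ranges: letters only.
const lettersOnly = texto.match(/[a-zA-Z]/g);

// Characters matched by the wide range that are not letters.
const extras = wide.filter((ch) => !/[a-zA-Z]/.test(ch));

console.log(wide.join(' '));        // A B C [ a b c ] a c
console.log(lettersOnly.join(' ')); // A B C a b c a c
console.log(extras.join(' '));      // [ ]
```

The hyphen in "a-c" is not matched by either pattern, since its code (45) lies outside both ranges.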

5

Ranges (or intervals) in a character set follow the order of the Unicode table.

I defined my set as running from A (uppercase) to z (lowercase), but there are symbols in the middle of that range. They are: [ \ ] ^ _ `

Look at the Unicode table: Z is code point 90 and a is code point 97, so the six symbols above occupy code points 91 to 96.
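
The symbols between Z and a can also be listed programmatically instead of being read off a table. The following is a small sketch using charCodeAt and String.fromCharCode; the names start, end and between are hypothetical and chosen just for this illustration.

```javascript
// Codes strictly between "Z" and "a".
const start = 'Z'.charCodeAt(0) + 1; // 91
const end = 'a'.charCodeAt(0) - 1;   // 96

const between = [];
for (let code = start; code <= end; code++) {
  between.push(String.fromCharCode(code));
}

console.log(between.join(' ')); // [ \ ] ^ _ `

// Every one of them is accepted by /[A-z]/ ...
console.log(between.every((ch) => /[A-z]/.test(ch)));   // true
// ... and none of them by /[a-zA-Z]/.
console.log(between.some((ch) => /[a-zA-Z]/.test(ch))); // false
```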

Note:

The first 128 characters of the Unicode table (code points 0 to 127) are the same as in the ASCII table.
