Regular expression for 2 specific cases

1

I need two regular expressions to standardize a field. These are the two kinds of value I need a regular expression for:

  1. 1-167106651950

    A digit from 0 to 9, a mandatory hyphen, and 12 more digits from 0 to 9.

  2. SP4:01:342310

    Two capital letters, which can be SP or RJ, a digit from 1 to 9, a mandatory colon, two digits from 0 to 9, another mandatory colon, and six digits from 0 to 9.

I created the following regexp for the first case, but it does not limit the number of digits I type.

([0-9\-]+)\-[0-9]+

For the second case I could not come up with anything.

3 answers

2


First let’s see why your regex doesn’t work.

Square brackets ([]) define a character class: they indicate that you want any one of the characters inside them.

For example, [abc] means "the letter a or the letter b or the letter c" (exactly one of them; any of the three will do). It is an expression that matches a single character.

In your case the class contains 0-9, meaning "the digits from 0 to 9". Therefore, [0-9] accepts any digit from 0 to 9. Note that the hyphen (-) has a special meaning inside brackets, as it defines a range between two characters. Since you also added another hyphen, preceded by a backslash (\-), that one is interpreted as the hyphen character itself.

Therefore, [0-9\-] means "one digit from 0 to 9 or a hyphen". Either kind of character matches, as we can see in the example below:

console.log(/[0-9\-]/.test('1')); // true
console.log(/[0-9\-]/.test('-')); // true

The quantifier +, in turn, means "one or more occurrences" of whatever comes immediately before it. So [0-9\-]+ means "one or more occurrences of a digit or a hyphen". That is, a string of several digits matches, a string of several hyphens also matches, and so does any combination of these characters.

console.log(/[0-9\-]+/.test('123112')); // true
console.log(/[0-9\-]+/.test('-----')); // true
console.log(/[0-9\-]+/.test('-1-3-432---111-')); // true
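
Applying the same reasoning to the full regex from the question shows why it does not limit the number of digits. A small sketch (the variable name original is only for illustration):

```javascript
// The regex from the question: both parts use "+", so neither length is fixed.
const original = /([0-9\-]+)\-[0-9]+/;

console.log(original.test('1-167106651950')); // true  (the intended format)
console.log(original.test('12345-1'));        // true  (wrong lengths on both sides)
console.log(original.test('--1-9'));          // true  (extra hyphens are accepted)
console.log(original.test('167106651950'));   // false (no hyphen at all)
```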

To match a single digit from 0 to 9, just use [0-9] or \d (at the end of this answer there is a note on [0-9] versus \d, since they are not always the same thing). As for the hyphen, you don’t need the backslash when it is outside the brackets; a plain - is enough.

To require a specific number of occurrences, put that number between curly braces. In this case you want 12 digits, so use [0-9]{12} (exactly 12 occurrences of any digit from 0 to 9).

I also recommend using the markers ^ and $, which mean, respectively, the beginning and the end of the string. This ensures that the string contains only what is specified in the regex. So the regex for case 1 looks like this:
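
As a quick check of the {12} quantifier on its own, here is a minimal sketch. It wraps the pattern in ^ and $ (start and end of the string) so that the whole string must have exactly that length; the name twelveDigits is just illustrative:

```javascript
// {12} means exactly 12 repetitions of the preceding item ([0-9] here).
// ^ and $ force the pattern to cover the entire string.
const twelveDigits = /^[0-9]{12}$/;

console.log(twelveDigits.test('167106651950'));  // true  (12 digits)
console.log(twelveDigits.test('16710665195'));   // false (11 digits)
console.log(twelveDigits.test('1671066519501')); // false (13 digits)
```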

console.log(/^[0-9]-[0-9]{12}$/.test('1-167106651950')); // true
console.log(/^[0-9]-[0-9]{12}$/.test('123-167106651950')); // false
console.log(/^[0-9]-[0-9]{12}$/.test('1-A67106651950')); // false

If you don’t use the markers ^ and $, you can end up with false positives:

console.log(/[0-9]-[0-9]{12}/.test('abc1-167106651950def')); // true

Note that the string starts with abc and ends with def. Even so, the regex returns true, because the string contains a substring that matches the expression. Using ^ and $ guarantees that the regex only accepts exactly what you specified.
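
To see which part of the string produced that false positive, String.prototype.match can be used to inspect the match. A small sketch (the variable names are illustrative):

```javascript
const unanchored = /[0-9]-[0-9]{12}/;
const anchored = /^[0-9]-[0-9]{12}$/;
const input = 'abc1-167106651950def';

// Without anchors, the regex finds the matching substring in the middle.
const found = input.match(unanchored);
console.log(found[0]);    // '1-167106651950'
console.log(found.index); // 3 (position right after 'abc')

// With anchors, the whole string has to match, so the test fails.
console.log(anchored.test(input)); // false
```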


For case 2, we can use the alternation operator | for the options "SP" or "RJ": (SP|RJ).

For the digit from 1 to 9, we use brackets: [1-9]. The colons are written directly as :, and for the other digits we put the quantity between curly braces, the same way as in case 1. Also be sure to use the markers ^ and $:

console.log(/^(SP|RJ)[1-9]:[0-9]{2}:[0-9]{6}$/.test('SP4:01:342310')); // true
console.log(/^(SP|RJ)[1-9]:[0-9]{2}:[0-9]{6}$/.test('RJ4:01:342310')); // true
console.log(/^(SP|RJ)[1-9]:[0-9]{2}:[0-9]{6}$/.test('AB4:01:342310')); // false
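
A few more edge cases can help confirm that each part of the expression enforces its rule. This is only a sketch; the name case2 and the extra test strings are made up for illustration:

```javascript
const case2 = /^(SP|RJ)[1-9]:[0-9]{2}:[0-9]{6}$/;

console.log(case2.test('SP0:01:342310'));  // false: [1-9] does not accept zero
console.log(case2.test('sp4:01:342310'));  // false: lowercase letters are not accepted
console.log(case2.test('SP4:1:342310'));   // false: only one digit between the colons
console.log(case2.test('SP4:01:3423100')); // false: seven digits at the end
```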


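If you want a single regex that validates both cases, just join the previous expressions with |. The whole alternation must be wrapped in a group, so that ^ and $ apply to both options and not just to one of them each. In the example below I also store the regex (a JavaScript RegExp object) in a variable, to avoid repeating it:

const re = /^(?:[0-9]-[0-9]{12}|(?:SP|RJ)[1-9]:[0-9]{2}:[0-9]{6})$/;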
console.log(re.test('SP4:01:342310')); // true
console.log(re.test('RJ4:01:342310')); // true
console.log(re.test('1-167106651950')); // true

console.log(re.test('123-167106651950')); // false
console.log(re.test('AB4:01:342310')); // false
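
One thing to watch when combining anchors with |: alternation has the lowest precedence, so ^a|b$ is read as (^a)|(b$). The sketch below compares an ungrouped and a grouped form of the combined regex (the names ungrouped and grouped, and the extra test strings, are illustrative):

```javascript
// Ungrouped: ^ binds only to the first option, $ only to the second.
const ungrouped = /^([0-9]-[0-9]{12})|((SP|RJ)[1-9]:[0-9]{2}:[0-9]{6})$/;
// Grouped: the non-capturing group (?:...) makes ^ and $ apply to both options.
const grouped = /^(?:[0-9]-[0-9]{12}|(?:SP|RJ)[1-9]:[0-9]{2}:[0-9]{6})$/;

console.log(ungrouped.test('1-167106651950xyz')); // true  (trailing junk accepted)
console.log(grouped.test('1-167106651950xyz'));   // false
console.log(ungrouped.test('abcSP4:01:342310'));  // true  (leading junk accepted)
console.log(grouped.test('abcSP4:01:342310'));    // false
console.log(grouped.test('SP4:01:342310'));       // true
console.log(grouped.test('1-167106651950'));      // true
```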


[0-9] versus \d

Usually the two are equivalent. The only detail is that, depending on the language, engine, or configuration, \d may also match other characters that represent digits, such as ٠١٢٣٤٥٦٧٨٩ (see this answer for more details).

In JavaScript, \d matches only the digits 0 to 9, so you can use either one:

const re = /^(?:\d-\d{12}|(?:SP|RJ)[1-9]:\d{2}:\d{6})$/;
console.log(re.test('SP4:01:342310')); // true
console.log(re.test('RJ4:01:342310')); // true
console.log(re.test('1-167106651950')); // true

console.log(re.test('123-167106651950')); // false
console.log(re.test('AB4:01:342310')); // false

Notice that I did not replace [1-9] with \d, since \d matches every digit from 0 to 9, while [1-9] does not include zero.
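
The JavaScript behavior of \d can be checked directly. The sketch below uses one of the Arabic-Indic digits mentioned above; the \p{Nd} Unicode property escape (which requires the u flag) is included only for contrast, as a way to match digits from any script:

```javascript
const arabicIndicThree = '\u0663'; // the character ٣

console.log(/\d/.test('3'));                   // true
console.log(/\d/.test(arabicIndicThree));      // false: \d is limited to 0-9
console.log(/\d/u.test(arabicIndicThree));     // false: the u flag does not change \d
console.log(/\p{Nd}/u.test(arabicIndicThree)); // true: any Unicode decimal digit
```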

2

/^\d-\d{12}$/

and

/^[A-Z]{2}[1-9]:\d{2}:\d{6}$/

If only SP and RJ are valid, use

/^(SP|RJ)[1-9]:\d{2}:\d{6}$/

Do not use [SP|RJ]{2}, because it accepts SP and RJ but also PS, JR, and other combinations of the characters inside the brackets.
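
A short sketch of that difference, testing only the two-letter prefix (the names withClass and withGroup are illustrative). Inside brackets, | is just an ordinary character:

```javascript
const withClass = /^[SP|RJ]{2}$/; // any two of the characters S, P, |, R, J
const withGroup = /^(SP|RJ)$/;    // exactly the string SP or the string RJ

for (const s of ['SP', 'RJ', 'PS', 'JR', 'S|', '||']) {
  console.log(s, withClass.test(s), withGroup.test(s));
}
// SP true true
// RJ true true
// PS true false
// JR true false
// S| true false
// || true false
```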

  • Hello. Could you elaborate further on your answer?

  • 1

    Thanks. I didn’t know there was a difference between (SP|RJ) and [SP|RJ]. Thank you

0

Just improving on the previous answer: in the second expression you said the letters will be SP or RJ, but the regex /^[A-Z]{2}[1-9]:\d{2}:\d{6}$/ would accept any pair of capital letters. To limit it to SP or RJ, use:

/^[SPRJ]{2}[1-9]:\d{2}:\d{6}$/
  • This solution better fits what I was asking for. Thank you

  • The problem is that [SP|RJ]{2} accepts other strings, such as PR, J| and even ||, see here: https://regex101.com/r/9I9Md6/1

  • I edited the answer; now it no longer accepts || or J|, for example, only combinations of the four letters S, P, R and J

  • Gabriel, [SPRJ]{2} still accepts strings such as SS, PP or JP. The brackets indicate that any character inside them matches: [SPRJ] means "the letter S, or the letter P, or the letter R, or the letter J". And the {2} indicates that you want 2 occurrences of these letters - in any combination, see here. The problem is the brackets; you must remove them and use (SP|RJ) - see the 2 answers above that explain this. Inside the brackets the order doesn’t matter: [SPRJ] is the same as [SRJP], as both accept the same characters.

  • I had forgotten the difference between [] and (); thanks for the reply.
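
To make the point from the comments concrete, the sketch below enumerates every two-letter string that can be built from S, P, R and J and counts how many each form accepts (all names here are illustrative):

```javascript
const letters = ['S', 'P', 'R', 'J'];
// All 16 two-letter combinations: SS, SP, SR, SJ, PS, ...
const pairs = letters.flatMap(a => letters.map(b => a + b));

const byClass = pairs.filter(p => /^[SPRJ]{2}$/.test(p));
const byGroup = pairs.filter(p => /^(SP|RJ)$/.test(p));

console.log(byClass.length); // 16: the character class accepts every combination
console.log(byGroup);        // [ 'SP', 'RJ' ]: the group accepts only these two
```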
