Regular expression: how to apply "Negative Lookahead" to only one capture group?

In a domain validation task, I need a regular expression that validates domains according to a set of rules, one of which is that the domain name must not consist only of digits.

So I wrote a regular expression (regex) that validates the domain, but I could not make it fail when the domain name contains only digits.

In short, the expression accepts domains whose name is made up only of digits, and those should be rejected as invalid.

Regular expression:

^(^[a-z][a-z0-9]{0,30}\.)?([a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?)(\.[a-z]{2,4}(?:\.[a-z]{2})?)$

In the expression above, I’m capturing three groups:

  • Group 1: subdomain;
  • Group 2: domain name;
  • Group 3: TLD.
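
To make the problem reproducible outside Regexr, here is a minimal Python sketch. It assumes Python's `re` module treats this pattern the same way as the engine used in Regexr; the helper name `split_domain` is hypothetical and only serves the illustration.

```python
import re

# The regex from the question, unchanged (including the redundant inner ^).
ORIGINAL = re.compile(
    r"^(^[a-z][a-z0-9]{0,30}\.)?"             # group 1: optional subdomain
    r"([a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?)"  # group 2: domain name
    r"(\.[a-z]{2,4}(?:\.[a-z]{2})?)$"          # group 3: TLD
)

def split_domain(host):
    """Return (subdomain, name, tld), or None if the host does not match."""
    m = ORIGINAL.match(host)
    return m.groups() if m else None

print(split_domain("www.example.com"))  # ('www.', 'example', '.com')
print(split_domain("12345.com"))        # (None, '12345', '.com')  <- should be rejected
```

The second call shows the unwanted behavior: group 2 captures a name made up only of digits.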

I’m trying to apply a negative lookahead only to group 2 (which captures the domain name), but the example I found applies it to the whole string, and I have not managed to adapt it.

I made an example in Regexr to better demonstrate what I’m trying to do.

1 answer

Just put the lookahead inside group 2 and modify it slightly: the example you found uses the anchors ^ and $, which mark the start and end of the string, so it checks the entire string.

Without these anchors, the lookahead checks from its current position, so group 2 would look like this:

((?!\d+\.)[a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?)
 ^^^^^^^^^ just add this

Inside the lookahead we have \d+ (one or more digits) followed by a dot (\.). That is, the negative lookahead checks that, from that position, there is not a sequence consisting only of digits followed by a dot.

Since a lookahead always checks from the current position, placing it right at the beginning of group 2 is enough, because that is where the check is made.
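
As a rough illustration of "checks from the current position", the sketch below tests group 2 on its own in Python. It assumes Python's `re` module as a stand-in for the engine used in the question; `GROUP2` is a hypothetical name, and the optional `pos` argument of `Pattern.match` is used to start the attempt at a given index.

```python
import re

# Group 2 in isolation, with the negative lookahead at its start.
GROUP2 = re.compile(r"(?!\d+\.)[a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?")

# Attempt at index 0: "12345." is digits followed by a dot, so the lookahead rejects it.
print(GROUP2.match("12345.com"))            # None

# Digits followed by a letter are fine: \d+ is never followed by a dot here.
print(GROUP2.match("123abc.com").group())   # 123abc

# Attempt at index 4 (right after "www."): the lookahead only looks ahead from there.
print(GROUP2.match("www.12345.com", 4))     # None

# Attempt at index 0 of the same string: the text ahead starts with "w", so it passes.
print(GROUP2.match("www.12345.com").group())  # www
```

The same string gives different results at index 0 and index 4, because the lookahead only inspects what follows the position where it is evaluated.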

Another detail: in group 1 you do not need to repeat the ^ anchor, since it already appears at the beginning of the regex.

The whole regex looks like this:

^([a-z][a-z0-9]{0,30}\.)?((?!\d+\.)[a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?)(\.[a-z]{2,4}(?:\.[a-z]{2})?)$
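
For a quick check outside the browser, the regex above can be exercised with a short Python sketch. This assumes Python's `re` module handles the pattern like the engine used in the question; `is_valid_domain` and the sample hostnames are hypothetical and chosen only for illustration.

```python
import re

# The full regex from the answer, split across lines for readability.
FIXED = re.compile(
    r"^([a-z][a-z0-9]{0,30}\.)?"                        # group 1: optional subdomain
    r"((?!\d+\.)[a-z0-9](?:[a-z0-9-]{0,24}[a-z0-9])?)"  # group 2: name, not digits only
    r"(\.[a-z]{2,4}(?:\.[a-z]{2})?)$"                   # group 3: TLD
)

def is_valid_domain(host):
    """Return True if the host matches the corrected regex."""
    return FIXED.match(host) is not None

for host in ["example.com", "www.example.com", "123abc.com",
             "12345.com", "www.12345.com"]:
    print(host, is_valid_domain(host))
# example.com True
# www.example.com True
# 123abc.com True
# 12345.com False
# www.12345.com False
```

Names that merely contain digits are still accepted; only names made up entirely of digits are rejected, with or without a subdomain in front.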

See it working here.

  • Thank you very much! It worked perfectly! I think I now understand how the negative lookahead works! Thank you very much indeed!
