Continue searching in the same Regex

Question

Asked 5 years, 5 months ago

I have the string below:

Tel.: 324234 --  2 123123 (22)

I want to extract only the digits after "Tel.:", i.e. 324234212312322, using a single regex.

I can get the first number after "Tel.:", but not all the ones that follow it.

My attempt is already on this site, where you can test it:

https://regexr.com/4sspo

As I said, I would like to do everything in the same regex.

2 answers

by hkotsubo • **55,826** points · Answer 1 · 2020-01-27T11:58:51+00:00

A regex only looks for patterns in a text; the expression itself does not replace anything. If you want to manipulate the result, one alternative is to first match the numbers you want (along with the other characters: dashes, spaces, etc.) and then delete the characters that are not digits:

let texto = `Home - CEP 0334211 - Jardim Belém, Suzano, São Paulo
Recipient: Julio  Oliveira - Tel.: 145628 810 - 7469

I want to extract 145628 + 810 + 7469
That is: "1456288107469"

In just 1 regex
I can do this with extra steps, but I would like to learn how to use just 1 regex`;

// find the numbers after "Tel.:"
let match = texto.match(/Tel\.:((?:[^\d\n\r]*\d+)+)/);
if (match) {
    // remove the characters that are not digits
    console.log(match[1].replace(/\D+/g, '')); // 1456288107469
}

Note that I did not put "Tel.:" inside a lookbehind; I find that unnecessary. Instead, I put the part I want to capture in parentheses, which forms a capturing group. Since it is the first pair of capturing parentheses, it is group 1, which I can read later from index 1 of the match array (inside the if, when I use match[1]).

Then I use a negated character class: [^\d\n\r]. It matches any character that is not a digit (\d) or a line break (\n and \r). I did this because I understood that, after "Tel.:", all the numbers are on the same line and you want to capture them all up to the end of that line. I also added the quantifier * (zero or more occurrences), indicating that there may be any number of these characters, or none, before a number.
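To see the effect of excluding \n and \r from the class, here is a small sketch. The input string is made up for illustration (it is not from the question); it has digits on a second line, which should not be captured:

```javascript
// Same pattern as in the answer above.
const telPattern = /Tel\.:((?:[^\d\n\r]*\d+)+)/;

// Hypothetical input: the second line also contains digits.
const sample = 'Tel.: 12 - 34\n56 (another line)';

// Group 1 stops at the last digit before the line break,
// because [^\d\n\r] refuses to consume "\n".
const captured = sample.match(telPattern)[1];
console.log(JSON.stringify(captured)); // " 12 - 34"
```

If \n and \r were not excluded, the class would also consume the line break and the "56" from the next line would end up in the group.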

Then we have \d+ (one or more digits). You had used \d*, but that means "zero or more digits" (that is, it also matches when there is no digit at all). Using + ensures that there is at least one digit.
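A quick way to check the difference between the two quantifiers, using a made-up string with no digits in it:

```javascript
// Hypothetical input with no digits at all.
const noDigits = 'abc';

// \d* is satisfied by zero digits, so it "matches" an empty string at index 0.
const starMatch = noDigits.match(/\d*/);
console.log(JSON.stringify(starMatch[0])); // ""

// \d+ needs at least one digit, so there is no match.
const plusMatch = noDigits.match(/\d+/);
console.log(plusMatch); // null
```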

This whole sequence (non-digits followed by digits) is itself followed by a +, indicating that it can repeat several times (and the + ensures that it occurs at least once).

That is, everything after "Tel.:" (as long as it is non-digits followed by digits, repeated as many times as possible without crossing a line break) ends up in group 1. Then just take the value of group 1 (match[1]) and delete everything that is not a digit (the \D+ in the replace; note the g flag, so that all occurrences are replaced). What is left is only the digits.
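The two steps described above can be wrapped in a small helper and tried on the exact string from the question. This is only a sketch: the function name digitsAfterTel is my own, and returning null when "Tel.:" is absent is an assumption about what you would want in that case.

```javascript
// Hypothetical helper: returns the digits that follow "Tel.:" on the same line,
// or null when the text has no "Tel.:" followed by at least one digit.
function digitsAfterTel(text) {
  const m = text.match(/Tel\.:((?:[^\d\n\r]*\d+)+)/);
  // Step 2: strip every run of non-digits from what group 1 captured.
  return m ? m[1].replace(/\D+/g, '') : null;
}

console.log(digitsAfterTel('Tel.: 324234 --  2 123123 (22)')); // 324234212312322
console.log(digitsAfterTel('no phone here'));                  // null
```

Note that it is still a match followed by a replace, so two regexes are involved, as in the code above.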

by Paz • **3,062** points · Answer 2 · 2020-01-24T19:32:54+00:00

I believe that capturing all the digits after "Tel.:", while ignoring the non-digit characters, is impossible with a single regex execution.

A regex works by analyzing the text character by character and capturing only what satisfies the pattern, so in my view the best way to get what you want is to use 2 regexes:

The first regex captures all the characters after "Tel.: " up to the line break.
The second regex goes through the result of the first one and returns only the digits.

These are the two regexes:

/(?<=Tel\.:\s).*$/m
/\d/g
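The m flag on the first regex matters. As far as I know, in JavaScript a $ without m only matches at the very end of the input, so the first regex finds nothing when more text follows the "Tel.:" line. A small sketch with a made-up two-line input (lookbehind requires an engine with ES2018 support):

```javascript
// Hypothetical two-line input; the phone number is on the first line.
const twoLines = 'Name - Tel.: 12 - 3\nnext line';

// With m, $ also matches right before a line break.
const withM = twoLines.match(/(?<=Tel\.:\s).*$/m);
console.log(withM[0]); // 12 - 3

// Without m, $ only matches at the end of the whole input,
// and "." never crosses the line break, so the match fails.
const withoutM = twoLines.match(/(?<=Tel\.:\s).*$/);
console.log(withoutM); // null
```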

Basic implementation in JS:

var rawData = `Home - CEP 0334211 - Jardim Belém, Suzano, São Paulo
Recipient: Julio  Oliveira - Tel.: 145628 810 - 7469

I want to extract 145628 + 810 + 7469
That is: "1456288107469"

In just 1 regex
I can do this with extra steps, but I would like to learn how to use just 1 regex`;

var regex1 = /(?<=Tel\.:\s).*$/m; // everything after "Tel.: " up to the end of that line
var regex2 = /\d/g;               // each individual digit

var fullMatch = rawData.match(regex1);

// match() returns null when "Tel.: " is not found, so check before using it
if (fullMatch) {
  console.log('Result of the first regex: ' + fullMatch[0]);
  // With the g flag, match() returns an array of digits, or null if there are none
  var digits = fullMatch[0].match(regex2) || [];
  // join('') concatenates the array elements without the "," separators
  console.log('Result of the second regex: ' + digits.join(''));
}