Remove numbers at the end of a string with Regex in C#

7

I have a string that contains the name of some records.

Examples:

string nome = "MARIA APARECIDA DE SOUZA MOURA 636598241";
string nome = "MARIA APARECIDA DE SOUZA MOURA 2018";

I would like to remove the numbers only when they are at the end of the text and where the number of numeric characters exceeds 4.

Examples:

MARIA APARECIDA DE SOUZA MOURA 636598241 would be: MARIA APARECIDA DE SOUZA MOURA

and

MARIA APARECIDA DE SOUZA MOURA 2018 would remain unchanged, as it contains only 4 numeric characters

I have made a few attempts with Regex, but so far without success.

4 answers

6


  • To extract the numbers at the end of the string, use the Regex @"\d+$"
  • Check whether the length of the matched string is greater than 4
  • If it is, use Replace to substitute the match with an empty string.

See working on dotnetfiddle.

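using System.Text.RegularExpressions;

string nome = "MARIA APARECIDA DE SOUZA MOURA 6365945";

// Capture the run of digits at the very end of the string ("" if there is none).
var numeros = Regex.Match(nome, @"\d+$").Value;

// Anchor the replacement to the end so digits elsewhere in the name are untouched;
// \s* also removes the space that preceded the number.
nome = numeros.Length > 4 ? Regex.Replace(nome, @"\s*\d+$", "") : nome;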

Reading the regex

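\d is a shorthand for a digit character. For ASCII input it behaves like the set [0-9]; in .NET it also matches other Unicode decimal digits unless RegexOptions.ECMAScript is used.

+ is a quantifier that matches one or more occurrences of the preceding element; it is the same as {1,}

$ is an anchor that matches at the end of the text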
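
As a minimal sketch of how these three pieces combine, the helper below returns the run of digits at the end of a string, or an empty string when there is none. The class name LeituraRegex and the method DigitosFinais are hypothetical names chosen only for this illustration.

```csharp
using System.Text.RegularExpressions;

public static class LeituraRegex
{
    // Returns the digits at the very end of the input,
    // or "" when the input does not end in a digit.
    public static string DigitosFinais(string input)
    {
        // \d+ : one or more digits, $ : anchored at the end of the text.
        return Regex.Match(input, @"\d+$").Value;
    }
}
```

For "MARIA 2018 MOURA" the helper returns an empty string, because the digits are not at the end of the text.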

Another way that follows the same line of thought: dotnetfiddle

4

You can use the expression \d+$, which tries to find any digits at the end of the string.

By building an instance of Regex with this expression, you can check the length of the substring that was found and then remove it only if it has more than 4 characters.

Here is an example:

using static System.Console;
using System.Text.RegularExpressions;

public class Program
{   
    public static string Remover4DigitosFinais(string input)
    {   
        var expressao = new Regex(@"\d+$");     
        var r = expressao.Match(input);     
        return r.Length > 4 ? expressao.Replace(input, "").TrimEnd() : input;
    }

    public static void Main()
    {
        var validacoes = new [] 
        {
            new { Input = "MARIA 2 APARECIDA DE SOUZA MOURA 636598241", Esperado = "MARIA 2 APARECIDA DE SOUZA MOURA" },
            new { Input = "MARIA 2 APARECIDA DE SOUZA MOURA 2018", Esperado = "MARIA 2 APARECIDA DE SOUZA MOURA 2018" },
            new { Input = "JOAO 175", Esperado = "JOAO 175" },
            new { Input = "JOAO 1751233", Esperado = "JOAO" },
        };

        foreach(var val in validacoes)
        {
            var novo = Remover4DigitosFinais(val.Input);            
            var sucesso = (novo == val.Esperado);

            WriteLine($"Sucesso: {sucesso} - Entrada: {val.Input} - Saída: {novo} - Esperado: {val.Esperado}");
        }           
    }   
}

See it working on .NET Fiddle.

This approach works well if you want to take responsibility away from the regex and, as a result, have a shorter expression that is easier to understand.

Otherwise, you can simply use the expression (\s\d{5,})+$. It tries to find any substring at the end of the string ($) that starts with a whitespace character (\s) followed by at least five digits (\d{5,}); the outer (...)+ allows several such groups in a row.

public static string Remover4DigitosFinais(string input)
{   
    var expressao = new Regex(@"(\s\d{5,})+$");             
    return expressao.Replace(input, "");
}
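
The two approaches in this answer do not behave identically on every input. The sketch below puts them side by side; the class name Comparacao and the method names ComMatch and ComExpressaoUnica are hypothetical, while the expressions are the ones shown above.

```csharp
using System.Text.RegularExpressions;

public static class Comparacao
{
    // Approach 1: find the trailing digits, then decide in C# whether to remove them.
    public static string ComMatch(string input)
    {
        var expressao = new Regex(@"\d+$");
        var r = expressao.Match(input);
        return r.Length > 4 ? expressao.Replace(input, "").TrimEnd() : input;
    }

    // Approach 2: the expression itself requires whitespace plus at least five digits.
    public static string ComExpressaoUnica(string input)
    {
        return Regex.Replace(input, @"(\s\d{5,})+$", "");
    }
}
```

For "JOAO 12345 67890", ComMatch removes only the last group and returns "JOAO 12345", whereas ComExpressaoUnica removes both groups and returns "JOAO". For "JOAO12345", which has no space before the digits, ComMatch returns "JOAO" and ComExpressaoUnica leaves the input unchanged. Which behavior is wanted depends on the data.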

See it working on .NET Fiddle.

1

The end of the string is matched with $ and numbers with more than 4 digits with [0-9]{5,}. The final expression, which also checks for the preceding space, is "(\\s[0-9]{5,})+$":

string nome0 = "0 - MARIA APARECIDA DE SOUZA MOURA 636598241";
string nome1 = "1 - MARIA APARECIDA DE SOUZA MOURA 2018";
string nome2 = "2 - MARIA APARECIDA DE SOUZA 636598241 MOURA";

string strRegex = "(\\s[0-9]{5,})+$";

string resu0 = Regex.Replace(nome0, strRegex, "");
string resu1 = Regex.Replace(nome1, strRegex, "");
string resu2 = Regex.Replace(nome2, strRegex, "");
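
To make the outcome of the three replacements explicit, here is a small sketch that wraps the same expression in a method. The class name ResultadoExemplo and the method Limpar are hypothetical; the sketch also writes the pattern as a verbatim string, which is the same pattern as the escaped form used above.

```csharp
using System.Text.RegularExpressions;

public static class ResultadoExemplo
{
    // Escaped form, as in the answer, and the equivalent verbatim form.
    public const string Escapada = "(\\s[0-9]{5,})+$";
    public const string Verbatim = @"(\s[0-9]{5,})+$";

    // Removes a trailing group of whitespace plus five or more digits.
    public static string Limpar(string nome)
    {
        return Regex.Replace(nome, Verbatim, "");
    }
}
```

With these inputs, only nome0 changes (the trailing 636598241 and its space are removed). nome1 keeps its 4-digit 2018, and nome2 is untouched because its number is not at the end of the string.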

1

One suggestion is to store the value with a separator character and use the Split() function to turn the string into an array, so you are free to pick out the name or just the number at any time with ease.

Example:

string nomeNumero = "MARIA APARECIDA DE SOUZA MOURA | 636598241";
// Split once on the separator and trim the spaces around each part.
string[] partes = nomeNumero.Split('|');
string nome = partes[0].Trim();
string numero = partes[1].Trim();
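
This hint assumes every record actually contains the separator. A minimal sketch that tolerates records without one is shown below; the class name Separador and the method Separar are hypothetical, and returning an empty number when the separator is missing is an assumption, not something the answer specifies.

```csharp
public static class Separador
{
    // Splits "name | number" into its two parts.
    // numero is "" when the input contains no '|' separator.
    public static void Separar(string nomeNumero, out string nome, out string numero)
    {
        // Limit the split to two parts so any later '|' stays in the second part.
        string[] partes = nomeNumero.Split(new[] { '|' }, 2);
        nome = partes[0].Trim();
        numero = partes.Length > 1 ? partes[1].Trim() : "";
    }
}
```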
