Regular expression to validate first name and surname

16

I need to create a regular expression that validates whether the format of the user’s name and surname is valid, for example:

Maria Silva | true
Maria  Silva | false
Maria silva | false
maria Silva | false
MariaSilva  | false

That is, it should only accept input when both the first and last name start with a capital letter and there is exactly one space between the words. I tried this expression:

[A-Z][a-z].* [A-Z][a-z].*

But it accepts inputs that don’t fit the pattern I need.

  • 2

    Wouldn’t it be better to apply title case on the backend? Maria Silva is simple, but have you thought about something like Maria da Silva e Silva?

  • @rray, how could I do that? Could you post an answer with an example?

  • 2

    @Sergio solved the riddle. The dot before the asterisk is what causes the problem.

  • 2

    @Jeffersonquesado I saw, but the question raised by rray makes sense

  • Yeah, it makes total sense. And you can still get foreign names like McFarlene, which mix lowercase and uppercase. How to treat them depends on your data set.

  • But I can’t find an example of how to use title case for this test. Do you have an example to share?

  • @danieltakeshi but what rray mentioned is the name Maria da Silva e Silva, which that expression does not match either

  • 2

    What about names with accents or with more than two components such as "Getúlio Dornelles Vargas"?

  • @victorstafusa Exactly, though I haven’t found any example of how to meet that need. Would you have one?

  • 3

    What about "Juscelino Kubitschek de Oliveira"? Conjunctions can also be lowercase. It turns out you have to define what it is you want. Do you want to validate whether the input is a valid full name?

  • A very nice link to help you develop new patterns: https://www.piazinho.com.br/ed5/exemplos.html#163
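
Following up on the title-case suggestion in the comments, here is a minimal sketch of normalizing a name instead of rejecting it. The function name `toTitleCase` and the `PARTICLES` list are assumptions made for illustration; which particles stay lowercase depends on your data.

```javascript
// Hypothetical list of particles kept in lowercase; adjust to your data.
const PARTICLES = new Set(["da", "de", "do", "das", "dos", "e"]);

// Collapses runs of whitespace and capitalizes each word except particles.
function toTitleCase(name) {
  return name
    .trim()
    .split(/\s+/)
    .map((word, index) => {
      const lower = word.toLowerCase();
      // The first word is always capitalized, even if it looks like a particle.
      if (index > 0 && PARTICLES.has(lower)) return lower;
      return lower.charAt(0).toUpperCase() + lower.slice(1);
    })
    .join(" ");
}

console.log(toTitleCase("  maria   da SILVA e silva "));
console.log(toTitleCase("getúlio dornelles vargas"));
```

Note that a name like McFarlene would be flattened to Mcfarlene by this sketch, so it only suits data where plain title case is acceptable.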


5 answers

18


The problem is that you are using the dot, which matches any character except line terminators. Removing it is enough to make it work.

. matches any character (except for line terminators)

function valida(nome) {
  return !!nome.match(/[A-Z][a-z]* [A-Z][a-z]*/);
}

const testes = ["Maria Silva", "Maria  Silva", "Maria silva", "maria Silva", "MariaSilva"];
const resultados = testes.map(valida);
console.log(resultados);
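
A quick way to see the effect of the dot is to run the patterns side by side. The anchored variant below (with `^`, `$` and `+` instead of `*`) is an assumption about what the question needs, not part of the answer above; without anchors, a match succeeds whenever any substring fits the pattern.

```javascript
// Pattern from the question: ".*" lets the first name swallow extra spaces.
const original = /[A-Z][a-z].* [A-Z][a-z].*/;
// Pattern from the answer above, with the dot removed.
const withoutDot = /[A-Z][a-z]* [A-Z][a-z]*/;
// Assumed stricter variant: anchored, at least one lowercase letter per word.
const anchored = /^[A-Z][a-z]+ [A-Z][a-z]+$/;

const samples = [
  "Maria Silva",
  "Maria  Silva",
  "Maria silva",
  "maria Silva",
  "MariaSilva",
  "xMaria Silva123",
];

// Prints each sample with the result of the three patterns, in order.
for (const s of samples) {
  console.log(JSON.stringify(s), original.test(s), withoutDot.test(s), anchored.test(s));
}
```

The last sample is the interesting one: both unanchored patterns accept it because "Maria Silva" appears inside the string.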


Regex for names is always complicated because other countries have names more complex than Rosa or Maria. For example "Åsa Ekström", "John Ó Súilleabháin" or "Gregor O'Sulivan".

In these cases the regex can become absurdly complex, and it is difficult to match all variants. Still, one suggestion is:

function valida(nome) {
  // A-Za-z instead of A-z, because A-z would also match [ \ ] ^ _ and `
  return !!nome.match(/^[A-ZÀ-Ÿ][A-Za-zÀ-ÿ']+\s([A-Za-zÀ-ÿ']\s?)*[A-ZÀ-Ÿ][A-Za-zÀ-ÿ']+$/) + ' ' + nome;
}

const testes = ["Maria Silva", "Åsa Ekström", "John Ó Súilleabháin", "Gregor O'Sulivan", "Maria  Silva", "Maria silva", "maria Silva", "MariaSilva"];
const resultados = testes.map(valida);
console.log(resultados);

  • 2

    I hadn’t spotted that dot! Eagle eyes!

  • 2

    It worked perfectly, thank you

  • 2

    @Miguel just to make the test look better with true|false and not with the whole result of .match :P

18

TL;DR

The regex is:

^(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*(?: (?:(?:e|y|de(?:(?: la| las| lo| los))?|do|dos|da|das|del|van|von|bin|le) )?(?:(?:(?:d'|D'|O'|Mc|Mac|al\-))?(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+|(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*))+(?: (?:Jr\.|II|III|IV))?$

According to the regex Unicode page, \p{IsLatin} is supported by Java, C#, PHP, Perl and Ruby. You can test this regular expression on regexplanet.

Detailed explanation

First, let’s set some rules for full names:

  • At least two names (although there are people who have only one name, such as Emperor Akihito of Japan, but let’s leave them out).
  • Exactly one space separating names.
  • A given name or surname may be compound, joined by hyphens. There may be more than one hyphen (e.g.: "Luís Augusto Maria Eudes de Saxe-Coburgo-Gota").
  • Accents are accepted.
  • The initial of each word must be uppercase and the remaining letters lowercase.
  • Entirely lowercase conjunctions must be accepted.
  • Surnames such as "O'Brian", "d'Alembert" and "McDonald" must be accepted.
  • There cannot be two or more conjunctions in a row (e.g.: "da e"). However, "María Antonieta de las Nieves" is a valid name, so a conjunction can be compound.
  • Names and surnames (but not conjunctions) must have at least two letters.
  • Apostrophes between letters are allowed (e.g.: "Samuel Eto'o"), but they can’t be at the beginning or end of a word and can’t be consecutive.
  • Some names, such as "Martin Luther King Jr." and "William Henry Gates III", have suffixes.

To do this with regex, we will stipulate the following:

  • The conjunctions are "e", "y", "de", "de la", "de las", "de lo", "de los", "do", "dos", "da", "das", "del", "van", "von", "bin" and "le".

  • Surnames can be prefixed with "d'", "D'", "O'", "Mc", "Mac" or "al-".

  • The suffixes are "Jr.", "II", "III" and "IV".

So the structure of the name would be this:

NOME-COMPLETO := PRENOME (espaço [CONJUNÇÃO espaço] SOBRENOME)+ (espaço SUFIXO)?
SOBRENOME := (PREFIXO)? NOME | PRENOME
PRENOME := NOME ("-" NOME)*
NOME := MAIÚSCULA (("'")? MINÚSCULA)+
PREFIXO := "d'" | "D'" | "O'" | "Mc" | "Mac" | "al-"
SUFIXO := "Jr." | "II" | "III" | "IV"
CONJUNÇÃO := "e" | "y" | "de" (" lo" | " los" | " la" | " las")? | "do" | "dos" | "da" | "das" | "del" | "van" | "von" | "bin" | "le"
MAIÚSCULA := [\p{Lu}&&[\p{IsLatin}]]
MINÚSCULA := [\p{Ll}&&[\p{IsLatin}]]

The rule [\p{Lu}&&[\p{IsLatin}]] matches a character in the intersection of the set of uppercase letters (\p{Lu}) and the set of Latin characters (\p{IsLatin}). Therefore, it also accepts accented uppercase Latin letters. Likewise, \p{Ll} stands for lowercase letters. See more about character classes in this other answer of mine and also at this link.
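
The `&&` class intersection is Java syntax. As a rough sketch of the same idea in JavaScript, a lookahead can play the role of the intersection, assuming an engine that supports Unicode property escapes with the `u` flag (ES2018 or later). The names `upperLatin` and `lowerLatin` are illustrative only.

```javascript
// (?=\p{Script=Latin}) checks the script without consuming the character,
// then \p{Lu} / \p{Ll} checks the letter case of that same character.
const upperLatin = /^(?=\p{Script=Latin})\p{Lu}$/u;
const lowerLatin = /^(?=\p{Script=Latin})\p{Ll}$/u;

// Prints each character with the result of the upper and lower checks.
for (const ch of ["A", "Å", "Í", "a", "ç", "Ω", "ω", "1"]) {
  console.log(ch, upperLatin.test(ch), lowerLatin.test(ch));
}
```

The Greek letters are uppercase and lowercase respectively, but fail the script check, which is the point of intersecting the two sets.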

The above rule set can be read as a context-free grammar. However, it can be reduced to a regular expression, since there are no recursive rules in it. To do this, just substitute each rule into the rules above it that refer to it.

However, since building this regex manually is a boring, laborious and very error-prone process, and the resulting regex is a monstrosity (especially if you have to change something from time to time), I wrote a program that builds the corresponding regex and also tests it with several different names. Here is the program (in Java):

Building and testing the regex

import java.util.regex.Pattern;
import java.util.StringJoiner;

class TesteRegex {

    private static final String MAIUSCULA = "(?:[\\p{Lu}&&[\\p{IsLatin}]])";
    private static final String MINUSCULA = "(?:[\\p{Ll}&&[\\p{IsLatin}]])";

    private static final String PREFIXO = choice("d'", "D'", "O'", "Mc", "Mac", "al\\-");
    private static final String SUFIXO = choice("Jr\\.", "II", "III", "IV");
    private static final String CONJUNCAO = choice("e", "y", "de" + opt(choice(" la", " las", " lo", " los")), "do", "dos", "da", "das", "del", "van", "von", "bin", "le");
    private static final String NOME = MAIUSCULA + plus(opt("'") + MINUSCULA);
    private static final String PRENOME = NOME + star("\\-" + NOME);
    private static final String SOBRENOME = choice(opt(PREFIXO) + NOME, PRENOME);
    private static final String NOME_COMPLETO = "^" + PRENOME + plus(" " + opt(CONJUNCAO + " ") + SOBRENOME) + opt(" " + SUFIXO) + "$";

    private static String opt(String in) {
        return "(?:" + in + ")?";
    }

    private static String plus(String in) {
        return "(?:" + in + ")+";
    }

    private static String star(String in) {
        return "(?:" + in + ")*";
    }

    private static String choice(String... in) {
        StringJoiner sj = new StringJoiner("|", "(?:", ")");
        for (String s : in) {
            sj.add(s);
        }
        return sj.toString();
    }

    private static final Pattern REGEX_NOME = Pattern.compile(NOME_COMPLETO);

    private static final String[] NOMES = {
        "Maria Silva",
        "Pedro Carlos",
        "Luiz Antônio",
        "Albert Einstein",
        "João Doria",
        "Barack Obama",
        "Friedrich von Hayek",
        "Ludwig van Beethoven",
        "Jeanne d'Arc",
        "Saddam Hussein al-Tikriti",
        "Osama bin Mohammed bin Awad bin Laden",
        "Luís Inácio Lula da Silva",
        "Getúlio Dornelles Vargas",
        "Juscelino Kubitschek de Oliveira",
        "Jean-Baptiste le Rond d'Alembert",
        "Pierre-Simon Laplace",
        "Hans Christian Ørsted",
        "Joseph Louis Gay-Lussac",
        "Scarlett O'Hara",
        "Ronald McDonald",
        "María Antonieta de las Nieves",
        "Pedro de Alcântara Francisco António João Carlos Xavier de Paula Miguel Rafael Joaquim José Gonzaga Pascoal Cipriano Serafim",
        "Luís Augusto Maria Eudes de Saxe-Coburgo-Gota",
        "Martin Luther King Jr.",
        "William Henry Gates III",
        "John William D'Arcy",
        "John D'Largy",
        "Samuel Eto'o",
        "Åsa Ekström",
        "Gregor O'Sulivan",
        "Ítalo Gonçalves"
    };

    private static final String[] LIXOS = {
        "",
        "Maria",
        "Maria-Silva",
        "Marcos E",
        "E Marcos",
        "Maria  Silva",
        "Maria Silva ",
        " Maria Silva ",
        "Maria silva",
        "maria Silva",
        "MARIA SILVA",
        "MAria Silva",
        "Maria SIlva",
        "Jean-Baptiste",
        "Jeanne d' Arc",
        "Joseph Louis Gay-lussac",
        "Pierre-simon Laplace",
        "Maria daSilva",
        "Maria~Silva",
        "Maria Silva~",
        "~Maria Silva",
        "Maria~ Silva",
        "Maria ~Silva",
        "Maria da da Silva",
        "Maria da e Silva",
        "Maria de le Silva",
        "William Henry Gates iii",
        "Martin Luther King, Jr.",
        "Martin Luther King JR",
        "Martin Luther Jr. King",
        "Martin Luther King Jr. III",
        "Maria G. Silva",
        "Maria G Silva",
        "Maria É Silva",
        "Maria wi Silva",
        "Samuel 'Etoo",
        "Samuel Etoo'",
        "Samuel Eto''o"
    };

    private static void testar(String nome) {
        boolean bom = REGEX_NOME.matcher(nome).matches();
        System.out.println("O nome [" + nome + "] é bom? " + (bom ? "Sim." : "Não."));
    }

    public static void main(String[] args) {
        System.out.println("Regex: " + NOME_COMPLETO);

        System.out.println();
        System.out.println("Esses nomes devem ser bons:");
        for (String s : NOMES) {
            testar(s);
        }

        System.out.println();
        System.out.println("Esses nomes devem ser ruins:");
        for (String s : LIXOS) {
            testar(s);
        }
    }
}

This program builds the regex using non-capturing groups ((?: ... )), the zero-or-more operator (*), the one-or-more operator (+), the zero-or-one operator (?), start of string (^) and end of string ($).
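
For readers who need this in JavaScript, the same builder can be sketched as follows. This is an adaptation under assumptions, not the Java program itself: the constant names are translated to English, a script lookahead stands in for the `&&` intersection, the hyphen is left unescaped because `\-` is not allowed outside a class with the `u` flag, and the engine is assumed to support Unicode property escapes (ES2018 or later).

```javascript
// Small combinators that wrap a fragment in a non-capturing group.
const opt = (s) => `(?:${s})?`;
const plus = (s) => `(?:${s})+`;
const star = (s) => `(?:${s})*`;
const choice = (...alts) => `(?:${alts.join("|")})`;

// A lookahead on the script stands in for Java's && class intersection.
const UPPER = String.raw`(?:(?=\p{Script=Latin})\p{Lu})`;
const LOWER = String.raw`(?:(?=\p{Script=Latin})\p{Ll})`;

const PREFIX = choice("d'", "D'", "O'", "Mc", "Mac", "al-");
const SUFFIX = choice(String.raw`Jr\.`, "II", "III", "IV");
const CONJUNCTION = choice(
  "e", "y", "de" + opt(choice(" la", " las", " lo", " los")),
  "do", "dos", "da", "das", "del", "van", "von", "bin", "le"
);
const NAME = UPPER + plus(opt("'") + LOWER);
const FORENAME = NAME + star("-" + NAME);
const SURNAME = choice(opt(PREFIX) + NAME, FORENAME);
const FULL_NAME =
  "^" + FORENAME + plus(" " + opt(CONJUNCTION + " ") + SURNAME) + opt(" " + SUFFIX) + "$";

const fullNameRe = new RegExp(FULL_NAME, "u");

// A few names taken from the Java test lists above.
const samples = [
  "Maria Silva",
  "Jean-Baptiste le Rond d'Alembert",
  "Martin Luther King Jr.",
  "Maria  Silva",
  "MARIA SILVA",
  "Maria da da Silva",
];
for (const name of samples) {
  console.log(JSON.stringify(name), fullNameRe.test(name));
}
```

Keeping the grammar as named fragments means a new conjunction or prefix is a one-word change, which is the main reason to generate the regex rather than write it by hand.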

See the program running on ideone. Here is the output of this program:

Regex: ^(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*(?: (?:(?:e|y|de(?:(?: la| las| lo| los))?|do|dos|da|das|del|van|von|bin|le) )?(?:(?:(?:d'|D'|O'|Mc|Mac|al\-))?(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+|(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*))+(?: (?:Jr\.|II|III|IV))?$

Esses nomes devem ser bons:
O nome [Maria Silva] é bom? Sim.
O nome [Pedro Carlos] é bom? Sim.
O nome [Luiz Antônio] é bom? Sim.
O nome [Albert Einstein] é bom? Sim.
O nome [João Doria] é bom? Sim.
O nome [Barack Obama] é bom? Sim.
O nome [Friedrich von Hayek] é bom? Sim.
O nome [Ludwig van Beethoven] é bom? Sim.
O nome [Jeanne d'Arc] é bom? Sim.
O nome [Saddam Hussein al-Tikriti] é bom? Sim.
O nome [Osama bin Mohammed bin Awad bin Laden] é bom? Sim.
O nome [Luís Inácio Lula da Silva] é bom? Sim.
O nome [Getúlio Dornelles Vargas] é bom? Sim.
O nome [Juscelino Kubitschek de Oliveira] é bom? Sim.
O nome [Jean-Baptiste le Rond d'Alembert] é bom? Sim.
O nome [Pierre-Simon Laplace] é bom? Sim.
O nome [Hans Christian Ørsted] é bom? Sim.
O nome [Joseph Louis Gay-Lussac] é bom? Sim.
O nome [Scarlett O'Hara] é bom? Sim.
O nome [Ronald McDonald] é bom? Sim.
O nome [María Antonieta de las Nieves] é bom? Sim.
O nome [Pedro de Alcântara Francisco António João Carlos Xavier de Paula Miguel Rafael Joaquim José Gonzaga Pascoal Cipriano Serafim] é bom? Sim.
O nome [Luís Augusto Maria Eudes de Saxe-Coburgo-Gota] é bom? Sim.
O nome [Martin Luther King Jr.] é bom? Sim.
O nome [William Henry Gates III] é bom? Sim.
O nome [John William D'Arcy] é bom? Sim.
O nome [John D'Largy] é bom? Sim.
O nome [Samuel Eto'o] é bom? Sim.
O nome [Åsa Ekström] é bom? Sim.
O nome [Gregor O'Sulivan] é bom? Sim.
O nome [Ítalo Gonçalves] é bom? Sim.

Esses nomes devem ser ruins:
O nome [] é bom? Não.
O nome [Maria] é bom? Não.
O nome [Maria-Silva] é bom? Não.
O nome [Marcos E] é bom? Não.
O nome [E Marcos] é bom? Não.
O nome [Maria  Silva] é bom? Não.
O nome [Maria Silva ] é bom? Não.
O nome [ Maria Silva ] é bom? Não.
O nome [Maria silva] é bom? Não.
O nome [maria Silva] é bom? Não.
O nome [MARIA SILVA] é bom? Não.
O nome [MAria Silva] é bom? Não.
O nome [Maria SIlva] é bom? Não.
O nome [Jean-Baptiste] é bom? Não.
O nome [Jeanne d' Arc] é bom? Não.
O nome [Joseph Louis Gay-lussac] é bom? Não.
O nome [Pierre-simon Laplace] é bom? Não.
O nome [Maria daSilva] é bom? Não.
O nome [Maria~Silva] é bom? Não.
O nome [Maria Silva~] é bom? Não.
O nome [~Maria Silva] é bom? Não.
O nome [Maria~ Silva] é bom? Não.
O nome [Maria ~Silva] é bom? Não.
O nome [Maria da da Silva] é bom? Não.
O nome [Maria da e Silva] é bom? Não.
O nome [Maria de le Silva] é bom? Não.
O nome [William Henry Gates iii] é bom? Não.
O nome [Martin Luther King, Jr.] é bom? Não.
O nome [Martin Luther King JR] é bom? Não.
O nome [Martin Luther Jr. King] é bom? Não.
O nome [Martin Luther King Jr. III] é bom? Não.
O nome [Maria G. Silva] é bom? Não.
O nome [Maria G Silva] é bom? Não.
O nome [Maria É Silva] é bom? Não.
O nome [Maria wi Silva] é bom? Não.
O nome [Samuel 'Etoo] é bom? Não.
O nome [Samuel Etoo'] é bom? Não.
O nome [Samuel Eto''o] é bom? Não.

Note that the regex accepted all the names it should accept and rejected all those it should reject. The regex produced is:

^(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*(?: (?:(?:e|y|de(?:(?: la| las| lo| los))?|do|dos|da|das|del|van|von|bin|le) )?(?:(?:(?:d'|D'|O'|Mc|Mac|al\-))?(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+|(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+(?:\-(?:[\p{Lu}&&[\p{IsLatin}]])(?:(?:')?(?:[\p{Ll}&&[\p{IsLatin}]]))+)*))+(?: (?:Jr\.|II|III|IV))?$

  • Very cool, I learned a lot of new things from this regex. Since I am still learning, I use regex101 or RegExr, because their visual elements make learning easier. Could you point to a site that accepts [\p{Lu}&&[\p{IsLatin}]], or link to one that offers visual elements? Thank you

  • 1

    @danieltakeshi It worked on http://www.regexplanet.com/advanced/java/index.html, but I had to change a few details in the regex first for it to be accepted (put \\ before the - and group the upper and lower case classes with (?:)). I’ve already edited the answer to reflect this.

  • 3

    Very well explained and scary regex. I don’t know if you agree, but validating names with so many rules rarely pays off.

  • 3

    @bfavaretto I agree, it rarely pays. And even then, regex is not the tool I would normally use. But, I liked the challenge anyway. :)

  • +1 - it would be good to indicate in the TL;DR that the regex is for PHP.

  • @Sergio Actually, I did it in Java. But this regex should work in PHP.

  • @Victorstafusa ok, I don’t know the features of regex in Java, but, for example, JavaScript has no \p{L}

  • 1

    @Sergio Yes. Initially I tried to do it in JavaScript, but I fell flat on my face because of that. So I had to do it in Java.

  • @Victorstafusa I don’t know if I did something wrong, but the validation of these is false: "Maria G. Silva","John D'Largy","Martin Luther King Jr.","Samuel Eto'o" but they are very unusual names in the Portuguese language and don’t apply much to us

  • 1

    @danieltakeshi "Maria G. Silva" has an abbreviated name in the middle, so it’s not correct. For "John D'Largy", the regex expected "d'Largy". "Martin Luther King Jr." has a period; the regex expected "Martin Luther King Junior". As for "Samuel Eto'o" and "William Henry Gates III", these are harder.

  • 1

    @danieltakeshi I edited the answer to accept these cases.

  • 3

    what a comeback!

  • 1

    The Emperor of Japan does not have a good name :p

  • @Renan Yes. Thank you for giving a concrete case of someone with one name, Mr. Akihito.

  • Hello, I would like to ask a question about proper names with an apostrophe, in particular names of Brazilian cities: "Pau D'Arco" or "Pau d'Arco" (TO)? "Lambari D'Oeste" or "Lambari d'Oeste" (MT)? IBGE adopts one convention, the city hall another, Wikipedia yet another...


9

Regex

This is the Regex (monster): ^(?![ ])(?!.*[ ]{2})((?:e|da|do|das|dos|de|d'|D'|la|las|el|los)\s*?|(?:[A-ZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųūÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ∂ð'][^\s]*\s*?)(?!.*[ ]$))+$

With the global, multiline and Unicode flags enabled: /gmu

Or, without the special characters: ^(?![ ])(?!.*[ ]{2})((?:e|da|do|das|dos|de|d'|D'|la|las|el|los)\s*?|(?:[A-Z][^\s]*\s*?)(?!.*[ ]$))+$

The test validations can be accessed on Regex101 and Regex Planet.

Note: the regex is long, so there is room for optimization.
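
For validating one input at a time rather than scanning a multi-line block, a sketch like the following may be easier to use. The function name `isValidName` is an assumption; the pattern is the simplified one from this answer, unchanged, but without the g and m flags so that ^ and $ apply to the whole input.

```javascript
// Simplified pattern from this answer, with no flags: ^ and $ then anchor
// to the start and end of the whole input string.
const namePattern = /^(?![ ])(?!.*[ ]{2})((?:e|da|do|das|dos|de|d'|D'|la|las|el|los)\s*?|(?:[A-Z][^\s]*\s*?)(?!.*[ ]$))+$/;

// Returns true when the whole input matches the pattern.
function isValidName(name) {
  return namePattern.test(name);
}

const inputs = [
  "Maria Silva",
  "Maria da Silva e Silva",
  "Maria  Silva",
  "maria Silva",
  " Maria Silva",
  "Maria Silva ",
];
for (const input of inputs) {
  console.log(JSON.stringify(input), isValidName(input));
}
```

Because each word only has to start with an uppercase letter followed by any non-space characters, this pattern is far more permissive than a letter-by-letter check.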

References

Based on these links from the global (English) Stack Overflow:

Code generated by Regex101

const regex = /^(?![ ])(?!.*[ ]{2})((?:e|da|do|das|dos|de|d'|D'|la|las|el|los)\s*?|(?:[A-Z][^\s]*\s*?)(?!.*[ ]$))+$/gmu;
const str = `Maria  Silva
Maria silva
maria Silva
MariaSilva
 Maria Silva
Maria Silva 
Maria da Silva
Marina Silva
Maria / Silva
Maria . Silva
Maria Silva
Maria G. Silva
Maria McDuffy
Getúlio Dornelles Vargas
Maria das Flores
John Smith
John D'Largy
John Doe-Smith
John Doe Smith
Hector Sausage-Hausen
Mathias d'Arras
Martin Luther King Jr.
Ai Wong
Chao Chang
Alzbeta Bara
Marcos Assunção
Maria da Silva e Silva
Juscelino Kubitschek de Oliveira
Natalia maria
Natalia aria
Natalia orea
Maria dornelas
Samuel eto'
Maria da Costa e Silva
Samuel Eto'o
María Antonieta de las Nieves`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

Code in Excel VBA

Since no programming language was specified, here is a simple way to check the validation in Excel, since the other answers already cover Java.

Enable Regex in Excel

  1. Regex needs to be enabled. First, enable Developer mode.
  2. In the 'Developer' tab, click 'Visual Basic' and the VBA window will open.
  3. Go to 'Tools' -> 'References...' and a window will open.
  4. Search for 'Microsoft VBScript Regular Expressions 5.5', as in the image below, and enable this option.

References window

VBA code

Sub ValidateNames()
    Dim str As String
    Dim objRegExp As Object
    Dim objMatches As Object
    Dim m As Object
    Dim ws As Worksheet: Set ws = ThisWorkbook.Sheets(1)
    Dim i As Long, lastrow As Long

    ' Create the regex object once, outside the loop
    Set objRegExp = CreateObject("VBScript.RegExp")
    objRegExp.Pattern = "^(?![ ])(?!.*[ ]{2})((?:e|da|do|das|dos|de|d'|D'|la|las|el|los)\s*?|(?:[A-Z][^\s]*\s*?)(?!.*[ ]$))+$"
    objRegExp.Global = True

    ' Names are read from column A; results are written to column B
    lastrow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    For i = 1 To lastrow
        str = CStr(ws.Cells(i, 1).Value)
        Set objMatches = objRegExp.Execute(str)
        If objMatches.Count <> 0 Then
            For Each m In objMatches
                ws.Cells(i, 2) = m.Value
                ws.Cells(i, 2).Interior.ColorIndex = 4 ' green: valid
            Next
        Else
            ws.Cells(i, 2).Interior.ColorIndex = 3 ' red: invalid
            ws.Cells(i, 2) = ""
        End If
    Next i
End Sub

Result

Result image

Code in PCRE

This code was written by ctwheels for the question "Validate Title Case Full Name with Regex".

The code is:

(?(DEFINE)
    (?# Definitions )
    (?<valid_nameChars>[\p{L}\p{Nl}])
    (?<valid_nonNameChars>[^\p{L}\p{Nl}\p{Zs}])
    (?<valid_startFirstName>(?![a-z])[\p{L}'])
    (?<valid_upperChar>(?![a-z])\p{L})
    (?<valid_nameSeparatorsSoft>[\p{Pd}'])
    (?<valid_nameSeparatorsHard>\p{Zs})
    (?<valid_nameSeparators>(?&valid_nameSeparatorsSoft)|(?&valid_nameSeparatorsHard))
    (?# Invalid combinations )
    (?<invalid_startChar>^[\p{Zs}a-z])
    (?<invalid_endChar>.*[^\p{L}\p{Nl}.\p{C}]$)
    (?<invalid_unaccompaniedSymbol>.*(?&valid_nameSeparatorsHard)(?&valid_nonNameChars)(?&valid_nameSeparatorsHard))
    (?<invalid_overTwoUpper>(?:(?&valid_nameChars)*\p{Lu}){3})
    (?<invalid>(?&invalid_startChar)|(?&invalid_endChar)|(?&invalid_unaccompaniedSymbol)|(?&invalid_overTwoUpper))
    (?# Valid combinations )
    (?<valid_name>(?:(?:(?&valid_nameChars)|(?&valid_nameSeparatorsSoft))*(?&valid_nameChars)+(?:(?&valid_nameChars)|(?&valid_nameSeparatorsSoft))*)+\.?)
    (?<valid_firstName>(?&valid_startFirstName)(?:\.|(?&valid_name)*))
    (?<valid_multipleName>(?&valid_firstName)(?=.*(?&valid_nameSeparators)(?&valid_upperChar))(?:(?&valid_nameSeparatorsHard)(?&valid_name))+)
    (?<valid>(?&valid_multipleName)|(?&valid_firstName))
)
^(?!(?&invalid))(?&valid)$

The validation and debug test are available on Regex101.

  • When I tried to use your suggested regex in my code, the pattern was not accepted and I got the following error: Multiple markers at this line - Pattern cannot be resolved. It may lead to runtime erros. - Groovy:illegal string body character after dollar sign;

  • One more thing: it accepts "Natalia aria" but not "Natalia maria"

  • "María Antonieta de las Nieves" (alias Little Key). Does that name work here?

  • @victorstafusa Good point, I had not included la, el and the like, but I will change it

  • An update to try to make it symbol-proof /^(?![ ])(?!.*(?:\d|[ ]{2}|[!$%^&*()_+|~=\{\}\[\]:";<>?,\/]))(?:(?:e|da|do|das|dos|de|d'|D'|la|las|el|los|l')\s*?|(?:[A-ZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųūÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ∂ð'][^\s]*\s*?)(?!.*[ ]$))+$/, test on Regex101

0

I ran into this same problem of validating the structure of a name, and my system must also accept foreign names.

So I was able to compile a Regex that meets the following specifications:

  1. At least 2 words (first and last name);
  2. Does not accept special characters, with the exception of the apostrophe (');
  3. It is not case sensitive (because I handle that on the backend);
  4. Each word must have at least two characters.

Here is the regex:

^((\b[A-zà-ã']{2,40}\b)\s*){2,}$

Can be tested on https://regex101.com/r/d3Cr6d/2

-1

Letters only, with no distinction between uppercase and lowercase, and with the spaces escaped

new RegExp(/ [a-z]{2,} [a-z]{2,}/gi).test('Italo Barros')
