Regular expression that accepts only numbers and/or letters in Java

3

How can I write a regular expression that only allows a string to contain digits and/or letters, in any position and in any quantity?

Examples:

a) 00000a

b) 000000000A

c) AAAAAAAAA0

d) 1AAAAA1113

e) 1111111111111111111111a

f) aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa4

g) 12345676789090


EDITED

The string must contain digits and, optionally, unaccented letters (a cedilla is not allowed, for example). My goal is to write a method that checks whether a given string matches the pattern, so it returns true or false.

  • What is the expected behavior: do you want the regex to return true/false, or do you want to remove the unwanted characters from the string?

  • It only needs to return true or false, @Math.

  • @Math, I would appreciate it if you could re-read the question and adapt your answer. I forgot an important detail: I need a string with digits and/or letters.

3 answers

4


Requiring at least one digit:

// The backslash in \d must be doubled inside a Java string literal.
Pattern p = Pattern.compile("^[A-Za-z0-9]*\\d+[A-Za-z0-9]*$"); // or the regex ^[^\W_]*\d+[^\W_]*$, following mgibsonbr's idea
return p.matcher(textoDeTeste).matches();

You can test it at http://www.regexr.com/, but remember to enable the multiline flag in the upper right corner if you want to test several lines at once (each line being one input).

Rubular link: http://rubular.com/r/ufBplCyRLv
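
As a quick check of the pattern above, here is a minimal sketch that runs it against a few of the examples from the question plus some inputs that should be rejected. The class name `DigitRequiredCheck` and the method `isValid` are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.regex.Pattern;

class DigitRequiredCheck {
    // ASCII letters and digits only, with at least one digit somewhere.
    private static final Pattern ALNUM_WITH_DIGIT =
            Pattern.compile("^[A-Za-z0-9]*\\d+[A-Za-z0-9]*$");

    // Returns true only if the whole input matches the pattern.
    static boolean isValid(String input) {
        return ALNUM_WITH_DIGIT.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"00000a", "1AAAAA1113", "12345676789090", "abc", "ab c1", ""};
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + isValid(s));
        }
    }
}
```

The first three samples are accepted; `abc` is rejected because it has no digit, `ab c1` because of the space, and the empty string because `\d+` needs at least one character.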

  • Pedro, I would appreciate it if you could re-read the question and adapt your answer. I forgot an important detail: I need a string with digits and/or letters (no accents).

  • @Geisonsantos, updated to match the edited question.

4

By default, \w ("word") matches any letter, digit, or underscore (_). Thus \w* matches a string with any number of such characters (including the empty string), and \w+ does the same but requires at least one character.

This would be the most complete solution, because defining the letters with a range (e.g. [a-zA-Z]) only covers ASCII characters and would not accept accented letters such as á. To make Java match \w against Unicode letters, just prefix the pattern with (?U). If you are not interested in Unicode (i.e. you only want ASCII letters), simply omit this prefix (or use the alternative solutions given in the other answers, which are also correct in this case).
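
A minimal sketch of the effect of the (?U) prefix, assuming Java 7 or later (where this embedded flag is available). The class and method names are hypothetical and only serve the illustration; the accented letter is written as the escape `\u00e1` so the example does not depend on the source file encoding.

```java
class UnicodeWordDemo {
    // Default \w: ASCII letters, digits, and underscore only.
    static boolean asciiWord(String s) {
        return s.matches("\\w+");
    }

    // With (?U), \w also accepts Unicode letters such as accented ones.
    static boolean unicodeWord(String s) {
        return s.matches("(?U)\\w+");
    }

    public static void main(String[] args) {
        String accented = "\u00e1gua"; // "água"
        System.out.println(asciiWord("agua"));      // true
        System.out.println(asciiWord(accented));    // false
        System.out.println(unicodeWord(accented));  // true
    }
}
```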

If the presence of the underscore is a problem, we can eliminate it through a "double negation":

[^\W_]

That is: "match everything that is not a 'non-word' character or an underscore". Example on Rubular. (Note, in case it is not clear: \w, lowercase, matches the "word" character class; \W, uppercase, inverts it, matching everything outside that class; [...] matches one of a set of characters; [^...] inverts it, matching everything that is not one of those characters.)
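
To see the double negation in action, the following sketch compares plain \w with [^\W_] on inputs with and without an underscore. The class and method names are hypothetical, and the (?U) prefix is left out, so only ASCII letters are accepted here.

```java
class NoUnderscoreDemo {
    // \w accepts letters, digits, and the underscore.
    static boolean wordChars(String s) {
        return s.matches("\\w+");
    }

    // [^\W_] accepts the same characters as \w except the underscore.
    static boolean lettersAndDigits(String s) {
        return s.matches("[^\\W_]+");
    }

    public static void main(String[] args) {
        System.out.println(wordChars("a_b1"));        // true
        System.out.println(lettersAndDigits("a_b1")); // false
        System.out.println(lettersAndDigits("ab1"));  // true
    }
}
```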

To use it, the simplest method is String.matches (which checks whether the entire string matches the expression passed as a parameter), or Pattern and Matcher if you need other behaviors:

// Backslashes must be doubled inside Java string literals.
"abc".matches("(?U)[^\\W_]*"); // true

Pattern p = Pattern.compile("(?U)[^\\W_]*");

p.matcher("abc").matches();   // ok, the entire string matches the pattern
p.matcher("$a$").find();      // ok, the pattern can be found in the string
p.matcher("ab$").lookingAt(); // ok, the string starts with the pattern
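
To make the difference between the three Matcher methods above easier to see, the sketch below uses `+` instead of `*`. With `*` the pattern can match the empty string, so `find()` and `lookingAt()` succeed on any input. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    // At least one ASCII letter or digit (no underscore).
    private static final Pattern ALNUM = Pattern.compile("[^\\W_]+");

    // The entire input must match.
    static boolean whole(String s) {
        return ALNUM.matcher(s).matches();
    }

    // The pattern must occur somewhere in the input.
    static boolean anywhere(String s) {
        return ALNUM.matcher(s).find();
    }

    // The pattern must match at the start of the input.
    static boolean prefix(String s) {
        return ALNUM.matcher(s).lookingAt();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "$a$", "ab$"}) {
            System.out.println("'" + s + "' matches=" + whole(s)
                    + " find=" + anywhere(s) + " lookingAt=" + prefix(s));
        }
    }
}
```

For a true/false validation of the whole string, as the question asks, `matches()` is the one to use.
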
  • mgibsonbr, I would appreciate it if you could re-read the question and adapt your answer. I forgot an important detail: I need a string with digits and/or letters (no accents).

  • @Geisonsantos There is not much to adapt... If you want Unicode to be taken into account, prefix with (?U); if you don't, leave it out. I will add a note about this, and it is worth pointing out that, in this case, the other answers are also correct (choose the one you like best).

  • Right, @mgibsonbr. And Rubular is a great tip.

  • What would be the difference between [^\W_] and [\w_]?

  • 2

    @Patrick, the first one doesn't match _, the second one does.

  • 2

    @Patrick Besides, [\w_] is redundant because \w alone already accepts _ (i.e. the two are equivalent). [^...] negates everything inside the brackets. For example, if you wanted a regex that matches only letters (no digits), you could use: [^\W\d_].
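
The letters-only class mentioned in the last comment can be sketched as follows. The class and method names are hypothetical, and without the (?U) prefix only ASCII letters are accepted.

```java
class LettersOnlyDemo {
    // [^\W\d_]: word characters, minus digits, minus the underscore.
    static boolean lettersOnly(String s) {
        return s.matches("[^\\W\\d_]+");
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("abc"));  // true
        System.out.println(lettersOnly("abc1")); // false
        System.out.println(lettersOnly("a_b"));  // false
    }
}
```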

2

public class Regex {

    public static boolean validaString(String str) {
        return str.matches("[a-zA-Z0-9]+");
    }

    public static void main(String[] args) {
        String[] teste = {"asdfads89as89", "", "asdf 98s", "asd©áßsas90",
                          "asdfas78237", "2342", "abc",};
        // any string containing a character outside a-z, A-Z, or 0-9 is reported as invalid
        for(String s: teste){
            String resultado = (validaString(s))?"\' is valid" :"\' is invalid";
            System.out.println("The String \'" + s + resultado);
        }
    }
}

Output:

The String 'asdfads89as89' is valid
The String '' is invalid
The String 'asdf 98s' is invalid
The String 'asd©áßsas90' is invalid
The String 'asdfas78237' is valid
The String '2342' is valid
The String 'abc' is valid
  • Kyllopardiun, I would appreciate it if you could re-read the question and adapt your answer. I forgot an important detail: I need a string with digits and/or letters (no accents).

  • @Geisonsantos, adapted to the requested format.
