By default, the standard \w ("word") class matches any letter, digit, or underscore (_). Thus, \w* matches a string with any number of such characters (including the empty string), and \w+ does the same, except that it requires at least one character.
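As a minimal sketch of the difference between the two quantifiers (the class and method names here are hypothetical, chosen only for illustration), note that in Java source the backslash has to be doubled inside a string literal:

```java
public class WordQuantifiers {
    // \w* : zero or more word characters, so the empty string is accepted
    static boolean zeroOrMore(String s) {
        return s.matches("\\w*");
    }

    // \w+ : one or more word characters, so the empty string is rejected
    static boolean oneOrMore(String s) {
        return s.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(zeroOrMore(""));        // true
        System.out.println(oneOrMore(""));         // false
        System.out.println(oneOrMore("abc_123"));  // true
        System.out.println(zeroOrMore("abc-123")); // false: '-' is not a word character
    }
}
```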
This is the most complete solution, because defining the letters as a range (e.g. [a-zA-Z]) only covers ASCII characters and does not accept, for example, accented letters (á). To make Java match Unicode letters with \w, just prefix the pattern with (?U) [source]. If you are not interested in Unicode (i.e. you only want ASCII letters), just omit this prefix (or use the alternative solutions given in the other answers, which would also be correct in this case).
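The effect of the prefix can be sketched as follows. The class and field names are hypothetical; the accented test string is written with Unicode escapes only so that the example does not depend on the source file encoding. The inline flag (?U) corresponds to the Pattern.UNICODE_CHARACTER_CLASS flag, available since Java 7.

```java
import java.util.regex.Pattern;

public class UnicodeWord {
    // Without the flag, \w only covers [a-zA-Z_0-9]
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With (?U), \w follows the Unicode definition of a word character
    static final Pattern UNICODE_WORD = Pattern.compile("(?U)\\w+");

    static boolean isAsciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    static boolean isUnicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String accented = "a\u00E7\u00E3o"; // "ação"
        System.out.println(isAsciiWord("acao"));     // true
        System.out.println(isAsciiWord(accented));   // false
        System.out.println(isUnicodeWord(accented)); // true
    }
}
```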
If the presence of the underscore is a problem, we can eliminate it through a "double negation":
[^\W_]
That is: "match everything that is neither a 'non-word' character nor an underscore". Example on Rubular. (Note, in case it is not clear: \w, lowercase, matches the "word" character class; \W, uppercase, inverts it, matching everything that is not in that class; [...] matches one of a set of characters; [^...] inverts it, matching everything that is not one of those characters.)
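A small sketch of the double negation in use, assuming that at least one character is required (hence + instead of *); the class and method names are hypothetical:

```java
public class NoUnderscore {
    // [^\W_] : any word character except the underscore.
    // (?U) extends "word character" to the Unicode definition.
    static boolean isWordWithoutUnderscore(String s) {
        return s.matches("(?U)[^\\W_]+");
    }

    public static void main(String[] args) {
        System.out.println(isWordWithoutUnderscore("abc123"));    // true
        System.out.println(isWordWithoutUnderscore("abc_123"));   // false: contains '_'
        System.out.println(isWordWithoutUnderscore("\u00E1gua")); // true: "água"
        System.out.println(isWordWithoutUnderscore(""));          // false: + needs one character
    }
}
```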
To use it, the simplest way is String.matches (which checks whether the entire string matches the expression passed as a parameter), or Pattern and Matcher if you need other behaviors:
"abc".matches("(?U)[^\\W_]*"); // true
Pattern p = Pattern.compile("(?U)[^\\W_]*"); // the backslash must be doubled in a Java string literal
p.matcher("abc").matches(); // true: the entire string matches the pattern
p.matcher("$a$").find(); // true: the pattern can be found in the string
p.matcher("ab$").lookingAt(); // true: the string starts with the pattern
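One caveat worth illustrating: because * also accepts the empty string, find() and lookingAt() succeed on any input with a * pattern, since an empty match is always available. The sketch below (hypothetical class and field names) compares the * and + variants; it is an illustration of the three Matcher methods, not a recommendation of one over the other.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchModes {
    static final Pattern STAR = Pattern.compile("(?U)[^\\W_]*");
    static final Pattern PLUS = Pattern.compile("(?U)[^\\W_]+");

    public static void main(String[] args) {
        // With *, an empty match exists at index 0, so these succeed
        System.out.println(STAR.matcher("$$$").find());      // true
        System.out.println(STAR.matcher("$$$").lookingAt()); // true
        // matches() must consume the whole input, so it fails
        System.out.println(STAR.matcher("$$$").matches());   // false

        // With +, at least one matching character is required
        System.out.println(PLUS.matcher("$$$").find());      // false

        Matcher m = PLUS.matcher("$a$");
        if (m.find()) {
            System.out.println(m.group());                   // a
        }
    }
}
```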
What is the expected behavior? Do you want the regex to return true/false, or do you want to remove the unwanted characters from the string?
– Math
It only needs to return true or false, @Math
– Geison Santos
@Math, if you could re-read the question and adapt your answer to it, I would appreciate it. I forgot an important detail: I need a string with numbers and/or letters.
– Geison Santos