Java Regular Expression

I am programming in Java and need to validate a String against a regex using the matches() method.

The input can be letters (a-z) or numbers (0-9), where a number can have 1 or n digits. I am using the following regex: [A-Z|a-z|\d{n} ].

The letter filter works perfectly, but the number filter does not accept a number with more than one digit (10, for example), while single-digit numbers pass. I don’t know much about regex, but from what I’ve read, this should work.

1 answer

When you use brackets in a regex, you’re asking it to match one and only one character from the given list. Apart from a few special characters (such as -, \ and ^ [at the beginning]), everything inside the brackets is interpreted literally. This means that the regex:

[A-Z|a-z|\d{n} ]

Will accept the strings:

"A"
"B"
"Z"
"|"
"4"
"{"
"}"
" "
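A quick way to check this list is to run the pattern through a matcher. The sketch below is only an illustration: the class and method names (CharClassDemo, accepts) are made up for this example and the sample inputs are arbitrary.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; its name is not from the original post.
final class CharClassDemo {
    // The pattern from the question as a Java string literal
    // (the backslash in \d has to be doubled).
    static final Pattern BRACKETED = Pattern.compile("[A-Z|a-z|\\d{n} ]");

    // matches() succeeds only if the whole input matches the pattern.
    static boolean accepts(String input) {
        return BRACKETED.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"A", "|", "4", "{", " ", "10", "ab"}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Every single-character input in the loop is accepted, including "|" and "{", while "10" and "ab" are not, because a bracket expression consumes exactly one character.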

It will reject any string with more than one character. If you want a regex that combines two or more rules with or (|), you have to do it outside the brackets:

([A-Z]|[a-z]|\d{n})

or by simplifying:

([A-Za-z]|\d{n})

Note: you say "1 or n digits", but in this case it would accept exactly n digits. If what you really want is 1 or n, this should work:

([A-Za-z\d]|\d{n})

This assumes n is replaced by an actual number. On a second reading, though, it seems to me that what you want is "one or more digits". Is that it? If so, the correct regex is:

([A-Za-z]|\d+)

Example on ideone.
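To compare the three variants side by side, here is a minimal sketch. It assumes n = 3 purely for illustration (the question never gives a value), and the class, field and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; n is assumed to be 3 for illustration only.
final class AlternationDemo {
    // One letter, or exactly 3 digits.
    static final Pattern EXACTLY_N = Pattern.compile("([A-Za-z]|\\d{3})");
    // One letter or one digit, or exactly 3 digits ("1 or n").
    static final Pattern ONE_OR_N = Pattern.compile("([A-Za-z\\d]|\\d{3})");
    // One letter, or one or more digits.
    static final Pattern ONE_OR_MORE = Pattern.compile("([A-Za-z]|\\d+)");

    // matches() requires the whole input to match, so no ^ or $ is needed.
    static boolean accepts(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "7", "10", "123", "1234", "ab"}) {
            System.out.printf("%-5s exactlyN=%b oneOrN=%b oneOrMore=%b%n",
                    s,
                    accepts(EXACTLY_N, s),
                    accepts(ONE_OR_N, s),
                    accepts(ONE_OR_MORE, s));
        }
    }
}
```

Only the last pattern accepts "10" and "1234"; all three reject "ab", since the letter branch still matches a single character.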


P.S. In Java, the parentheses are optional when using |, because the choice of Matcher method (matches, lookingAt or find) already decides how the pattern is anchored; matches, for instance, must consume the entire string, so ^ and $ are not needed. But beware of the precedence of that operator if you do add anchors. For example, the regex:

^[A-Za-z]|\d+$

Is equivalent to:

(^[A-Za-z])|(\d+$)

and not to:

^([A-Za-z]|\d+)$

It therefore also matches aa or ###1 when searched with find. Example, corrected example (using ?: to avoid creating a capturing group).
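The precedence issue can be reproduced with a small sketch like the one below. The class and method names are hypothetical, and find() is used because that is where the anchors make a difference.

```java
import java.util.regex.Pattern;

// Hypothetical demo class illustrating the precedence of | versus ^ and $.
final class PrecedenceDemo {
    // Parsed as (^[A-Za-z])|(\d+$): each anchor binds to one branch only.
    static final Pattern UNGROUPED = Pattern.compile("^[A-Za-z]|\\d+$");
    // A non-capturing group makes both anchors apply to both branches.
    static final Pattern GROUPED = Pattern.compile("^(?:[A-Za-z]|\\d+)$");

    // find() looks for a match anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "10", "aa", "###1"}) {
            System.out.println(s + " ungrouped=" + found(UNGROUPED, s)
                    + " grouped=" + found(GROUPED, s));
        }
    }
}
```

With find(), the ungrouped pattern reports a hit for all four inputs, while the grouped one accepts only "a" and "10".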

  • True, I read a little further and found this restriction on brackets. As for the parentheses, I don’t believe they are mandatory in Java when using |. I was also able to solve the problem using the regex classes available in java.util. Thank you very much for your reply!

  • The interesting thing is that even with brackets, two digits were accepted when using the regex classes, so there must be some internal treatment. It looks like this (Pattern declarations): Pattern pattAlphaNum = Pattern.compile("[A-Z|a-z|\\d+]"); Pattern pattAlpha = Pattern.compile("[A-Z|a-z]"); Pattern pattNum = Pattern.compile("[\\d+]");

  • @Natanaelramos How are you using these Patterns? I tested them on ideone, see the result. P.S. Thank you for pointing out the issue of mandatory parentheses; I have updated my answer.

  • My implementation was a little different. I followed the steps presented in the code at <http://ocpsoft.org/opensource/guide-to-regular-expressions-in-java-part-1/>, if you can take a look. But there is still a problem: the String "2a", for example, is recognized by both pattNum and pattAlpha. Is this a restriction of the brackets? Even without them the result is still the same.

  • @Natanaelramos The purpose of find is to locate every match in a longer string. For example, if you use the pattern \\d+ on the string bla bla 10 bla 4 bla 123 bla, it will find 10 first, then 4, and finally 123. Since 2a contains both a number and a letter: 1) both patterns recognize it via find; 2) neither of them recognizes it via matches; 3) only pattNum recognizes it via lookingAt.
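The three Matcher methods discussed in the comments can be compared with a sketch along these lines. The patterns are simplified stand-ins for pattNum and pattAlpha (\\d+ and [A-Za-z]), and all names in the code are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing find, matches and lookingAt.
final class MatcherMethodsDemo {
    // Simplified stand-ins for the pattNum and pattAlpha patterns above.
    static final Pattern NUM = Pattern.compile("\\d+");
    static final Pattern ALPHA = Pattern.compile("[A-Za-z]");

    // Collects every match that successive find() calls locate.
    static List<String> findAll(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(findAll(NUM, "bla bla 10 bla 4 bla 123 bla"));

        String s = "2a";
        for (Pattern p : new Pattern[] {NUM, ALPHA}) {
            // A fresh matcher per call keeps the three checks independent.
            System.out.println(p.pattern()
                    + " find=" + p.matcher(s).find()            // match anywhere
                    + " matches=" + p.matcher(s).matches()      // whole input
                    + " lookingAt=" + p.matcher(s).lookingAt()); // prefix only
        }
    }
}
```

For "2a", find succeeds with both patterns, matches fails with both, and lookingAt succeeds only with the digit pattern, because the input starts with a digit.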
