Question about a Java regular expression

6

I have the following regular expressions. The first one validates words and works correctly. The problem is the second one, which is meant to validate a directory path, for example "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt". I can't get it to accept such a path. Can anyone help me?

public class validador {

    public boolean validarPalavra(String palavra) {
       Pattern p = Pattern.compile("[A-Z0-9a-z]*");
       Matcher retorno = p.matcher(palavra);
       return retorno.matches();
    }

    public boolean validarCaminho(String caminho) {
       Pattern p = Pattern.compile("//([a-zA-z0-9])+");
       Matcher retorno = p.matcher(caminho);
       return retorno.matches();
    }
}

2 answers

7


First, a Pattern is an expensive object to create. However, it is immutable and reusable, so it is best to create each one only once.

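Second, in the second expression you used [a-zA-z0-9]. The a-z covers the lowercase letters and the 0-9 covers the digits, but the A-z is wrong because the z should be uppercase. As written, A-z also accepts the characters that sit between Z and a in the character table: [, \, ], ^, _ and `. In addition, your pattern starts with //, so it only accepts strings that begin with two slashes followed by letters and digits, which no path like your example can satisfy. Even with the typo fixed, the regular expression has to be considerably more elaborate. One possible correct regular expression is: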

^(?:(?:[A-Z]\:)?\/)?(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*)(?:\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))*(?:\.[a-zA-Z0-9]+)?|[A-Z]\:\/?$
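As an aside on the A-z typo mentioned before the expression, here is a minimal sketch of its effect. The class name RangeDemo and its helper methods are hypothetical, introduced only for illustration; the two character classes are the one from the question and the corrected one.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original code.
class RangeDemo {
    // Character class as written in the question (upper bound is lowercase 'z').
    static final Pattern TYPO = Pattern.compile("[a-zA-z0-9]+");
    // Intended character class (upper bound is uppercase 'Z').
    static final Pattern FIXED = Pattern.compile("[a-zA-Z0-9]+");

    static boolean typoAccepts(String s) {
        return TYPO.matcher(s).matches();
    }

    static boolean fixedAccepts(String s) {
        return FIXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        // '^' (code 94) and '_' (code 95) lie between 'Z' (90) and 'a' (97),
        // so the range A-z silently includes them.
        System.out.println(typoAccepts("foo_bar"));  // true
        System.out.println(fixedAccepts("foo_bar")); // false
        System.out.println(typoAccepts("a^b"));      // true
        System.out.println(fixedAccepts("a^b"));     // false
    }
}
```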

Explanation of the path regular expression:

  • ^ - String start.

  • (?:(?:[A-Z]\:)?\/)? - Here are some things:

    • (?: ... ) - Groups without capturing. There are two such groups here.
    • [A-Z] - A capital letter.
    • \: - The character : after the capital letter.
    • (?:[A-Z]\:)? - The capital letter followed by : may or may not appear (because of ?).
    • \/ - The character /, which may be after the capital letter followed by : or at the very beginning of the string.
    • The last ? of (?:(?:[A-Z]\:)?\/)? - It means that the / or the capital letter followed by :/ may be omitted.

    That is, this part recognizes the prefix of the path. So, in paths such as C:/texto, /texto and plain texto, it is responsible for recognizing whatever precedes texto.

  • (?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*) - The name of a directory. Here we also have several things:

    • (?: ... ) - Two non-capturing groups.
    • [a-zA-Z0-9]+ - A word. It must have at least one letter or digit (because of the +).
    • (?: [a-zA-Z0-9]+)* - A space followed by a word. The * signals that it can occur zero or more times. Whenever a space occurs, a word must follow immediately.

    In this way, a directory name consists of one or more words separated by single spaces. Multiple consecutive spaces are not allowed, nor are spaces at the beginning or at the end of the name.

  • (?:\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))* - Here we have four things:

    • (?: ... ) - A non-capturing group.
    • \/ - The character /.
    • (?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*) - Same as previously shown. This is a directory name occurring just after the /.
    • * - Repeat the whole group as many times as necessary (including possibly zero times).

    That is, this part recognizes every /name segment after the first term, even when there are none.

  • (?:\.[a-zA-Z0-9]+)? - Again four things:

    • (?: ... ) - Another non-capturing group.
    • \. - The character . (a literal dot).
    • [a-zA-Z0-9]+ - A word after the dot. It must have at least one letter or digit. Note that no spaces are allowed here (this part is the file extension).
    • ? - The group may or may not appear.

    Therefore, this part recognizes the .extension at the end, which is optional.

  • |[A-Z]\:\/? - Everything before this recognizes a complete path. However, since the first word is mandatory in that part, paths such as C: and C:/ would not be recognized. Therefore, we have the | (which means that what follows is an alternative if what comes before fails), followed by a capital letter ([A-Z]), a : and an optional / (\/?).

  • $ - String end.
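To make the breakdown above easier to follow, here is a sketch that assembles the same expression from named pieces. The class name PathPatternParts and the constant names are hypothetical. Two simplifications are assumptions on my part: the ^ and $ are left out because matches() already requires the whole input to match, and / and : are written without a backslash because they are not special characters in Java regular expressions.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names below are illustrative only.
class PathPatternParts {
    // One word: at least one letter or digit.
    static final String WORD = "[a-zA-Z0-9]+";
    // A directory or file name: words separated by single spaces.
    static final String NAME = WORD + "(?: " + WORD + ")*";
    // Optional prefix: "/" or a drive letter followed by ":/".
    static final String PREFIX = "(?:(?:[A-Z]:)?/)?";
    // Zero or more further "/name" segments.
    static final String MORE_NAMES = "(?:/" + NAME + ")*";
    // Optional ".extension" at the end.
    static final String EXTENSION = "(?:\\." + WORD + ")?";
    // Alternative for a bare drive such as "C:" or "C:/".
    static final String DRIVE_ONLY = "[A-Z]:/?";

    static final Pattern PATH = Pattern.compile(
            PREFIX + NAME + MORE_NAMES + EXTENSION + "|" + DRIVE_ONLY);

    static boolean isValid(String path) {
        return PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("C:/home/Paulo Neto/texto.txt")); // true
        System.out.println(isValid("C:"));                           // true
        System.out.println(isValid("home//texto.txt"));              // false
    }
}
```

Building the string from constants keeps each piece next to its description, at the cost of a slightly longer class.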

It is also important to note that in Java the character \ is used in strings for escape sequences (such as \n for a line break). Since we want the \ character itself rather than an escape sequence, inside the string we have to write \\ to represent \. So, to build the \/, the \. and the \: of the regular expression, in source code we have to write \\/, \\. and \\:, as you can see in the code below:

import java.util.regex.Pattern;

public class Validador {

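    // Compiled once and reused: Pattern instances are immutable and thread-safe.
    private static final Pattern p1 = Pattern.compile("^[A-Z0-9a-z]*$");

    // Same expression as explained above, with each \ doubled for the Java string literal.
    private static final Pattern p2 =
            Pattern.compile("^(?:(?:[A-Z]\\:)?\\/)?(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*)(?:\\/(?:[a-zA-Z0-9]+(?: [a-zA-Z0-9]+)*))*(?:\\.[a-zA-Z0-9]+)?|[A-Z]\\:\\/?$");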

    public static boolean validarPalavra(String palavra) {
        return p1.matcher(palavra).matches();
    }

    public static boolean validarCaminho(String caminho) {
        return p2.matcher(caminho).matches();
    }
}
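On the doubled backslashes used in the class above, the following sketch shows the two levels of escaping at work. The class name EscapeDemo and its methods are hypothetical. It also illustrates that, in Java, escaping / is harmless but not required, since / is not a special character in a regular expression.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original code.
class EscapeDemo {
    // The literal "\\." is two characters: a backslash and a dot.
    static int escapedDotLength() {
        return "\\.".length();
    }

    // Regex a\.b : the dot is literal.
    static boolean escapedDotMatches(String s) {
        return Pattern.matches("a\\.b", s);
    }

    // Regex a.b : the dot matches any single character.
    static boolean plainDotMatches(String s) {
        return Pattern.matches("a.b", s);
    }

    public static void main(String[] args) {
        System.out.println(escapedDotLength());           // 2
        System.out.println(escapedDotMatches("a.b"));     // true
        System.out.println(escapedDotMatches("axb"));     // false
        System.out.println(plainDotMatches("axb"));       // true
        // Escaped and unescaped slash behave the same.
        System.out.println(Pattern.matches("\\/", "/"));  // true
        System.out.println(Pattern.matches("/", "/"));    // true
    }
}
```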

Finally, it is worth noting that your first validator uses [A-Z0-9a-z]* instead of [A-Z0-9a-z]+ (that is, * instead of +). This means that it also accepts an empty string. If that is not intentional, just change the * to +. I also added the ^ and the $ to it to mark the beginning and the end of the string.
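A small sketch of the difference between * and +, and of when the ^ and $ anchors matter. The class name QuantifierDemo is hypothetical. Since matches() already tests the whole input, the anchors make no difference there; they only change the result when find() is used instead.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original code.
class QuantifierDemo {
    static final Pattern STAR = Pattern.compile("[A-Z0-9a-z]*");
    static final Pattern PLUS = Pattern.compile("[A-Z0-9a-z]+");

    // Whole-input match with '*': zero characters are enough.
    static boolean starAccepts(String s) {
        return STAR.matcher(s).matches();
    }

    // Whole-input match with '+': at least one character is required.
    static boolean plusAccepts(String s) {
        return PLUS.matcher(s).matches();
    }

    // find() looks for a match anywhere in the input, not for a whole-input match.
    static boolean plusFinds(String s) {
        return PLUS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(starAccepts(""));     // true
        System.out.println(plusAccepts(""));     // false
        System.out.println(plusAccepts("abc!")); // false
        System.out.println(plusFinds("abc!"));   // true
    }
}
```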

Well, here are some tests:

public class Main {
    private static void testar(boolean resultado, String teste) {
        System.out.println(Validador.validarCaminho(teste) == resultado ? "Ok" : "ERRO");
    }

    public static void main(String[] args) {
        testar(true, "C:/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "C:/home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "/home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(true, "home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto.txt");
        testar(true, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto");
        testar(true, "home/Paulo Neto/Net Beans Projects/Expre/src/expre/texto");
        testar(true, "home");
        testar(true, "/home");
        testar(true, "C:/home");
        testar(true, "home.txt");
        testar(true, "/home.txt");
        testar(true, "C:/home.txt");
        testar(true, "C:");
        testar(true, "C:/");
        testar(false, "a:");
        testar(false, "a:/");
        testar(false, " home");
        testar(false, "home ");
        testar(false, "home/");
        testar(false, "home.");
        testar(false, ".txt");
        testar(false, "C:home");
        testar(false, "C:home/texto");
        testar(false, "home//texto.txt");
        testar(false, "ho  me/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.");
        testar(false, "home/PauloNeto/NetBeans#Projects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeans  Projects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto..txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.x.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt.");
        testar(false, " E:/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E :/home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E: /home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E:/ home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "E:/home /PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home /PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, " home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/ PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto /NetBeansProjects/Expre/src/expre/texto.txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto. txt");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt ");
        testar(false, "home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.t xt");
        testar(false, "E: ");
        testar(false, "E :");
        testar(false, " E:");
        testar(false, "E:/ ");
        testar(false, "E: /");
        testar(false, "E :/");
        testar(false, " E:/");
        testar(false, "");
        testar(false, " ");
        testar(false, "/");
        testar(false, ".");
        testar(false, ":");
    }
}

In all the tests the output was "Ok".

See it working on ideone.

  • Victor Stafusa, I understood what you meant. I did it my way and it still returns "false". I copied your code and it still returns false.

  • @Pauloneto I added tests and put it online on ideone. It's working. If it still goes wrong for you, which string are you trying?

  • I tested it the way you told me and it really worked, but in mine it won't. Here is my code posted on ideone: http://ideone.com/ckjKCv

  • @Pauloneto Strange. What are you typing that doesn't work?

  • @VictorStafusa I typed one of the examples from your code and it returns false.

  • @Pauloneto Which example? Did you type it without quotation marks? Did you use / or \\?

2

Do it like this:

public boolean validarCaminho(String caminho) {
       Pattern p = Pattern.compile("[a-zA-Z0-9\\.\\/]+");
       Matcher retorno = p.matcher(caminho);
       return retorno.matches();
}
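For comparison, here is a sketch of what this shorter expression accepts and rejects. The class name SimplePathDemo is hypothetical; the pattern is the one shown just above. Because it only checks that every character is a letter, a digit, a dot or a slash, it accepts the example path from the question, but it also accepts malformed input such as doubled slashes, and it rejects names with spaces and drive prefixes.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original code.
class SimplePathDemo {
    // Any non-empty run of letters, digits, dots and slashes.
    static final Pattern SIMPLE = Pattern.compile("[a-zA-Z0-9\\.\\/]+");

    static boolean accepts(String path) {
        return SIMPLE.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("home/PauloNeto/NetBeansProjects/Expre/src/expre/texto.txt")); // true
        System.out.println(accepts("home//texto..txt"));          // true
        System.out.println(accepts("/"));                         // true
        System.out.println(accepts("home/Paulo Neto/texto.txt")); // false (space)
        System.out.println(accepts("C:/home"));                   // false (colon)
    }
}
```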
