AND operator in regex

Viewed 593 times

6

I have the following date/time format:

25/01/2017às11:53:37

And the following regex:

REGX_DATAHORA_DISTRIBUICAO = "(?<data>\d{1,2}\/\d{1,2}\/\d{4})|(?<hora>\d{1,2}:\d{1,2}:\d{1,2})"

private OffsetDateTime getDataDistribuicao() {
    String textoData = replaceAndTrim(this.getPaginaInfoGerais().<HtmlTableCell>getFirstByXPath(XPATH_CEL_DATA_DISTRIBUICAO)
            .getTextContent());
    return LocalDateTime
            .parse(getDataDistribuicao(textoData),
                    DateTimeFormatter.ofPattern(PATTERN_DATA_HORA))
            .atOffset(ZoneOffset.UTC);
}

private String getDataDistribuicao(final String dataTexto)  {
    final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(dataTexto);
    if (matcherDataHora.find()) {
        return matcherDataHora.group();
    } else {
        throw new RegexException("Data distribuição", REGX_DATAHORA_MOVIMENTACAO.pattern(), dataTexto);
    }
}

The regex has 2 groups, but only one of them is returned: the date group. The time group comes back as null. I imagine it’s because of the operator...
I’ve tried using (?= (positive lookahead), but maybe I used it wrong.
What to do?

  • In which format do you want the output after the regex is applied?

  • An OffsetDateTime; I added the other method that is used.

  • https://ideone.com/S81daa

  • The reason is simple: you return before going through all the groups. With if (matcherDataHora.find()) { return matcherDataHora.group(); } you only check whether there was a match and return right at the first group.

  • @Guilhermelautert yes, that return was missing the concatenation. But even so, if you check group 2, it is null. I found this same situation on the English-language forum: even when the two groups are returned concatenated, group 2 is still null.

2 answers

2

It has 2 groups, but both groups return only the date part: 25/01/2017.

In fact your groups return different things:

  1. (?<data>\d{1,2}/\d{1,2}/\d{4}) - Returns: 25/01/2017

  2. (?<hora>\d{1,2}:\d{1,2}:\d{1,2}) - Returns: 11:53:37

I ran a test to confirm this; you can see it here.

I imagine it’s because of the operator

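I actually believe you are not getting the whole result because of how you use Matcher.group(). Called without arguments, it returns the text of the current match as a whole (group 0), not the contents of every capture group. With this alternation, the first find() matches only the date, so group() returns only 25/01/2017. You can read the documentation on its use here.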
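To make that behavior concrete, here is a minimal sketch, not the asker's actual code. It assumes the constant is compiled with Pattern.compile (the question does not show that line); the class name AlternationDemo and the method describe are made up for illustration. It prints what each successive find() sees for the sample input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    // Same expression as in the question, written as a Java string literal
    // (assumption: the original constant is built with Pattern.compile).
    static final Pattern DATA_OU_HORA = Pattern.compile(
            "(?<data>\\d{1,2}/\\d{1,2}/\\d{4})|(?<hora>\\d{1,2}:\\d{1,2}:\\d{1,2})");

    // Hypothetical helper: one line per successful find(), showing the
    // whole match (group 0) and both named groups.
    static String describe(String input) {
        Matcher m = DATA_OU_HORA.matcher(input);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append("match=").append(m.group())
               .append(" data=").append(m.group("data"))
               .append(" hora=").append(m.group("hora"))
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e0 is the letter "à", escaped to avoid source-encoding issues.
        System.out.print(describe("25/01/2017\u00e0s11:53:37"));
        // prints:
        // match=25/01/2017 data=25/01/2017 hora=null
        // match=11:53:37 data=null hora=11:53:37
    }
}
```

Each match satisfies only one side of the |, so the group on the other side is null for that match. Online testers that show both groups filled are typically listing all matches at once.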

What to do?

You can use:

private String getDataDistribuicao(final String dataTexto)  {
    final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(dataTexto);
    if (matcherDataHora.find()) {
        StringBuilder dataHora = new StringBuilder();
        dataHora.append(matcherDataHora.group("data")).append(matcherDataHora.group("hora"));
        return dataHora.toString();
    } else {
        throw new RegexException("Data distribuição", REGX_DATAHORA_MOVIMENTACAO.pattern(), dataTexto);
    }
}

If that doesn’t work, I suggest debugging the values returned by matcherDataHora.group("data") and matcherDataHora.group("hora"). If either of them is empty, check that the input value you posted here is correct, because the regex should capture this pattern.

  • Yes, I tested this expression in Rubular and it does return the two distinct groups. But in the debugger, groups 1 and 2 come back with the same date. The problem is not the concatenation; it is that both carry the same value, only the date. I added the other method that is used, to make it easier to understand. Thank you.
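If the alternation is kept, one way to get both parts is to keep calling find() and remember whichever named group is set on each match. This is only a sketch under assumptions: the pattern is compiled with Pattern.compile, IllegalArgumentException stands in for the poster's own RegexException (whose definition is not shown), and the names CollectGroups and extractDataHora are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CollectGroups {
    // Same alternation as in the question (assumed to be a compiled Pattern).
    static final Pattern DATA_OU_HORA = Pattern.compile(
            "(?<data>\\d{1,2}/\\d{1,2}/\\d{4})|(?<hora>\\d{1,2}:\\d{1,2}:\\d{1,2})");

    // Hypothetical helper: walks over every match and keeps the first
    // non-null value seen for each named group.
    static String extractDataHora(String input) {
        Matcher m = DATA_OU_HORA.matcher(input);
        String data = null;
        String hora = null;
        while (m.find()) {
            if (data == null && m.group("data") != null) {
                data = m.group("data");
            }
            if (hora == null && m.group("hora") != null) {
                hora = m.group("hora");
            }
        }
        if (data == null || hora == null) {
            // Stand-in for the RegexException used in the question.
            throw new IllegalArgumentException("Date or time not found in: " + input);
        }
        return data + " " + hora;
    }

    public static void main(String[] args) {
        // \u00e0 is the letter "à".
        System.out.println(extractDataHora("25/01/2017\u00e0s11:53:37"));
        // prints: 25/01/2017 11:53:37
    }
}
```

The single space used as separator is a choice made for this sketch; it has to agree with whatever PATTERN_DATA_HORA expects.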

  • Maybe I was wrong about the error being in the regex

  • When you debug and inspect groups 1 and 2, do they show the same content?

  • I find it odd that both return the same thing, since the second capture group would not accept the date format xx/xx/xxxx: it needs to match on ":"

  • Yes, group(0) and group(1) return the same content.

  • @Laryssa Ahhh, that’s different: group 0 is the full match, while group 1 is only what group 1 captured. So the problem is that group 2 is not matching.

  • Oh yes, I get it. Well, group(2) returns null. Strange, because I tested this regex in Rubular with the same text, 25/01/2017às11:53:37, and it matches both groups...

  • @Laryssa I tested it on RegexPlanet and it really does match. Shall we try to solve this via chat?

  • @Peace your fiddle link is dead :\

2


Well, from what I researched, the problem is that with the | (OR) operator each match in Java satisfies only one of the alternatives, so only one group is filled per match. So what I did was work around it, as I saw suggested on some forums:

private String getDataDistribuicao(final String dataTexto)  {
    String[] grupos = dataTexto.split("às");
    StringBuilder dataHora = new StringBuilder();
    for (String grupo : grupos) {
        final Matcher matcherDataHora = REGX_DATAHORA_DISTRIBUICAO.matcher(grupo);
        if (matcherDataHora.find()) {
            dataHora.append(" ").append(grupo);
        } else {
            throw new RegexException("Data distribuição", REGX_DATAHORA_DISTRIBUICAO.pattern(), dataTexto);
        }
    }
    return dataHora.toString();
}

  • very good, hadn’t thought about it +1
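For the "AND" the title asks about, a possible alternative is to drop the | and write the two groups in sequence, so both are part of the same match. The sketch below is not from the thread. It assumes the text between date and time contains no digits (such as "às"), and since PATTERN_DATA_HORA is not shown, it uses "d/M/uuuu H:m:s" as a stand-in. The names SequencePattern and parse are hypothetical, and IllegalArgumentException replaces the poster's RegexException.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SequencePattern {
    // Date, then any run of non-digits (e.g. "às"), then time.
    // No alternation, so one match fills both named groups.
    static final Pattern DATA_E_HORA = Pattern.compile(
            "(?<data>\\d{1,2}/\\d{1,2}/\\d{4})\\D*(?<hora>\\d{1,2}:\\d{1,2}:\\d{1,2})");

    // Assumed format; the real PATTERN_DATA_HORA is not shown in the question.
    static final DateTimeFormatter FORMATO = DateTimeFormatter.ofPattern("d/M/uuuu H:m:s");

    static OffsetDateTime parse(String texto) {
        Matcher m = DATA_E_HORA.matcher(texto);
        if (!m.find()) {
            // Stand-in for the RegexException used in the question.
            throw new IllegalArgumentException("Date and time not found in: " + texto);
        }
        String dataHora = m.group("data") + " " + m.group("hora");
        return LocalDateTime.parse(dataHora, FORMATO).atOffset(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        // \u00e0 is the letter "à".
        System.out.println(parse("25/01/2017\u00e0s11:53:37"));
        // prints: 2017-01-25T11:53:37Z
    }
}
```

With this form there is no need to split the input on "às" first, and a missing time makes the whole match fail instead of leaving one group null.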
