Catch multiple occurrences of an expression in parentheses

I’m building a truth table from an equation made of several propositions (q, p, r, s) and connectives (!, +, *, =).

My equation comes in the following format:

String equation = "!(q+p)*r=(s+r)";

The idea is to extract the contents of the parentheses with a method similar to split, which would give me an array with every parenthesized group in the string (there may be several, or none at all), something like:

equation[0] = (q+p)

equation[1] = (s+r)

With this, I think it would be simpler to remove the parentheses later and calculate the values of the propositions. If possible, when a group is negated (!), it would be nice for the result to include the negation as well:

equation[0] = !(q+p);

In case I need to remove the parentheses later, how would it look?

Ex:

String equation = (q+p)

equation[0] = q+p

1 answer

If you need to evaluate every possible kind of equation, regex is not the best solution. I would recommend using a parser built for the task because, as we will see below, regex is quite limited for this kind of thing.


If you only want to handle the simplest cases (groups delimited by a single pair of parentheses, with no nested parentheses), a solution (naive and limited, we’ll see why) would be:

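import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> equations = new ArrayList<>();
Matcher matcher = Pattern.compile("!?\\([^()]+\\)").matcher("!(q+p)*r=(s+r)");
while (matcher.find()) {
    equations.add(matcher.group()); // the whole match, including "!" and the parentheses
}
System.out.println(equations); // [!(q+p), (s+r)]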

The regex starts with an exclamation mark (!), and the ? right after it makes it optional. Then comes the opening parenthesis, which must be escaped with \ in a regex; inside a Java string the backslash itself is written as \\, so the parenthesis becomes \\(.

Then we have a negated character class: [^()]. It means "any character that is neither ( nor )" (what may be confusing here is that, inside brackets, the parentheses do not need to be escaped with \). The quantifier + means "one or more occurrences", that is, we can have several characters that are not parentheses.

Finally, we have the closing parenthesis: \\). That is, the regex matches an optional !, followed by (, one or more characters that are not parentheses, and ). I loop through the matches and save the results in a list. In the example above, the list contains two strings: "!(q+p)" and "(s+r)".


If you want to delete the parentheses, just use replaceAll:

for (String eq : equations) {
    System.out.println(eq.replaceAll("[()]", ""));
}

The regex [()] matches ( or ), and replaceAll replaces each one with an empty String (""), which is the same as removing them. The result is:

!q+p
s+r

Of course, for the first expression the result is wrong, because !q+p is not the same as !(q+p), so you have to check whether it makes sense to remove the parentheses.

One option would be something like if (!eq.startsWith("!")) { /* do the replace */ }, so that the parentheses are removed only if the expression does not start with !. It’s not clear how you want to treat each case, but once the expression is isolated, you can handle it however you need.
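The startsWith check can be sketched as below. This is only an illustration: the class StripParens and the method stripIfSafe are names I made up, and leaving negated groups untouched is just one possible policy, since the question does not say how they should be treated.

```java
import java.util.Arrays;
import java.util.List;

class StripParens {
    // Removes the parentheses only when the group is not negated,
    // because !q+p does not mean the same thing as !(q+p).
    // Assumption: a negated group is simply returned unchanged.
    static String stripIfSafe(String eq) {
        if (eq.startsWith("!")) {
            return eq;
        }
        return eq.replaceAll("[()]", "");
    }

    public static void main(String[] args) {
        List<String> equations = Arrays.asList("!(q+p)", "(s+r)");
        for (String eq : equations) {
            System.out.println(stripIfSafe(eq));
        }
        // prints:
        // !(q+p)
        // s+r
    }
}
```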


As I said, the regex above is quite naive and error-prone. It simply takes everything in parentheses, without validating whether it is actually a valid expression. So if you have things like ( ) or (.$@#%), the regex will match them too. It only works if the expression is well formed. A regex that validates any possible expression is not worth the effort, because there are too many possibilities.
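To make this limitation concrete, here is a small self-contained sketch that runs the same pattern against malformed input. The class NaiveMatchDemo, the method findGroups and the test string are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveMatchDemo {
    // Same pattern as in the answer: optional "!", "(", non-parenthesis characters, ")".
    private static final Pattern GROUP = Pattern.compile("!?\\([^()]+\\)");

    static List<String> findGroups(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = GROUP.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The first two groups are not valid expressions, yet both are matched.
        System.out.println(findGroups("( )*(.$@#%)=(q+p)")); // [( ), (.$@#%), (q+p)]
    }
}
```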

You could even assume some simplifying premises, such as "each proposition is always a single letter", "there are no spaces or line breaks", "there will be no 'strange' characters, such as @#%", etc., but there are still other problems.

For example, if you have nested parentheses (like (q*(s+r))), it does not work either, because we assumed that there are no other parentheses inside the parentheses. For these cases it would even be possible to use recursive regex, but Java does not support this feature, and in any case it is not the best approach.
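If all you need is to isolate the outermost groups even when they contain nested parentheses, one regex-free alternative is to scan the string while counting the nesting depth. The sketch below is my own suggestion rather than something the question asked for. The names TopLevelGroups and extract are hypothetical, and it assumes a ! belongs to a group only when it comes immediately before the opening parenthesis.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelGroups {
    // Returns every outermost parenthesized group, keeping a leading '!' if present.
    static List<String> extract(String s) {
        List<String> groups = new ArrayList<>();
        int depth = 0;   // how many parentheses are currently open
        int start = -1;  // index where the current outermost group begins
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                if (depth == 0) {
                    // include the negation sign right before the group, if any
                    start = (i > 0 && s.charAt(i - 1) == '!') ? i - 1 : i;
                }
                depth++;
            } else if (c == ')') {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unmatched ')' at index " + i);
                }
                depth--;
                if (depth == 0) {
                    groups.add(s.substring(start, i + 1));
                }
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unmatched '('");
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(extract("!(q+p)*r=(s+r)"));       // [!(q+p), (s+r)]
        System.out.println(extract("(q*(s+r))=!(p+(q*r))")); // [(q*(s+r)), !(p+(q*r))]
    }
}
```

Like the regex, this only checks that the parentheses are balanced. It does not validate what is inside them.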

The regex would be very complicated, and on top of that it would only check whether the expression is well formed; it would not evaluate the result. For that you would need a dedicated parser anyway, so why not use one from the beginning? Regex is cool, but it is not always the best solution.
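To give an idea of what such a parser could look like, here is a minimal recursive-descent sketch. It rests on several assumptions that the question does not confirm: ! is negation, * is AND, + is OR, = is equivalence, precedence from highest to lowest is !, *, +, =, each proposition is a single letter, and there are no spaces. The class PropParser and its methods are names I made up.

```java
import java.util.HashMap;
import java.util.Map;

class PropParser {
    private final String src;
    private final Map<Character, Boolean> vars;
    private int pos = 0;

    PropParser(String src, Map<Character, Boolean> vars) {
        this.src = src;
        this.vars = vars;
    }

    // Evaluates the whole expression for one assignment of truth values.
    static boolean evaluate(String src, Map<Character, Boolean> vars) {
        PropParser p = new PropParser(src, vars);
        boolean result = p.equivalence();
        if (p.pos != src.length()) {
            throw new IllegalArgumentException("Unexpected character at index " + p.pos);
        }
        return result;
    }

    // equivalence := disjunction ('=' disjunction)*
    private boolean equivalence() {
        boolean left = disjunction();
        while (accept('=')) {
            boolean right = disjunction();
            left = (left == right);
        }
        return left;
    }

    // disjunction := conjunction ('+' conjunction)*   (assumed: '+' is OR)
    private boolean disjunction() {
        boolean left = conjunction();
        while (accept('+')) {
            boolean right = conjunction(); // parse first, so input is always consumed
            left = left || right;
        }
        return left;
    }

    // conjunction := unary ('*' unary)*   (assumed: '*' is AND)
    private boolean conjunction() {
        boolean left = unary();
        while (accept('*')) {
            boolean right = unary();
            left = left && right;
        }
        return left;
    }

    // unary := '!' unary | '(' equivalence ')' | letter
    private boolean unary() {
        if (accept('!')) {
            return !unary();
        }
        if (accept('(')) {
            boolean inner = equivalence();
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at index " + pos);
            }
            return inner;
        }
        if (pos < src.length() && vars.containsKey(src.charAt(pos))) {
            return vars.get(src.charAt(pos++));
        }
        throw new IllegalArgumentException("Unexpected input at index " + pos);
    }

    // Consumes the next character if it equals c.
    private boolean accept(char c) {
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Character, Boolean> vars = new HashMap<>();
        vars.put('q', true);
        vars.put('p', false);
        vars.put('r', true);
        vars.put('s', false);
        System.out.println(evaluate("!(q+p)*r=(s+r)", vars)); // false
    }
}
```

To build the truth table, you would call evaluate once for each combination of truth values. Nested parentheses need no special handling, because unary calls back into equivalence.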
