How to create a regex equivalent to a BNF?

Asked

Viewed 137 times

4

I need to create a regular expression that is equivalent to the following BNF:

<SEQ> ::= <DIG><SEQ> | <DIG>
<DIG> ::= 0|1

I tried to create a Python code with the regex library:

import re

reg = r'[0-2]'

print(reg)

It was a test, but I couldn’t actually create a regex that was equivalent to BNF.

  • And you are using which string to try to search for occurrences using the regular expression in question?

  • The regex would have to be [0-2], but depending on what you need may not be enough. Not to mention that you only created a string but are not actually using regex. If you can [Dit] the question and put more details (for example, which text you want to extract the numbers from, etc)

  • I didn’t get to include string, actually I’m very doubtful on this topic in question. If you can help me.

  • "but I couldn’t get the numerical sequence to show", what sequence?

  • I already changed the question, it would be a sequence of 0 to 2.

  • You’re still confused. Capturing a sequence of numbers is one thing (it sounds like you’re going to extract it from somewhere else). Displaying the sequence is something else. What exactly the program should do?

  • I changed the question, I need to create a program that displays a numerical sequence according to BNF: <SEQ> ::= <DIG><SEQ> | <DIG> <DIG> ::= 0|1

  • 1

    To know the true functioning of regular expressions Read this.

Show 3 more comments

1 answer

2


If the idea is to generate a regex equivalent to this BNF, then it would be something like this:

import re

r = re.compile('^[01]+$')

print(r.match('010100111')) # retorna um objeto Match
print(r.match('010120111')) # None

The regex uses the markers ^ and $ which indicate respectively the start and end of the string.

Then we have the character class [01] which means "one digit 0 or 1" and the quantifier + which indicates one or more occurrences. Thus, regex recognizes a sequence of zeros and ones of minimum size 1 and no maximum limit (exactly the same as BNF represents).

In the example above, we can see that if the string only has zeros and ones, a match, and if you have any different character, you find nothing.

Note: I could even use [0-1] instead of [01], but in this case it is the same ("a digit from 0 to 1" and "the digit 0 or 1" is basically the same thing).


Before the editing and the comments, it was not clear that the goal was to generate the above regex. Anyway, I will keep the version that tried to respond to original question.

If given a BNF, you want to create a sequence that matches the definition, then regex nay is the right tool for the task.

In case, you would have to implement at hand, or use some ready lib. An alternative is to use the NLTK. As I don’t use this kind of thing much, I ended up taking an example from here and adapting to your case:

from nltk import CFG
from nltk.parse.generate import generate

# cria a gramática a partir da BNF
grammar = CFG.fromstring(""" SEQ -> DIG SEQ | DIG
    DIG -> "0" | "1" """)

# gera várias "palavras" válidas da gramática
for production in generate(grammar, depth=5):
    print(' '.join(production))

In the above case, the exit was:

0 0 0
0 0 1
0 1 0
0 1 1
0 0
0 1
1 0 0
1 0 1
1 1 0
1 1 1
1 0
1 1
0
1

From there you can read the documentation (here and here) to generate exactly what you need.

For the grammar in question generates a sequence potentially infinite of zeros and ones (and not "a sequence from 0 to 2", as originally said), so it is unclear whether you need to generate any sequence whatever is valid, or some specific, or all of certain size, etc.

But the general idea is probably this...

  • Thank you so much for your help. I would need in case create a sequence with the regex library so that it represents the BNF I showed above, not exactly solve it, understand? But you’ve helped me enough.

  • @Camillamarques I updated the answer

  • Thank you so much! Now I understand better.

  • I found this playlist here on youtube tbm, very good: https://www.youtube.com/watch?v=wBI0yv2FG6U&list=PLbIBj8vQhvm1VnTa2Np5vDzCxVtyaYLMr&index=1

  • I have another question regarding this same theme, I needed help to create a regex equivalent to these two BNF’s: <list> ::= <element><list> | <element> <element> ::= <Letter><Digit> <Letter> ::= A | B <Digit> ::= 1|2 ________________________________________________________________________ <Exp> ::= <Exp> + <Exp> | <term> <term> ::= <term> * <term> | <factor> <factor> ::= ( <Exp> ) | <Digit> <Digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

  • 1

    @Camillamarques So it’s the case of open another question. The idea of the site is to have a question by specific problem - and in addition, a new question is visible on the main page for everyone to see (here in the comments less people will see, probably just me, which decreases the chances of having an answer) and so the site also gets more organized :-)

  • 1

    Done : https://answall.com/questions/482315/como-usar-regex-para-representa-uma-bnf-maiscomplex

Show 2 more comments

Browser other questions tagged

You are not signed in. Login or sign up in order to post.