Regular expression to recognize language: words that do not contain "bbab"

Asked

Viewed 1,194 times

23

The @LINQ challenged me to write a regular expression to recognize the following language L:

Linguagem sobre o alfabeto desejado {a,b,c,d}, qualquer palavra que não contenha a subexpressão "bbab"

I was able to assemble, on this premise, the following finite state machine:

Máquina de estados finito para reconhecer a palavra que não contém "bbab"

the Miguel Angelo detected a typo in the DFA originally described. The transition qa ->q0 is in the case [^b], according to the new figure; in the old figure with [^a] was wrong

And also the following regular grammar (tail recursion):

L(G) = L, gramática sobre {a,b,c,d} que gera a L do desafio

I already have this information in hand, but I’m not able to write a regular expression to detect L. So I have the following questions:

  • What is the regular expression (mathematically speaking) you recognize L?
  • Is it true that regex is computationally equivalent to regular grammars? Or regular grammars is a superset of their own?
  • What algorithm can I use to turn my AFD into regex?
  • What algorithm can I use to turn my deterministic regular grammar into regex?
  • @Guilhermelautert AFD: deterministic finite automaton

  • @Guilhermelautert contain bbab must return false, so grammar cannot produce this subpalavra at all. Look at the state machine which, when that word is produced, goes in the state beyond salvation damnation, from which I could not go to a state of acceptance

  • @Guilhermelautert I may have made a mistake in AFD or grammar, but it was my attempt =] I can revise it again, but I don’t think I’ll be able to find my mistake myself if I did

  • 1
  • If the input is infinite there will never be an end, and the only alternative is eventually to fail. If the input is finite, any state at the end of the input must go to the acceptance.

  • @Miguelangelo (aba)* ends in state of acceptance

  • I didn’t understand. I was referring to the state machine... type, any state when reading $ (input order) goes to the accepting state.

  • 1

    In the last part of the machine, in the qa, because you’re going back to the beginning by reading [^a]? Shouldn’t be [^b]?

  • @Miguelangelo yes, in this case, AFD modeling did not use the end of the line $, but if it had that mark the states q0, qb1, qb2 and qa would go to the state of acceptance after reading $

  • @Miguelangelo typo, I will correct the picture as soon as I have access to a computer

  • 2

    I managed to get that expression from your AFD: ^($|[^b]|b($|[^b]|b(b)*($|[^ab]|a($|[^b]))))*$... is the best I could do! = D

  • @Miguelangelo I imagined that it would be really great, I will analyze with affection =]

  • But a general algorithm, I couldn’t say, because I had to use $ to save me. It’s like a Goto das regex, I couldn’t find another way to suddenly end the loop created by the operator *.

  • Beautiful float, that sinister question. + 1 just because I didn’t understand, rs!

Show 9 more comments

1 answer

18


Good let’s see that the Language L accepts following the premise:

inserir a descrição da imagem aqui

  • {a,b,c,d} - so we have that the only characters present in the language are a,b,c d,
  • * - all possible words that can be formed by these characters

But there’s a rule :

  • p does not contains subsequence bbab - Namely the word cannot contain the sequence bbab

Both state machines and regular grammar are correct, however when you mount the REGEX you can keep in mind that just giving match in bbab already makes the sequence invalid.

inserir a descrição da imagem aqui

Using the Jflap I assembled the finite automaton in which we can analyze it better

inserir a descrição da imagem aqui

Regular Grammar

If the language you are using already has the alternatives look-Ahead, a validation by REGEX becomes simple.

(?=^[abcd]*$)(?!^.*bbab.*$).*
  • (?=^[abcd]*$) - will accept only the characters involved
  • (?!^.*bbab.*$) - is in any part of the current sentence bbab will refuse the match.

See more in debug

Now that you’re riding in pure REGEX, which you don’t own look-Ahead it becomes a little complex to assemble the regex, because the process is based on acceptance and not denial or your sentence should start and go beating all possible states until you reach the end of the sentence, if any state is not fulfilled then does not generate match.

Following the REGEX proposal for @Miguel:

^($|[^b]|b($|[^b]|b(b)*($|[^ab]|a($|[^b]))))*$ 

This is based on giving match positive if based on start(^) and end($) of the sentence.

Issues

  • Is it true that regex is computationally equivalent to regular grammars? Or regular grammars is a superset of their own?

In fact they are related, because they can be generated by an AFD (Toutopian Fbeginnings Deterministic), however, as you can see in the image below, they do not have a direct relation.

inserir a descrição da imagem aqui

Source : Regular Expressions and Regular Grammars

If you have an ER or GR and convert to Afε - which is not too complicated since the ε(empty motion) lets you jump from one transition to another without problems - doesn’t mean you’ll be able to complete the other conversions AFε -> AFN -> AFD since at each step you are restricting rules and depending on the complexity it will not be possible to perform the conversion.

In some sources I found it is said that REGEX is based on RLG (Grheumatic Linear a Direita(Right)) or LLG (Grheumatic Linear a Esquerda (LEft)), usually RLG. link
But in fact it is implicit, as these conversions will be possible.

  • What algorithm can I use to turn my AFD into regex?

First of all convert it to an SMR, and after applying this rule :

inserir a descrição da imagem aqui

Original : Steps to Convert regular Expressions directly to regular grammars and vice versa

  • What is the regular expression (mathematically speaking) you recognize L?

I’m gonna owe you.

Addendum

Deterministic and Nondeterministic Finite Automata

  • 1

    With symbols of beginning and end I believe that it is valid mathematically too. For me it met me perfectly

Browser other questions tagged

You are not signed in. Login or sign up in order to post.