I have this expression:
(?:[ \t]*[a-z][)]\s*)?([^\r\n<]+(?:(?:\r?\n(?!\s*[a-z][)])|<(?!br\s*\/?>(?:\s*<br\s*\/?>)*\s*(?:\s+[a-z][)]|\s*$)))[^\r\n<]*)*)(?:<br\s*\/?>\s*)*
It matches this text and removes the letters a), b), c), d), e), and the <br> tags only at the end:
<strong>Preencha</strong> a lacuna e assinale a alternativa correta. <br /><br />
I - capacitação técnico-profissional: Comprovação do licitante de possuir em seu quadro permanente, na data prevista para entrega da proposta, ________________, detentor de atestado de responsabilidade técnica por execução de obra ou serviço de características semelhantes, limitadas estas exclusivamente às parcelas de maior relevância e valor significativo do objeto da licitação, vedadas as exigências de quantidades mínimas ou prazos máximos (Lei 8.666/1993 Art N° 30).<br />
<br />
a)<strong>profissional</strong> de nível superior<br />
b)profissional de nível superior ou outro devidamente reconhecido pela entidade competente<br />
c)profissional capacitado<br />
d)profissional de nível minimamente técnico<br />
e)profissional especializado no objeto da <strong>licitação</strong>
Currently the result looks like this:
strong>Preencha</strong> a lacuna e assinale a alternativa correta.<br /><br />
I - capacitação técnico-profissional: Comprovação do licitante de possuir em seu quadro permanente, na data prevista para entrega da proposta, ________________, detentor de atestado de responsabilidade técnica por execução de obra ou serviço de características semelhantes, limitadas estas exclusivamente às parcelas de maior relevância e valor significativo do objeto da licitação, vedadas as exigências de quantidades mínimas ou prazos máximos (Lei 8.666/1993 Art N° 30).
a)<strong>profissional</strong> de nível superior
profissional de nível superior ou outro devidamente reconhecido pela entidade competente
profissional capacitado
profissional de nível minimamente técnico
profissional especializado no objeto da <strong>licitação</strong>
It can be seen here: https://regex101.com/r/MDstG4/4
But as the link shows, it fails to match correctly when a formatting tag is inserted at the beginning of the question or at the beginning of an answer. Note how the <strong> at the start of the question is cut off, and how the letter a) is included in the first answer. It should come out clean, like the other answers.
Keep in mind that I am extracting the question and each answer separately, to insert each one into its own field in the database.
What I am trying to do:
- Take everything up to a) and delete the <br> tags only at the end.
- Take each of a), b), ... up to the next letter and delete the <br> tags only at the end.
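The two rules above can also be expressed as "split at the letter markers, then trim trailing <br> tags from each piece". The following is only a sketch of that idea in Python, not ASP, so it illustrates the logic rather than being drop-in code. The names `split_question`, `MARKER` and `TRAILING_BR` are made up for the example, and it assumes each option starts on its own line, as in the sample text.

```python
import re

# Assumption: an option marker is one letter plus ")" at the start of a line.
MARKER = re.compile(r"^[ \t]*[a-z]\)[ \t]*", re.IGNORECASE | re.MULTILINE)
# A run of <br> tags (and whitespace) sitting at the very end of a piece.
TRAILING_BR = re.compile(r"(?:\s*<br\s*/?>)+\s*$", re.IGNORECASE)


def split_question(html):
    """Return (question, [answers]) with trailing <br> tags removed."""
    # The text before the first marker is the question; each later
    # piece is one answer with its letter already stripped.
    pieces = MARKER.split(html)
    cleaned = [TRAILING_BR.sub("", piece).strip() for piece in pieces]
    return cleaned[0], cleaned[1:]


sample = (
    "<strong>Preencha</strong> a lacuna. <br /><br />\n"
    "I - texto (Lei 8.666/1993 Art N° 30).<br />\n"
    "<br />\n"
    "a)<strong>profissional</strong> de nível superior<br />\n"
    "b)profissional capacitado<br />\n"
    "c)profissional especializado no objeto da <strong>licitação</strong>"
)

question, answers = split_question(sample)
print(question)
for answer in answers:
    print(answer)
```

Because the pieces are cut at the markers instead of being matched character by character, a tag such as <strong> at the start of the question or of an answer is kept intact, and <br> tags in the middle of the question survive.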
ASP code. I am doing it this way because sometimes there are 4 answers and sometimes 5.
questao=Request.Form("editor")
Set re = New RegExp ' RegEx object
re.Global = true
re.IgnoreCase = true
re.Pattern = "(?:[ \t]*[a-z][)]\s*)?([^\r\n<]+(?:(?:\r?\n(?!\s*[a-z][)])|<(?!br\s*\/?>(?:\s*<br\s*\/?>)*\s*(?:\s+[a-z][)]|\s*$)))[^\r\n<]*)*)(?:<br\s*\/?>\s*)*"    
Set matches = re.Execute(questao)
If matches.Count > 0 Then
    ' matches(0) is the question; the remaining matches are the answers.
    ' 4 answers
    If (matches.Count - 1) = 4 Then
        pergunta = matches(0).SubMatches(0)
        resposta_a = matches(1).SubMatches(0)
        resposta_b = matches(2).SubMatches(0)
        resposta_c = matches(3).SubMatches(0)
        resposta_d = matches(4).SubMatches(0)
    End If
    ' 5 answers
    If (matches.Count - 1) = 5 Then
        pergunta = matches(0).SubMatches(0)
        resposta_a = matches(1).SubMatches(0)
        resposta_b = matches(2).SubMatches(0)
        resposta_c = matches(3).SubMatches(0)
        resposta_d = matches(4).SubMatches(0)
        resposta_e = matches(5).SubMatches(0)
    End If
End If
Set matches = Nothing
Set re = Nothing
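The 4-answer and 5-answer branches differ only in the last assignment, so they could be collapsed by pairing each answer with its letter by position. A minimal sketch of that idea, again in Python for illustration only: `label_answers` is a hypothetical helper, and the `resposta_` keys simply mirror the variable names used in the ASP code.

```python
from string import ascii_lowercase


def label_answers(answers):
    """Map answers to resposta_a, resposta_b, ... in order."""
    # zip stops at the shorter sequence, so 4 answers give keys a-d
    # and 5 answers give keys a-e without separate branches.
    return {
        "resposta_" + letter: text
        for letter, text in zip(ascii_lowercase, answers)
    }


four = label_answers(["w", "x", "y", "z"])
five = label_answers(["v", "w", "x", "y", "z"])
print(sorted(four))
print(sorted(five))
```

The caller can still check the number of answers before saving if only 4 or 5 are acceptable.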
In my opinion, although this does not directly solve the problem, you are approaching it the wrong way. If you are the one building this HTML, add extra tags to make the elements easier to capture. For parsing specific HTML, especially in more elaborate cases like yours, regex is generally not the right tool. HTML parsers exist precisely for this reason.
– Isac
It's not me who builds it; it's users who know nothing about this.
– Rod
In fact, I'm almost at the solution; only these details that I can't get right are missing.
– Rod
How about this? https://regex101.com/r/Xoyimh/1
– Marcelo Shiniti Uchimura
@Marcelouchimura it seems perfect, but the match count (matches.Count) does not work with it in ASP. I will add the code to the question so you can see.
– Rod
@Rod Does this regex work for you?
(?![a-z]?\)).+(?=<br\s*\/)
And the demo. I saw that ASP is very similar to VBA; if necessary I can create a VBA example, because I do not know ASP.
– danieltakeshi
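To see what this pattern captures, it can be run over a small multi-line sample. The sketch below uses Python's `re` module as a stand-in, so the behavior in ASP may differ in details. The sample text is invented for the example.

```python
import re

# The pattern from the comment above, unchanged.
PATTERN = re.compile(r"(?![a-z]?\)).+(?=<br\s*\/)", re.IGNORECASE)

text = (
    "Pergunta <br /><br />\n"
    "segunda linha<br />\n"
    "<br />\n"
    "a)<strong>um</strong><br />\n"
    "b)dois"
)

# "." does not cross line breaks here, so every line is matched on its own.
found = PATTERN.findall(text)
for item in found:
    print(repr(item))
```

On this sample, Python returns three matches: `'Pergunta <br />'`, `'segunda linha'` and `'<strong>um</strong>'`. The two-line question comes back as two separate matches, the first of which still contains a <br> tag, and the final option is not matched at all because no <br> follows it.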
@danieltakeshi looks perfect, I’ll test it. Thanks anyway.
– Rod
I see here that the question produces two matches. It would have to be a single match, regardless of how many line breaks it contains.
– Rod
Only the question? Can the lettered options be separate matches?
– danieltakeshi
@danieltakeshi Yes, one match for the question, regardless of how many <br> tags it has in the middle, excluding only the ones at the end, and one match for each letter, which can also have <br> tags in the middle but not at the end. That way I can take each one separately and store it in the database. Thanks for answering, my friend.
– Rod