Good afternoon, everyone.
I am writing a regular expression to extract information from a text, and I want to capture just one paragraph at a time. Each paragraph follows the same pattern: it always starts with "Process XXX" and ends with a date.
The regular expression I’m using is as follows:
Processo\s\d{3,3}.*(\n.*)*\d{2,2}\/\d{2,2}\/\d{4,4}
The problem is that the match only ends at the date of the last block. I would like each match to cover a single complete paragraph.
Example of a text to be extracted:
Process 001
Included
on the agenda for 01/03/2016, at 08:30. They are aware
the lawyers to whom the judgment
shall take place at the trial itself in accordance with
of Art 47 of the Internal Regiment of the Class Recursal.
02/03/2015
Process 001
Included in agenda for 01/03/2016, at 08:30. Stay
aware that the judgment
shall take place at the trial itself in accordance with
of Art 47 of the Internal Regiment of the Class Recursal.
02/03/2015
When I run the expression on this text, everything is selected.
Your regex does not work. Post the input text and the data you want to extract, and I will help you. From what I can see, you will have to use the punctuation to extract the message!
Processo\s*\d{3,3}.*?\d{2,2}/\d{2,2}/\d{4,4}.*?\.(.*)?\.
– David Schrammel
David, it’s really not working. Here’s the expression: Processo\s\d{3,3}.*(\n.*)*\d{2,2}\/\d{2,2}\/\d{4,4}
– helciodasilva
Post all the text you want to analyze and the results you need to get. Your text has two paragraphs; do you want them separated? In any case, please add more detail.
– David Schrammel
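One possible reading of the problem: the group `(\n.*)*` in the question's expression is greedy, so it consumes as many lines as it can and only backs off to the last date in the text. That would explain why everything is selected. Below is a minimal sketch in Python of a lazy alternative. It assumes that the closing date always sits alone on its own line (as in the sample), which is what distinguishes it from the date in the middle of the paragraph. It also uses the literal `Process` to match the sample text; replace it with `Processo` if that is what the real input contains. The names `text`, `pattern`, and `blocks` are only illustrative.

```python
import re

# Sample text copied from the question.
text = """Process 001
Included
on the agenda for 01/03/2016, at 08:30. They are aware
the lawyers to whom the judgment
shall take place at the trial itself in accordance with
of Art 47 of the Internal Regiment of the Class Recursal.
02/03/2015
Process 001
Included in agenda for 01/03/2016, at 08:30. Stay
aware that the judgment
shall take place at the trial itself in accordance with
of Art 47 of the Internal Regiment of the Class Recursal.
02/03/2015"""

# Assumption: the closing date is the only content on its line.
# re.DOTALL    lets "." cross line breaks.
# re.MULTILINE lets "^" and "$" match at the start and end of each line.
# ".*?" is lazy, so each match stops at the first line that is only a date.
pattern = re.compile(
    r"^Process\s\d{3}.*?^\d{2}/\d{2}/\d{4}$",
    re.DOTALL | re.MULTILINE,
)

# No capturing groups, so findall returns each full match as a string.
blocks = pattern.findall(text)

for number, block in enumerate(blocks, start=1):
    print(f"--- paragraph {number} ---")
    print(block)
```

Anchoring the date with `^` and `$` matters here: a lazy quantifier on its own would stop at `01/03/2016` inside the paragraph, because that is the first date it encounters. If the real data can end a paragraph with a date in the middle of a line, this sketch would need a different stopping condition.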