The problem is that after "Quadro" and the number there is a –. But your regex uses [\w\s], where \w is a shortcut for an alphanumeric character (a letter, digit or _) and \s matches spaces and line breaks. Neither of them matches the –, so the regex can't find a match.
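A quick check in Node makes this visible. This is only a sketch: the sample line is made up for illustration (the real file contents are not shown here), and the strict pattern is an assumption about how [\w\s] might have been used.

```javascript
// Hypothetical sample line; the character after the number is an en dash (U+2013).
const sampleLine = "Quadro 30– Vendas";

// \w covers ASCII letters, digits and _; \s covers whitespace. Neither includes "–".
const dashMatchesClass = /[\w\s]/.test("–");
console.log(dashMatchesClass); // false

// A pattern that allows only [\w\s] after the number cannot get past the dash.
const strictMatch = sampleLine.match(/^quadro \d+[\w\s]+$/i);
console.log(strictMatch); // null
```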
If the idea is to match "anything", including line breaks, one alternative is:
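const fs = require("fs");
fs.readFile("return.txt", "utf-8", (err, data) => {
  if (err) return console.error(err); // data is undefined if the read fails
  const boardContent = data.match(/^quadro \d+[\s\S]*?^fonte:.*$/gim);
  console.log(boardContent); // array of matched blocks, or null if none
});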
Since I used the m flag, the markers ^ and $, which normally indicate only the beginning and end of the string, also indicate the beginning and end of each line. I did this to ensure that I get the lines that start with "Quadro" and "Fonte:". The i flag makes the match case-insensitive, which is why the regex can be written in lowercase.
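To see what the m flag changes, here is a minimal sketch on a made-up two-line string (the text is invented for illustration):

```javascript
// Made-up two-line text, just to show the effect of the m flag.
const twoLines = "Quadro 1– A\nFonte: X";

// Without m, ^ only matches at the very start of the string.
const withoutM = twoLines.match(/^fonte:.*$/i);
console.log(withoutM); // null

// With m, ^ and $ also match at the start and end of each line.
const withM = twoLines.match(/^fonte:.*$/im);
console.log(withM[0]); // Fonte: X
```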
Between those two line starts I use [\s\S], which is basically \s (spaces and line breaks) combined with \S (everything that is not \s). That is, it matches any character. The lazy quantifier *? ensures that I match as few characters as possible, so the match stops at the first line that starts with "Fonte:".
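The difference between the lazy *? and the greedy * can be sketched like this. The two-block text is made up for illustration, and the greedy variant is shown only for comparison:

```javascript
// Made-up text with two blocks, to compare greedy * with lazy *?.
const twoBlocks = [
  "Quadro 1– A",
  "linha a",
  "Fonte: X",
  "Quadro 2– B",
  "linha b",
  "Fonte: Y",
].join("\n");

// Lazy: each match ends at the first "Fonte:" line after its "Quadro" line.
const lazy = twoBlocks.match(/^quadro \d+[\s\S]*?^fonte:.*$/gim);
console.log(lazy.length); // 2

// Greedy: [\s\S]* runs to the end and backtracks to the LAST "Fonte:" line,
// so both blocks are swallowed into a single match.
const greedy = twoBlocks.match(/^quadro \d+[\s\S]*^fonte:.*$/gim);
console.log(greedy.length); // 1
```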
But you said you would get the board number and extract only the contents of that board. In that case, you can capture the number with the regex and add the match to the results only if it is the number you want. For example, if I only want Quadro 30:
fs.readFile("return.txt", "utf-8", (err, data) => {
  if (err) return console.error(err); // data is undefined if the read fails
  const boardContent = [];
  for (const match of data.matchAll(/^quadro (\d+)[\s\S]*?^fonte:.*$/gim)) {
    const numeroQuadro = parseInt(match[1], 10);
    if (numeroQuadro === 30) { // I only want board 30 (put whatever condition you like here)
      boardContent.push(match[0]);
    }
  }
  console.log(boardContent);
});
Now the \d+ is in parentheses to form a capture group. With this I can get its contents with match[1]. If it is the number I want, I add the match to the results (using match[0], which contains the whole string matched by the regex).
But of course you can also do it without regex. Since you implied that the file is large, it might be better to read it one line at a time, instead of loading it all into memory at once:
- if the line starts with "Quadro [board number]", start a record
- keep concatenating lines until you find one that starts with "Fonte:"
Something like this:
const fs = require('fs');
const readline = require('readline');
const lineReader = readline.createInterface({
  input: fs.createReadStream('return.txt', { encoding: 'utf-8' })
});
const contents = [];
let current = '';
const numeroQuadro = 30;
lineReader.on('line', function (line) {
  if (line.startsWith(`Quadro ${numeroQuadro}–`)) {
    current = line; // the board's content has started
  } else if (current) { // we are in the middle of the board's content
    if (line.startsWith('Fonte:')) { // check whether it has ended
      // if it has, add it to the results array and reset the content
      contents.push(`${current}\n${line}`);
      current = '';
    } else { // if it has not, just append to the current content
      current += `\n${line}`;
    }
  }
});
// after everything has been read, print the content that was found
lineReader.on('close', function () {
  console.log('Board found: ', contents);
});
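To check the line-by-line logic without touching the file system, the same state machine can be fed from an array. This is a sketch: collectBoard is a hypothetical helper and the sample lines are made up.

```javascript
// Hypothetical helper: same logic as the 'line' handler, but fed from an array.
function collectBoard(lines, boardNumber) {
  const contents = [];
  let current = "";
  for (const line of lines) {
    if (line.startsWith(`Quadro ${boardNumber}–`)) {
      current = line; // the wanted board starts here
    } else if (current) {
      current += `\n${line}`;
      if (line.startsWith("Fonte:")) {
        // the board ends at the "Fonte:" line: store it and reset
        contents.push(current);
        current = "";
      }
    }
  }
  return contents;
}

// Made-up sample lines for illustration.
const sampleLines = ["Quadro 29– A", "x", "Fonte: X", "Quadro 30– B", "y", "Fonte: Y"];
const found = collectBoard(sampleLines, 30);
console.log(found.length); // 1
```

Unlike the regex with the i flag, startsWith is case-sensitive, so this version only recognizes lines written exactly as "Quadro" and "Fonte:".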