How to select a text snippet in a String?

Question

Asked 11 years, 7 months ago

Viewed 6,231 times

12

I’m creating a text file that uses the delimiters <# and #>, and I need to select what is inside them, as in <# texto.delimitado #>, using JavaScript's split() function.

What would be the best regular expression for this? I tried the regular expression (<#|#>), but it did not give the desired result.

4 answers

13

With this expression

/<#(.*?)#>/

You can capture the text between <# and #>.
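The `?` after `.*` is what makes this work with more than one snippet: it turns the quantifier lazy, so the match stops at the first `#>` instead of the last one. A minimal sketch of the difference; the sample string and the variable names `guloso` and `preguicoso` are made up for illustration:

```javascript
var texto = "<# um #> e <# dois #>";

// Greedy: .* runs to the end and backtracks to the LAST "#>"
var guloso = /<#(.*)#>/.exec(texto)[1];
console.log(guloso);     // " um #> e <# dois "

// Lazy: .*? stops at the FIRST "#>"
var preguicoso = /<#(.*?)#>/.exec(texto)[1];
console.log(preguicoso); // " um "
```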

To get all the matches, use it as follows:

// creates a RegExp object with the global flag
var regex = new RegExp("<#(.*?)#>", "g");

var teste = "<# Meu primeiro teste aqui é # bem esperto #> "
            + "<# Este é meu # segundo # teste #>";

And to run the regex:

var match;
while ((match = regex.exec(teste))) // exec returns null when there are no more matches
{
    console.log(match[1]); // match[1] = what the parentheses captured
}

Result:

Meu primeiro teste aqui é # bem esperto
Este é meu # segundo # teste
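On newer engines the same loop can be written with `String.prototype.matchAll`. This is a sketch that assumes an ES2020+ runtime and is not part of the original answer. The helper name `extractDelimited` is hypothetical, and it also trims the spaces around each snippet:

```javascript
// Hypothetical helper: returns the text found between each <# ... #> pair.
// Assumes an ES2020+ runtime where String.prototype.matchAll exists.
function extractDelimited(text) {
    // matchAll requires the global flag; (.*?) is lazy, so it stops at the first "#>"
    var regex = /<#(.*?)#>/g;
    return Array.from(text.matchAll(regex), function (match) {
        return match[1].trim(); // match[1] is the captured group
    });
}

var teste = "<# Meu primeiro teste aqui é # bem esperto #> "
          + "<# Este é meu # segundo # teste #>";
console.log(extractDelimited(teste));
```

Note that `.` does not match line breaks, so a snippet that spans several lines would need `[\s\S]*?` or the `s` flag instead.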

2

+1 for simplicity in code and explanation.

– Maximiliano Guerra

2013/12/30 at 13:19

by José • **1,549** points · Answer 1 · 2013-12-30T09:17:01+00:00

Your regular expression <#|#> can be used without problems. With the split() method, as requested, it can be done like this:

/* General declarations */
var er = new RegExp("<#|#>","g");
var dados_arquivo = new String("<#texto.delimitado.1#><#texto.delimitado.2#>");
var i = new Number();
var resultado = new Array();

/* Gets the data that matters */
resultado = dados_arquivo.split(er);

/* Removes the unwanted items (created by the split method).
   Iterates backwards so that splice() never skips the item after a removed one. */
for(i = resultado.length - 1; i >= 0; i--)
{
    if(resultado[i] === "")
    {
        resultado.splice(i,1);
    }
}

The result is an array with the values "texto.delimitado.1" and "texto.delimitado.2".

At the end of the code there is a for loop that removes the empty items from the array created by split(). To explain:

split() throws away everything that matches the expression and returns everything that does not match as an array. However, because split() takes whatever lies to the left and to the right of each match, where there is nothing next to a match it simply takes that "nothing" and adds it as another (empty) item of the resulting array.
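The empty items are easy to see by printing the raw result of split() before any cleanup. A small sketch using the same sample string as the code above; the names `bruto` and `limpo` are hypothetical, and `filter()` is shown as one alternative way to drop the empty items:

```javascript
// Splitting on either delimiter: every match is discarded, and whatever
// lies between two matches (even an empty string) becomes an item.
var er = /<#|#>/g;
var bruto = "<#texto.delimitado.1#><#texto.delimitado.2#>".split(er);
console.log(bruto);
// -> [ '', 'texto.delimitado.1', '', 'texto.delimitado.2', '' ]

// filter() builds a new array without the empty items,
// so nothing is removed from an array while it is being looped over.
var limpo = bruto.filter(function (item) {
    return item !== "";
});
console.log(limpo);
// -> [ 'texto.delimitado.1', 'texto.delimitado.2' ]
```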

Note what happens to text that is not between "<#" and "#>" (in that order): it is still treated as bordering the delimiters, as explained above, even though it is not enclosed by them. This is because the regular expression used does not see the delimiters as a pair, but as two distinct separators joined by "or" (|). Example:

if you change the declaration in the code above to

var dados_arquivo = new String("a<#texto.delimitado.1#>b<#texto.delimitado.2#>c");

the final result will be 5 items: "a", "texto.delimitado.1", "b", "texto.delimitado.2" and "c".

So, if this can happen, it is important to first use an algorithm that removes this unwanted text. In that case, you can use the code below:

/* General declarations */
var er = new RegExp("<#|#>","g");
var dados_arquivo = new String("a<#texto.delimitado.1#>b<#texto.delimitado.2#>c");
var i = new Number();
var resultado = new Array();

/* Auxiliary algorithm // START */
var er_auxiliar = new RegExp("<#.*?#>","g");
var texto_delimitado = dados_arquivo.match(er_auxiliar);

/* Concatenates all matches into the first item */
while(texto_delimitado.length > 1)
{
    texto_delimitado[0] = texto_delimitado[0] + texto_delimitado[1];
    texto_delimitado.splice(1,1);
}
/* Auxiliary algorithm // END */

/* Gets the data that matters */
resultado = texto_delimitado[0].split(er); /* <- The variable was changed */

/* Removes the unwanted items (created by the split method).
   Iterates backwards so that splice() never skips the item after a removed one. */
for(i = resultado.length - 1; i >= 0; i--)
{
    if(resultado[i] === "")
    {
        resultado.splice(i,1);
    }
}

The new part (the added algorithm) is marked in the code. The variable passed to split() was changed to fit the new code.

The added algorithm works as follows: it searches the data obtained from the original file (with the delimiters) and gets everything that is between "<#" and "#>", by means of an auxiliary regular expression passed to the match() method. The result is an array. The while loop then joins all of those results into a single string, so that the algorithm that was already there can separate everything with its own regular expression.
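The same steps can be written more compactly. A sketch that assumes only standard String and Array methods; the function name `extrairDelimitados` is hypothetical, and `join("")` stands in for the while loop that concatenates the matches:

```javascript
function extrairDelimitados(dados) {
    // 1. Keep only the "<#...#>" pieces; match() returns null when there are none.
    var pedacos = dados.match(/<#.*?#>/g) || [];
    // 2. Join them into a single string (the job of the while loop).
    // 3. Split on either delimiter and drop the empty items.
    return pedacos.join("").split(/<#|#>/).filter(function (item) {
        return item !== "";
    });
}

console.log(extrairDelimitados("a<#texto.delimitado.1#>b<#texto.delimitado.2#>c"));
// -> [ 'texto.delimitado.1', 'texto.delimitado.2' ]
console.log(extrairDelimitados("no delimiters here"));
// -> []
```

The `|| []` fallback is an addition: without it, input that contains no delimiters would fail when the code reads a property of `null`.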

That’s it; I hope I’ve helped!

by Sam • **79,597** points · Answer 2 · 2018-06-25T01:32:31+00:00

2

Another way using filter and map:

var string = "bla bla <# texto.delimitado #> bla bla bla<# texto.delimitado2#>";
var resultado = string.split(/<#/).filter(function(v){
   return ~v.indexOf("#>");
}).map(function(v){
   return v.match(/(.*)#>/)[1].trim();
})
console.log(resultado);

1

I had difficulty following the expression in the filter. Is this ~v.indexOf there so that, when the result is -1, negating all the bits gives 0, which is falsy?

– Jefferson Quesado

2018/06/25 at 02:25

1

Dude, I use it as a short form of indexOf() != -1

– Sam

2018/06/25 at 02:28

2

Got it now. That looks like code golf, pretty minimal. I’m not used to it at all. Thanks for clarifying =D

– Jefferson Quesado

2018/06/25 at 02:33
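To make the bitwise trick from these comments explicit: `~n` equals `-(n + 1)`, so `~(-1)` is `0` (falsy) and any index from `0` upward gives a non-zero (truthy) number. A short sketch; `String.prototype.includes` is shown as a more readable equivalent, assuming an ES2015+ runtime, and the sample array `partes` is made up for illustration:

```javascript
console.log(~-1); // 0  -> falsy: "#>" was not found
console.log(~0);  // -1 -> truthy: found at index 0
console.log(~5);  // -6 -> truthy: found at index 5

var partes = ["bla bla ", " texto.delimitado #> bla"];

// Both filters keep only the items that contain "#>"
var comIndexOf = partes.filter(function (v) { return ~v.indexOf("#>"); });
var comIncludes = partes.filter(function (v) { return v.includes("#>"); });

console.log(comIndexOf);  // [ ' texto.delimitado #> bla' ]
console.log(comIncludes); // [ ' texto.delimitado #> bla' ]
```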

by Denir Roberto Tavares • 1 point · Answer 3 · 2016-12-21T19:12:34+00:00

A simpler way:

var regex = /\[(.*?)\]/g;
var texto = '[Palavra chave 1 = 296] Se refere ao item do produto para direcionamento. [Palavra chave 2 = 1234]'
alert(texto.match(regex));
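Note that with the global flag, match() returns the whole matches, brackets included, rather than the captured groups. If only the inner text is wanted, one option is to strip the delimiters afterwards. A sketch that is not part of the original answer; the names `completos` and `conteudo` are hypothetical:

```javascript
var regex = /\[(.*?)\]/g;
var texto = '[Palavra chave 1 = 296] Se refere ao item do produto para direcionamento. [Palavra chave 2 = 1234]';

// With the g flag, match() ignores capture groups and returns full matches.
var completos = texto.match(regex) || [];
console.log(completos);
// -> [ '[Palavra chave 1 = 296]', '[Palavra chave 2 = 1234]' ]

// Remove the leading "[" and the trailing "]" from each match.
var conteudo = completos.map(function (m) {
    return m.slice(1, -1);
});
console.log(conteudo);
// -> [ 'Palavra chave 1 = 296', 'Palavra chave 2 = 1234' ]
```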