I'm trying to add attributes to an <a> tag produced by a Markdown parser (markdown => html). In my Markdown document, I add parentheses containing the attributes I want right after declaring each link, for example:
[Cool Text](https://hiperlynck "title")(class="ext-link-icon" data-super="..." foo="bar")
The parser does its job properly and replaces only the markup it recognizes:
<a href="https://hiperlynk" title="title">Cool Text</a>(class="ext-link-icon" data-super="..." foo="bar")
From this point on, I have to find what I added between parentheses after the Markdown link and move it inside the opening <a> tag. I'm using the following RegEx: /(<a.+<\/a>)\((.+=".+" ?)+\)/g
The code below is what I have for now:
let regex = /(<a.+<\/a>)\((.+=".+" ?)+\)/g
let str = '<a href="https://hiperlynk" title="title">Cool Text</a>(class="ext-link-icon" data-super="..." foo="bar")'.replace(regex, (match, $1, $2) => {
    if ( !$1 && !$2 ) {
        let url = match.match(/"(.*?)"/)[1]
        // check whether the link is local or points to the same hostname
        if ( url.includes(window.location.hostname) || url[0] == '/' || url[0] == '.' || url[0] == '#' ) {
            // if it is a local link, return it unchanged
            return match
        }
        // here the link is assumed not to be local, so attributes are added
        let allHrefContent = match.match(/^<a (.*?)>/)[1];
        if ( !allHrefContent.includes('target="') ) {
            allHrefContent += ' target="about:blank"'
        }
        allHrefContent += ' rel="noopener noreferrer"'
        return `<a ${allHrefContent}>${match.match(/>(.*?)</)[1]}</a>`
    } else {
        // here the second capture is everything that was added between parentheses in the `markdown` after the link
        if ( /^(rel=")/.test($2) ) {
            let rel = $2.replace(/rel="|"/g, '');
            if ( !rel.includes('noopener') ) {
                rel += ' noopener'
            }
            if ( !rel.includes('noreferrer') ) {
                rel += ' noreferrer'
            }
            return $1.replace('">', `" target="about:blank" rel="${rel}" ${$2}>`)
        } else {
            return $1.replace('">', `" target="about:blank" rel="noopener noreferrer" ${$2}>`)
        }
    }
})
document.body.innerHTML = str
console.log(str)
.ext-link-icon {
background: url("data:image/svg+xml;charset=UTF-8,%3Csvg xmlns='http://www.w3.org/2000/svg' width='24' height='24' viewBox='0 0 24 24' fill='rgb(51, 103, 214)'%3E%3Cpath d='M19 19H5V5h7V3H5c-1.11 0-2 .9-2 2v14c0 1.1.89 2 2 2h14c1.1 0 2-.9 2-2v-7h-2v7zM14 3v2h3.59l-9.83 9.83 1.41 1.41L19 6.41V10h2V3h-7z'/%3E%3C/svg%3E") right/12px no-repeat;
padding-right: 0.875em;
}
It works well with a single link, but it breaks when there is more than one, and I can't work out the logic to group the occurrences.
Example with more than one link:
let regex = /(<a.+<\/a>)\((.+=".+" ?)+\)/g
let str = '<a href="../">Voltar</a>(class="patinho-feio") qualquer coisa aqui <a href="https://hiperlynk" title="title">Cool Text</a>(class="ext-link-icon" data-super="..." foo="bar")'.replace(regex, (match, $1, $2) => {
    if ( !$1 && !$2 ) {
        let url = match.match(/"(.*?)"/)[1]
        // check whether the link is local or points to the same hostname
        if ( url.includes(window.location.hostname) || url[0] == '/' || url[0] == '.' || url[0] == '#' ) {
            // if it is a local link, return it unchanged
            return match
        }
        // here the link is assumed not to be local, so attributes are added
        let allHrefContent = match.match(/^<a (.*?)>/)[1];
        if ( !allHrefContent.includes('target="') ) {
            allHrefContent += ' target="about:blank"'
        }
        allHrefContent += ' rel="noopener noreferrer"'
        return `<a ${allHrefContent}>${match.match(/>(.*?)</)[1]}</a>`
    } else {
        // here the second capture is everything that was added between parentheses in the `markdown` after the link
        if ( /^(rel=")/.test($2) ) {
            let rel = $2.replace(/rel="|"/g, '');
            if ( !rel.includes('noopener') ) {
                rel += ' noopener'
            }
            if ( !rel.includes('noreferrer') ) {
                rel += ' noreferrer'
            }
            return $1.replace('">', `" target="about:blank" rel="${rel}" ${$2}>`)
        } else {
            return $1.replace('">', `" target="about:blank" rel="noopener noreferrer" ${$2}>`)
        }
    }
})
document.body.innerHTML = str
console.log(str)
.ext-link-icon {
background: url("data:image/svg+xml;charset=UTF-8,%3Csvg xmlns='http://www.w3.org/2000/svg' width='24' height='24' viewBox='0 0 24 24' fill='rgb(51, 103, 214)'%3E%3Cpath d='M19 19H5V5h7V3H5c-1.11 0-2 .9-2 2v14c0 1.1.89 2 2 2h14c1.1 0 2-.9 2-2v-7h-2v7zM14 3v2h3.59l-9.83 9.83 1.41 1.41L19 6.41V10h2V3h-7z'/%3E%3C/svg%3E") right/12px no-repeat;
padding-right: 0.875em;
}
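A quick check, offered only as a sketch that reuses the same pattern and input string as the example above, suggests where the grouping goes wrong: the greedy `.+` inside `<a.+<\/a>` runs to the last `</a>` on the line, so both links end up inside one match instead of two. The variable names `greedy`, `html` and `matches` are illustrative.

```javascript
// Same pattern and input as the two-link example above.
const greedy = /(<a.+<\/a>)\((.+=".+" ?)+\)/g;
const html = '<a href="../">Voltar</a>(class="patinho-feio") qualquer coisa aqui <a href="https://hiperlynk" title="title">Cool Text</a>(class="ext-link-icon" data-super="..." foo="bar")';

// Collect every match the global regex finds in the string.
const matches = [...html.matchAll(greedy)];

console.log(matches.length);        // 1
console.log(matches[0][0] === html); // true: one match spans both links
```

Because the callback passed to replace runs once per match, it never gets the chance to treat the two links separately.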
I confess that RegEx is not my thing, so the short question is: how can I capture two or more groups following this pattern?
<a>text</a>(class="foo") qualquer coisa aqui <a>text</a>(class="bar")
So that I arrive at the expected result:
<a class="foo">text</a> qualquer coisa aqui <a class="bar">text</a>
Thanks in advance for any help that leads me to understand the problem.
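One possible way to get one match per link, sketched here under a few assumptions, is to replace the greedy `.+` with negated character classes so that each part stops at the first `>`, `<` or `)`. The helper name `addTrailingAttributes` and the constant `LINK_WITH_SUFFIX` are hypothetical. The sketch assumes the link text contains no nested tags and that no attribute value contains `)`, and it only moves the attributes; it does not reproduce the rel/target handling from the code above.

```javascript
// Sketch only: hypothetical names, assumptions listed in the text above.
// [^>]*, [^<]* and [^)]* cannot cross ">", "<" or ")", so a match
// can never extend from one link into the next.
const LINK_WITH_SUFFIX = /<a\b([^>]*)>([^<]*)<\/a>\(([^)]*)\)/g;

function addTrailingAttributes(html) {
  return html.replace(LINK_WITH_SUFFIX, (match, attrs, text, extra) => {
    // Join the attributes already on the tag with the ones from "(...)".
    const merged = `${attrs} ${extra}`.trim();
    return `<a ${merged}>${text}</a>`;
  });
}

console.log(addTrailingAttributes(
  '<a>text</a>(class="foo") qualquer coisa aqui <a>text</a>(class="bar")'
));
// <a class="foo">text</a> qualquer coisa aqui <a class="bar">text</a>
```

Links without a trailing `(...)` are left untouched, since the pattern requires the opening parenthesis immediately after `</a>`.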
Using the DOM, nextSibling will probably give you the text that follows each of the specified elements, and with removeChild you can remove it; once you have the value of that #text node, you can easily convert it into attributes for the preceding element. In short, if I understood your question correctly, regex seems entirely unnecessary in this case.
– Guilherme Nascimento
@Guilhermenascimento I didn't quite understand your suggestion. I'm manipulating a string before putting it into the DOM... why would I put it into the DOM only to remove it afterwards? Sorry if I misunderstood.
– Lauro Moraes
You want to move the text (class="foo") into the preceding elements, right? And then that text should become the class attribute added to the element, or did I get it wrong?
– Guilherme Nascimento
I want to take what's in parentheses (), in this case class="foo", and add it inside the <a> tag before putting the links into the DOM... when they reach the DOM they will already have the attributes. I can get this result with one link in the string, but not with 2 or more.
– Lauro Moraes
Well, that looks exactly like what I said :) ... so you could use domparsed = new DOMParser().parseFromString(yourString, "text/html") to process your text before adding it to the page as DOM, then select all the A elements with var links = domparsed.getElementsByTagName('a'), and then loop over links with a for, checking the nextSibling of each item; that property gives you the text (class="class name")
– Guilherme Nascimento
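The steps described in the comment above could look roughly like the following. This is a sketch rather than the code from the linked answer: the helper name `moveTrailingAttributes` is hypothetical, it needs an environment that provides DOMParser (a browser), and it assumes attribute values contain neither `)` nor `"`. Instead of removeChild it trims the consumed text from the text node, so the rest of the sentence is kept.

```javascript
// Sketch only: hypothetical helper, requires a browser-like DOMParser.
function moveTrailingAttributes(html) {
  const doc = new DOMParser().parseFromString(html, 'text/html');

  for (const link of Array.from(doc.getElementsByTagName('a'))) {
    const next = link.nextSibling;
    // Only a text node (nodeType 3) can hold the '(attr="value" ...)' suffix.
    if (!next || next.nodeType !== 3) continue;

    // The suffix has to start right after the link.
    const suffix = /^\(([^)]*)\)/.exec(next.nodeValue);
    if (!suffix) continue;

    // Copy each name="value" pair onto the <a> element.
    for (const [, name, value] of suffix[1].matchAll(/([\w:-]+)="([^"]*)"/g)) {
      link.setAttribute(name, value);
    }

    // Remove the consumed "(...)" and keep whatever text follows it.
    next.nodeValue = next.nodeValue.slice(suffix[0].length);
  }

  // The string is processed before it ever reaches the page's own DOM.
  return doc.body.innerHTML;
}
```

A call such as document.body.innerHTML = moveTrailingAttributes(str) would then insert links that already carry their attributes.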
Do not parse HTML with regex. Regular expressions are not a sufficiently sophisticated tool to understand the constructs employed by HTML. See Analyzing Cthulhu-like Html
– Augusto Vasques
@Augustovasques I didn't have time to "optimize" it, but I put together something rough at https://answall.com/a/495414/3635 ... I hope I understood the question.
– Guilherme Nascimento