How can I use a regular expression to capture only certain link attributes?

Viewed 169 times

5

I need a basic example of how to write a small script that picks up only the contents of the href attribute(s).

I'll simplify the explanation by splitting the links into two groups: A and B.

In Group A, the links start with a slash / followed by the word manga. Group B follows the same pattern, except that the word in the link is caqui instead of manga.

So it looks like this:

Group A

<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>

etc....

Group B

<a href='/caqui?v=1234567890'>B</a>
<a href='/caqui?v=1234567890'>B</a>
<a href='/caqui?v=1234567890'>B</a>

etc....

How can I deal with one of these groups, bringing forward on the page (that is, displaying) only the links that refer to manga? The rest should not appear.

Note: none of the links has a class or an id, but links in the same group share the same pattern in their href attribute. I only want manga.

So, here goes!

I wrote a loop that goes through the a elements; it returns every link on the page. See:

var link = document.getElementsByTagName('a');

for (var i = 0; i < link.length; i++) {

    document.getElementById("resultado").innerHTML+= "<br>"+link[i].getAttribute('href')+"<br>"

}

What I'm asking for is how to combine link[i].getAttribute('href') with a regular expression:

function ExtrairID(url){
    var regExp = /^.*((manga\?))\??v?=?([^#\&\?]*).*/;
    var match = url.match(regExp);
    if ( match && match[7].length == 10 ){
        return match[7];
    }else{
        alert("Não foi possível extrair a ID.");
    }
}

I have already made some attempts on my own, but I ended up getting lost, so I decided to ask here. I look forward to an answer or a brief explanation.

  • Is the size always the same? 10 digits?

  • Although it works with regex, the recommended way to work with HTML is to use the DOM from JavaScript (or another web automation tool). You will often run into situations where a simple regex is not enough.

  • @danieltakeshi I'm sorry, but I don't understand what you meant. Could you kindly be clearer on the subject?

  • What do you mean by "bringing forward on the page"?

  • Do you want to get all the hrefs that contain manga?

  • @DVD You got it. =)

  • What I wanted to explain is that regex is not the most recommended approach. dvd's answer, which uses .getAttribute("href"), is the better option.


4 answers

3


An option without regex, just checking whether the href contains the word manga:

var els = document.querySelectorAll("a");
var resultado = '';
for(var x=0; x<els.length; x++){
   var href = els[x].getAttribute("href");
   // Skip anchors without an href; keep only those containing "manga",
   // one per line
   if (href && href.indexOf("manga") != -1) resultado += href + "\n";
}
console.log(resultado);
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>
<a href='/caqui?v=1234567891'>B</a>
<a href='/caqui?v=1234567891'>B</a>
<a href='/caqui?v=1234567891'>B</a>

Or you can use an attribute selector instead of indexOf:

var els = document.querySelectorAll("a[href*='manga']");
var resultado = '';
for(var x=0; x<els.length; x++){
   resultado += els[x].getAttribute("href");
}
console.log(resultado);
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567890'>A</a>
<a href='/caqui?v=1234567891'>B</a>
<a href='/caqui?v=1234567891'>B</a>
<a href='/caqui?v=1234567891'>B</a>
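
The same filtering can be separated from the DOM so that it is easy to test. The sketch below is only an illustration: filterHrefs is a hypothetical helper name, not part of the answer above, and it assumes the href values have already been collected into an array.

```javascript
// Hypothetical helper: keep only the hrefs that contain the given word.
// Entries that are not strings (e.g. null for an <a> without href) are skipped.
function filterHrefs(hrefs, word) {
  return hrefs.filter(function (href) {
    return typeof href === "string" && href.indexOf(word) !== -1;
  });
}

// Sample data mirroring the markup above, plus one missing href.
var sample = [
  "/manga?v=1234567890",
  "/caqui?v=1234567891",
  null,
  "/manga?v=1234567890"
];

var onlyManga = filterHrefs(sample, "manga");
console.log(onlyManga.join("\n"));

// In a browser, the input array could be built from the page like this
// (assumption: the page contains <a> elements as in the question):
// var hrefs = Array.prototype.map.call(document.querySelectorAll("a"),
//   function (a) { return a.getAttribute("href"); });
```

Keeping the filter as a plain function means the DOM lookup and the matching rule can be changed independently.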

  • Very good! Using the indexOf() method was creative. Too bad that didn't cross my mind.

2

First, test whether the URL contains /manga. Then extract the ID with \d{10}, where:

  • \d matches digits only
  • {10} requires exactly 10 of them

var LINKS = document.querySelectorAll('a');
var RESULTADO = document.querySelector('#resultado');

function extrair(url) {
  // Check whether it contains '/manga'
  if (/\/manga/.test(url)) {
    RESULTADO.innerHTML += /\d{10}/g.exec(url) + "<br>";
  }
}

for (var i = 0; i < LINKS.length; i++) {
  extrair(LINKS[i].getAttribute('href'));
}
<h3>Group A and B</h3>
<a href='/manga?v=4533567894'>A</a>
<a href='/caqui?v=1234567490'>B</a>
<a href='/manga?v=7634567890'>A</a>
<a href='/caqui?v=1234567888'>B</a>
<a href='/manga?v=2345567899'>A</a>
<a href='/caqui?v=1234567234'>B</a>

<h3>Regular Expression Result - Group A</h3>
<div id="resultado"></div>
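
One detail of the snippet above: when exec finds no run of 10 digits it returns null, so the text "null" ends up in the output. A minimal sketch of a variant that returns null and leaves the decision to the caller follows. The name extractId is hypothetical, and the sketch assumes, as in the question, that the ID is exactly 10 digits and is carried in the v parameter.

```javascript
// Hypothetical helper: return the 10-digit ID of a '/manga' URL, or null.
function extractId(url) {
  // Ignore missing hrefs and anything that is not a '/manga' link.
  if (typeof url !== "string" || !/\/manga/.test(url)) {
    return null;
  }
  // Exactly 10 digits after v=; (?!\d) rejects longer digit runs.
  var m = /[?&]v=(\d{10})(?!\d)/.exec(url);
  return m ? m[1] : null;
}

console.log(extractId("/manga?v=4533567894")); // 4533567894
console.log(extractId("/caqui?v=1234567490")); // null
console.log(extractId("/manga?v=123"));        // null
```

The caller can then append only non-null results to the #resultado element.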

ECMAScript 6 (ES2015)

let LINKS = document.querySelectorAll('a');
let RESULTADO = document.querySelector('#resultado');

const extrair = (url) => {
  // Check whether it contains '/manga'
  if (/\/manga/.test(url)) {
    RESULTADO.innerHTML += /\d{10}/g.exec(url) + "<br>";
  }
}

LINKS.forEach((link) => {
  extrair(link.getAttribute('href'));
});
<h3>Group A and B</h3>
<a href='/manga?v=4533567894'>A</a>
<a href='/caqui?v=1234567490'>B</a>
<a href='/manga?v=7634567890'>A</a>
<a href='/caqui?v=1234567888'>B</a>
<a href='/manga?v=2345567899'>A</a>
<a href='/caqui?v=1234567234'>B</a>

<h3>Regular Expression Result - Group A</h3>
<div id="resultado"></div>
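
For browsers where the NodeList returned by querySelectorAll has no forEach method, one common workaround is to borrow Array.prototype.forEach. The sketch below uses a plain array-like object as a stand-in for the NodeList so that it also runs outside a browser; forEachLike, fakeLink and fakeLinks are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: iterate over any array-like object (NodeList,
// HTMLCollection, arguments) even where it lacks its own forEach.
function forEachLike(list, fn) {
  Array.prototype.forEach.call(list, fn);
}

// Stand-in for an <a> element: it only exposes getAttribute.
function fakeLink(href) {
  return { getAttribute: function () { return href; } };
}

// Stand-in for document.querySelectorAll('a'): indexed items plus length.
var fakeLinks = {
  0: fakeLink("/manga?v=4533567894"),
  1: fakeLink("/caqui?v=1234567490"),
  length: 2
};

var hrefs = [];
forEachLike(fakeLinks, function (link) {
  hrefs.push(link.getAttribute("href"));
});
console.log(hrefs);
```

In a page, the call would be forEachLike(document.querySelectorAll('a'), ...) with the same callback.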

Note: I believe this way is simpler. I am still learning, so if you have questions or suggestions, feel free to comment.

  • The error that appears in the console is actually the alert you created in the if condition. I made this change because the alert kept popping up while I tested the function. Note that the function I created has a console.error line; just change it back to alert. As for that browser, I can't say, because I have never used it, and at the moment I'm on mobile, so I have no way to test. But search Google on the subject.

  • It's returning as you described: it's not returning Group B, only A.

  • Tested here locally; the Pale Moon browser console outputs: LINKS.forEach is not a function

  • I'll fix it; it works in Google Chrome.

  • These newer syntaxes are a real headache: in some browsers they work beautifully, in others they don't, and that's without even considering different versions. That's why I prefer to program the old way, with a loop instead of querySelector and forEach. Although ECMAScript strives to standardize, it is still far from reaching all browsers, as happened with HTML5 and CSS3. In other words, I only see this syntax being useful today on mobile devices that leave the factory already able to run it; for home PCs it will only become viable in the near future.

2

You can easily get all links whose href starts with /manga using the CSS selector:

a[href^='/manga']

Using Document.querySelectorAll():

var links = document.querySelectorAll("a[href^='/manga']");


In addition, every HTMLAnchorElement (such as <a>) has the .search property, which returns the query string (from the ? onward).

Then use this regex to get the value of the parameter v:

[?&]v=(\d+)


Code:

var links = document.querySelectorAll("a[href^='/manga']"),
    x,
    re = /[?&]v=(\d+)/,
    m;

for (x of links) {
    if (m = re.exec(x.search)) {
        console.log(m[1]);
    }
}
Group A and B:
<a href='/manga?v=1234567890'>A</a>
<a href='/manga?v=1234567891'>A</a>
<a href='/caqui?v=1234567892'>B</a>
<a href='/caqui?v=1234567893'>B</a>
<a href='/manga?v=1234567894'>A</a>
<a href='/caqui?v=1234567895'>B</a>
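
As an alternative to the regex, the v parameter can also be read with the standard URL and URLSearchParams APIs. This is a sketch, not part of the answer above: getV is a hypothetical name, and https://example.com is a placeholder base used only so that relative hrefs can be parsed. Note that it compares the path exactly with /manga, which is stricter than the "starts with" selector.

```javascript
// Hypothetical helper: return the value of the 'v' query parameter of a
// '/manga' link, or null if the path is different or 'v' is absent.
function getV(href) {
  // The base URL is a placeholder; only the path and query are inspected.
  var url = new URL(href, "https://example.com");
  if (url.pathname !== "/manga") {
    return null;
  }
  return url.searchParams.get("v"); // null when 'v' is missing
}

console.log(getV("/manga?v=1234567890")); // 1234567890
console.log(getV("/caqui?v=1234567892")); // null
```

With real anchor elements in a browser, the element's own href property is already absolute, so new URL(a.href) would work without the placeholder base.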

  • 1

    Cool, that's a CSS selector syntax I didn't know about.

1

Instead of match[7], use match[3].

Input: ExtrairID("aaaa/manga?v=1234567890")

function ExtrairID(url){
    var regExp = /^.*((manga\?))\??v?=?([^#\&\?]*).*/;
    var match = url.match(regExp);
    console.log(match);
    if ( match && match[3].length == 10 ){
        return match[3];
    }else{
        alert("Não foi possível extrair a ID.");
    }
}

match result:

["aaaa/manga?v=1234567890", "manga?", "manga?", "1234567890", index: 0, input: "aaaa/manga?v=1234567890"]
0 : "aaaa/manga?v=1234567890"
1 : "manga?"
2 : "manga?"
3 : "1234567890"
index : 0
input : "aaaa/manga?v=1234567890"

Returns 1234567890
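
To see why match[7] cannot work, it helps to count the capturing groups: the expression has only three pairs of capturing parentheses, so the match array has four entries (index 0 is the whole match). A small sketch that checks this with the same regex and input as above:

```javascript
// Same regular expression as in the question and in ExtrairID above.
var regExp = /^.*((manga\?))\??v?=?([^#\&\?]*).*/;
var match = "aaaa/manga?v=1234567890".match(regExp);

console.log(match.length); // 4: the full match plus 3 capturing groups
console.log(match[7]);     // undefined, so match[7].length throws a TypeError
console.log(match[3]);     // 1234567890
```

Since groups 1 and 2 capture the same text, the nested parentheses could be dropped, in which case the ID would move to match[2].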

Edit:

var link = document.getElementsByTagName('a');
for (var i = 0; i < link.length; i++) { 
  ExtrairID(link[i].getAttribute('href')) 
}
  • I edited the loop; now you only need to implement the rest.

  • You can do everything in a single function, but it's up to you to decide what you want.
