How to extract the href attribute from an <a> tag with Delphi?

Guys, I've researched a lot and couldn't solve this problem. Here's the thing: I can extract the text content of an element, for example:

<span class="style13">texto para extrair</span>

But what I want is to extract the link from the href attribute of an <a> tag, for example:

<a href="extrair o link" target="_blank">texto</a>

With this code I get a lot of extra information:

procedure TForm1.Button1Click(Sender: TObject);
var
  doc : IHTMLDocument3;
  elements : IHTMLElementCollection;
  element : IHTMLElement;
  i : integer;
begin
    doc := WebBrowser1.Document as IHTMLDocument3;

    elements := doc.getElementsByTagName('a');

    for i := 0 to elements.length - 1 do begin
        element := elements.item(i, 0) as IHTMLElement;
        Memo1.Lines.Add(element.innerHTML);
    end;
end;

I would like something more specific: if possible, extract just the link itself.

Thanks in advance.

2 answers

On the one hand, you can use Pos, LeftStr and RightStr to cut the string and extract the info you want.

On the other hand, a more versatile option is to use regular expressions to extract what you need. I wrote a small function to make that easier:

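// Requires System.RegularExpressions in the uses clause.
// Returns the first case-insensitive match that starts with AStart and ends
// with AEnd (both delimiters included), or '' if nothing matches.
// AStart and AEnd are used as regex patterns, so escape special characters.
function Webpage_ReadValue(const AText, AStart, AEnd: string): string;
var
  MatchInfo: TMatch;
begin
  Result := '';
  // [\s\S]+? lazily matches any characters, including line breaks
  MatchInfo := TRegEx.Match(AText, '(?i)' + AStart + '[\s\S]+?' + AEnd);
  if MatchInfo.Success then
    Result := MatchInfo.Value;
end;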

Note that you may need to adapt it for your case, namely by replacing AStart and AEnd with '' in the result, since the match includes both delimiters.
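As a rough sketch of the same regular-expression idea applied to the href case: a capture group returns only the part between the quotes, so no delimiters need to be removed afterwards. ExtractFirstHref is a hypothetical helper name, not part of the function above, and the pattern assumes the attribute value is wrapped in double quotes.

```delphi
program ExtractHrefDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

// Hypothetical helper: returns the value of the first double-quoted href
// attribute found in AHtml, or '' if there is none.
function ExtractFirstHref(const AHtml: string): string;
var
  M: TMatch;
begin
  Result := '';
  // Group 1 captures the text between the quotes, so the 'href="' prefix
  // and the closing quote never end up in the result.
  M := TRegEx.Match(AHtml, 'href\s*=\s*"([^"]*)"', [roIgnoreCase]);
  if M.Success then
    Result := M.Groups[1].Value;
end;

begin
  // Sample markup modelled on the question; the URL is made up.
  Writeln(ExtractFirstHref('<a href="http://example.com/page" target="_blank">texto</a>'));
end.
```

To collect every link instead of only the first one, the same pattern could be passed to TRegEx.Matches and the results looped over.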

  • I couldn't find a direct way to extract it; if one exists, I don't know how to use it, haha. Thanks for the help. I got where I wanted using your idea; see my answer below for how I did it.

  • OK ;) If you've solved it, that's fine.

Solved. I did it this way:

procedure TForm1.Button1Click(Sender: TObject);
var
  doc : IHTMLDocument3;
  elements : IHTMLElementCollection;
  element : IHTMLElement;
  str : string;
  i, p : integer;
begin
  Memo1.Clear;
  doc := WebBrowser1.Document as IHTMLDocument3;

  // Edit3.Text holds the tag name to search for
  elements := doc.getElementsByTagName(Edit3.Text);

  for i := 0 to elements.length - 1 do begin
    element := elements.item(i, 0) as IHTMLElement;
    str := element.innerHTML;

    // Skip elements whose markup has no href attribute
    p := Pos('href', str);
    if p = 0 then
      Continue;

    // Drop everything up to and including 'href="'
    str := Copy(str, p + 6, Length(str));

    if str.Contains('http') then begin
      // Keep only the text before the attribute's closing quote
      str := Copy(str, 1, Pos('"', str) - 1);
      Memo1.Lines.Add(str);
    end;
  end;
end;
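For comparison, here is a minimal sketch that reads the attribute through the DOM instead of slicing innerHTML. It assumes the same form as above (WebBrowser1, Memo1), a page that has already finished loading, and MSHTML plus System.Variants in the uses clause. The tag name 'a' is hard-coded here, which is an assumption taken from the question rather than from this answer.

```delphi
// Sketch only: a form method, so it needs the form described in the lead-in.
procedure TForm1.Button1Click(Sender: TObject);
var
  doc: IHTMLDocument3;
  elements: IHTMLElementCollection;
  element: IHTMLElement;
  href: string;
  i: Integer;
begin
  Memo1.Clear;
  doc := WebBrowser1.Document as IHTMLDocument3;
  elements := doc.getElementsByTagName('a');

  for i := 0 to elements.length - 1 do
  begin
    element := elements.item(i, 0) as IHTMLElement;
    // getAttribute returns an OleVariant; VarToStr turns Null or Empty into ''.
    // With flag 0 the browser may return the resolved (absolute) URL
    // rather than the exact text written in the markup.
    href := VarToStr(element.getAttribute('href', 0));
    if href <> '' then
      Memo1.Lines.Add(href);
  end;
end;
```

This avoids depending on how the browser serializes quotes and attribute order in innerHTML, at the cost of only looking at the <a> elements themselves.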
