How to get HTML code from a page with PHP?


Viewed 2,784 times

4

I’m working on a project where I need to fetch the HTML of a particular web page in order to inspect its content. What approach can I follow to do this in PHP?

I’m currently using file_get_contents(), but when I echo the result I get the already rendered content. I want to read the exact source code.

  • 5

    file_get_contents is giving you the source; it only appears rendered because you are echoing it to the browser.

  • It’s like @bfavaretto says.

1 answer

11


Hello! In fact, file_get_contents() does exactly what you want!

From the PHP manual: file_get_contents() - Reads all contents of a file to a string.

That is to say: file_get_contents() does not interpret the file. It simply reads exactly what is written in it, without processing anything, and returns the contents unchanged.
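As a quick check of that claim, here is a minimal sketch that writes a snippet of markup to a temporary file and reads it back. The file and the variable names are made up for the illustration; the same behaviour should hold when the argument is a URL instead of a local path.

```php
<?php
// Write a small HTML snippet to a temporary file (hypothetical test data).
$path = tempnam(sys_get_temp_dir(), 'src');
$original = "<p>Hello &amp; <b>world</b></p>\n";
file_put_contents($path, $original);

// Read it back and clean up.
$read = file_get_contents($path);
unlink($path);

// The string is identical to what was written: nothing was parsed or rendered.
var_dump($read === $original); // bool(true)
```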

The problem: when you echo the retrieved code, it is printed into the page, that is, directly into the HTML markup of the document. The browser then interprets the source code you fetched and renders its tags instead of showing them.

The solution: you’re actually doing everything right, so no fix is needed on the fetching side. If you want to see the source code on the page, you have to output the markup characters as entities, i.e., replace "< > &" with "&lt; &gt; &amp;". Luckily, PHP does this with the native function htmlentities().
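Putting the two pieces together, a minimal sketch could look like the following. The helper name showSource() and the sample string are assumptions made for this illustration; with a real page, $html would be the return value of file_get_contents() called with the page's URL.

```php
<?php
// Hypothetical helper: escape fetched markup so the browser displays it as text.
function showSource(string $html): string
{
    // htmlentities() turns < > & and quotes into entities, so the tags
    // are no longer parsed; <pre> preserves the original line breaks.
    return '<pre>' . htmlentities($html, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Stand-in for the string that file_get_contents() would return.
$html = '<h1 class="title">Tom & Jerry</h1>';

echo showSource($html);
// Output: <pre>&lt;h1 class=&quot;title&quot;&gt;Tom &amp; Jerry&lt;/h1&gt;</pre>
```

Note that file_get_contents() returns false when the read fails, so real code would probably check for that before escaping. If only the markup characters need converting, htmlspecialchars() is a narrower alternative to htmlentities().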

About htmlentities():

<?php
$str = "<© W3Sçh°°¦§>";
echo htmlentities($str);
?>

The HTML printed on the page will be:

<!DOCTYPE html>
<html>
<body>
&lt;&copy; W3S&ccedil;h&deg;&deg;&brvbar;&sect;&gt;
</body>
</html>

In the browser, it will be interpreted and displayed on the page as:

<© W3Sçh°°¦§>

More about htmlentities():

  • I’m not sure that’s what he wants...

  • "need to get the HTML of a particular web page": file_get_contents does just that. He is probably only being tripped up by the fact that, when he tries to debug the function’s result, he does not see any HTML; with this explanation I think he will be able to understand.

  • 1

    @Jorgeb. If, on the other hand, you think this looks like a good canonical question to answer with "use cURL, etc.", I thought about that too. But it was a near miss...

  • @bfavaretto I’ve been here testing since then and I haven’t gotten anywhere either ;) I think the best answer is really Machado’s.
