How to get the HTML of a page after JavaScript has loaded, using GuzzleHttp


Good morning,

I’m creating a crawler to access a specific page and then extract some specific data from it, but I’m running into problems.

Right now I’m running a test against Instagram; my code is as follows:

use GuzzleHttp\Client;

$client = new Client();

$request = $client->request('GET', 'https://www.instagram.com/user/');

return response()->json( $request->getBody() );

However, when I print getBody() it returns an empty {}. I also tried calling getContents() on it to get the data, as follows:

return response()->json( $request->getBody()->getContents() );

Using getContents() gives me back only a little HTML and the rest is JavaScript, so I believe the error may be in the way I am making the call.

1 answer



When you use Guzzle, the value returned by getBody() is an object.

With $client->request() (Guzzle 6+), that object is a GuzzleHttp\Psr7\Stream, which implements Psr\Http\Message\StreamInterface.

By default, json_encode() gives confusing results when you try to serialize an object that does not implement the JsonSerializable interface; since the stream has no public properties, all you get is an empty {}.

However, the stream class implements the magic method __toString(), which gives it a special behaviour: when cast, the object is treated as a string.
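To illustrate (a minimal sketch, assuming Guzzle 6+ and a hypothetical example URL), this is why response()->json( $request->getBody() ) came out as {}:

use GuzzleHttp\Client;

$client = new Client();
$body = $client->request('GET', 'https://example.com/')->getBody();

// The stream object has no public properties and does not implement
// JsonSerializable, so json_encode() produces nothing but: {}
echo json_encode($body), PHP_EOL;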

What can be done?

Cast the result to string to see the value returned by your request, like this:

use GuzzleHttp\Client;

$client = new Client();

// Small correction: 'request' is what you send, 'response' is what you get back
$response = $client->request('GET', 'https://www.instagram.com/user/');

var_dump((string) $response->getBody());

It is also important to remember that the purpose of Laravel’s response()->json() is to serialize values into JSON format, while your response body will most likely be HTML.

Be sure about what you intend to do with what Guzzle returns.

Depending on the situation, all you might need is this:

 return response((string) $response->getBody());
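One note on the stream behaviour (this is how Guzzle’s PSR-7 streams work in general, not something specific to Instagram): getContents() reads only from the stream’s current position, while the string cast rewinds to the beginning first, which is worth keeping in mind when the two seem to return different amounts of data:

use GuzzleHttp\Client;

$client = new Client();
$body = $client->request('GET', 'https://example.com/')->getBody();

$firstChunk = $body->read(100);     // consumes the first 100 bytes of the stream
$remainder  = $body->getContents(); // only what is left after that read
$everything = (string) $body;       // __toString() rewinds (when seekable) and returns the whole body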

"Using getContents() gives me back only a little HTML and the rest is JavaScript, so I believe the error may be in the way I am making the call."

I don’t know exactly what you’re trying to do, but if you want Instagram to give you JSON back, you might need to use the Instagram API.

Take a look at it here

  • Thanks for the reply, Wallace. Let me explain my situation a little: I would like to follow several specific profiles and, whenever they post an image, pull it into my system. I used the cast, but the result was not what I expected. I also noticed that it is possible to get the HTML after the JavaScript has loaded by using the "timeout" setting, but since Instagram uses React, the returned HTML does not contain the published images, and that is the only information I need. From a quick read of the API, I noticed that it is necessary to pass a token to get the data.

  • Yes, you need to pass the token. That would be the easiest way to get proper access to Instagram. The problem with writing a crawler is that they can change the page at any time and your code breaks, whereas the API is versioned. But if you still insist on building a crawler, maybe you can use DOMDocument to read the src attribute of the <img> tags (see the sketch after these comments).

  • Got it, Wallace. I’m aware of the problem of the page changing. About the DOMDocument approach: what happens is that Guzzle, even with a 5-second timeout set, is not getting the images; it returns some unknown numbering in place of the <img> attributes. Would you know how to solve this?

  • @Henriquesouzagoncalves That is because Instagram must be returning HTML that uses JavaScript to render the page. There is no easy solution to this. I’ve heard that Ghost JS solves it, but I can’t say for sure, because I’ve never used it.

  • All right, Wallace, I’ll look into that. If it doesn’t work I’ll have to go with the API, but I’ll keep waiting for other answers. Thanks for your help!
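A rough sketch of the DOMDocument idea from the comments above (the profile URL is just a placeholder, and this only finds images that are actually present in the HTML Guzzle receives, which is not the case for JavaScript-rendered pages like Instagram’s):

use GuzzleHttp\Client;

$client = new Client();
$response = $client->request('GET', 'https://example.com/some-profile/');
$html = (string) $response->getBody();

$dom = new DOMDocument();
libxml_use_internal_errors(true); // silence warnings about imperfect real-world HTML
$dom->loadHTML($html);
libxml_clear_errors();

$sources = [];
foreach ($dom->getElementsByTagName('img') as $img) {
    $sources[] = $img->getAttribute('src');
}

var_dump($sources); // every src found in the markup that was actually served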
