What is the best way to create a robot in PHP?
The goal of the robot is to access a URL with a login and password, enter data in certain fields, submit that data, and interpret the result shown on screen.
Any suggestions?
Robots that search for and interpret information on other pages are also called web crawlers or spiders.
These are scripts that perform the following process:
1. Access a URL;
2. Retrieve the HTML content returned;
3. Parse that HTML into a structure you can navigate;
4. Submit data (simulate filling in a form) when needed;
5. Interpret the result of the request.
Steps 1 to 3 are easily solved as follows:
$url = 'http://www.exemplo.com';
$dom = new DOMDocument('1.0');
// Downloads the page at $url and parses it straight into the DOM
$dom->loadHTMLFile($url);
This gives you an object that lets you navigate through the HTML however you need.
For example, to collect all the links on a page and display their addresses, you could do:
$anchors = $dom->getElementsByTagName('a');
foreach ($anchors as $element) {
$href = $element->getAttribute('href');
echo $href . '<br>';
}
An interesting class that can help with HTML handling, and save you thousands of lines of code, is Simple HTML DOM; a tutorial on how to use it can be found on the Make Use Of site.
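For illustration, here is a minimal sketch of the same link-listing example using Simple HTML DOM (this assumes you have downloaded simple_html_dom.php from the project site into the script's directory):

// Simple HTML DOM sketch - assumes simple_html_dom.php is in the same directory
include 'simple_html_dom.php';

// file_get_html() downloads the page and returns a parsed object
$html = file_get_html('http://www.exemplo.com');

// find() accepts CSS-like selectors; here we list every link address
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}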
To simulate a form being filled in and submitted, simply make a request to the URL the form points to, using the request method it expects: request the URL found in the form's action attribute with the method given in its method attribute.
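If you do not know the form's address in advance, you can read it from the page already loaded into $dom; a minimal sketch, assuming the page contains a single form:

// Sketch: discover where the form submits to (assumes the page has exactly one form)
$form = $dom->getElementsByTagName('form')->item(0);
$action = $form->getAttribute('action'); // URL to request
$method = $form->getAttribute('method'); // usually GET or POST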
To simulate this submission, we change the previous request code to:
$curl = curl_init();

// Set some options - we are passing in a user agent too here
curl_setopt_array($curl, array(
    // Return the content as a string instead of printing it
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_URL => 'http://www.exemplo.com',
    // Identification name of your robot
    CURLOPT_USERAGENT => 'Name of your crawler',
    // Indicates that the request uses the POST method
    CURLOPT_POST => 1,
    // Parameters that will be sent via POST
    CURLOPT_POSTFIELDS => array(
        'item1' => 'value',
        'item2' => 'value2'
    )
));

// Make the request and save the response in the $response variable
$response = curl_exec($curl);

// Close the request handle
curl_close($curl);
$dom = new DOMDocument('1.0');

// Parse the string returned by the request
// Note that the method changed from loadHTMLFile to loadHTML
$dom->loadHTML($response);
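From here you can interpret the result by querying the parsed response. A minimal sketch using DOMXPath, where the class name 'error' is only a hypothetical example of what the target site might use:

// Sketch: check the parsed response for an error block ('error' is a hypothetical class name)
$xpath = new DOMXPath($dom);
$errors = $xpath->query("//div[@class='error']");

if ($errors->length > 0) {
    echo 'Submission failed: ' . trim($errors->item(0)->textContent);
} else {
    echo 'Form submitted successfully.';
}

If the site requires a login session, you can also set CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE on the cURL handle so that cookies received from the login request are sent back on the following requests.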
I managed to create the robot using Selenium Server + WebDriver. It is quite complete and lets you control all of a browser's features.
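For reference, a minimal sketch with the php-webdriver client (installed via Composer as php-webdriver/webdriver); the URL, field names and selectors below are hypothetical, and a Selenium Server must be running on localhost:4444:

// Sketch with php-webdriver - the URL, field names and selectors are hypothetical
require 'vendor/autoload.php';

use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\WebDriverBy;

// Connect to a running Selenium Server
$driver = RemoteWebDriver::create('http://localhost:4444/wd/hub', DesiredCapabilities::chrome());

// Open the login page and fill in the fields
$driver->get('http://www.exemplo.com/login');
$driver->findElement(WebDriverBy::name('username'))->sendKeys('my_login');
$driver->findElement(WebDriverBy::name('password'))->sendKeys('my_password');
$driver->findElement(WebDriverBy::cssSelector('button[type=submit]'))->click();

// Read the result shown on screen
echo $driver->findElement(WebDriverBy::tagName('body'))->getText();

$driver->quit();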
Take a look here, I think you’ll find everything you need. https://php.net/manual/en/book.curl.php
– claudsan
The best language for this is C# Windows Forms (or any other .NET language), because you can use the WebBrowser control to navigate; doing this in PHP would be a nightmare.
– Tuyoshi Vinicius
Would you like to build a web crawler?
– rray