Web automation with Selenium or other tools without opening the browser?

Viewed 427 times

0

Hello, I use Selenium in Node.js to log in to a site and then grab all the generated HTML, because that HTML contains a table that I need to convert into JSON.

I know that automation which opens a browser to collect this information requires a lot of processing. Since I would like to build an API on top of it, I don't think leaving a VPS opening several browsers to collect the information would work.


Is there another way to do this, without having to open a browser the way Selenium does?

  • If you describe only the general problem, you will get only a general answer. Example: "How do I build a house?" gets "Use blocks and cement, build solid walls". Instead, ask something specific that can be answered usefully: "How do I raise a wall safely using this type of block, with this slope and this height?" gets "Position the blocks in this pattern, follow this block-laying procedure, do not use this tool because of this risk, and here is a working example of a finished wall so you can see how it is done [link]". See? Overly broad questions don't help.

  • Did any answer help solve the problem, and could it help other users with similar questions? If so, make sure to mark that answer as accepted. To do this, just click on the left side of it (below the up and down vote indicator).

2 answers

1

I believe you can configure the browser to run without an interface, in headless mode, as in this example:

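// Browser-specific option builders.
const chrome = require('selenium-webdriver/chrome');
const firefox = require('selenium-webdriver/firefox');
const {Builder, By, Key, until} = require('selenium-webdriver');

// Window size used by the headless browser.
const screen = {
  width: 640,
  height: 480
};

// Options are set for both browsers; only the one named in forBrowser() is used.
// If your selenium-webdriver release no longer has headless(), passing
// addArguments('--headless') to the Options object should be the equivalent.
let driver = new Builder()
    .forBrowser('chrome')
    .setChromeOptions(new chrome.Options().headless().windowSize(screen))
    .setFirefoxOptions(new firefox.Options().headless().windowSize(screen))
    .build();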

Or use a headless browser like PhantomJS.
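
To tie this back to the question (log in, then turn the table into JSON), here is a minimal sketch of the whole flow in headless mode. The URL, the login field names, the selectors and the names `scrapeTable` and `rowsToObjects` are assumptions for illustration only, and `addArguments('--headless')` is used here instead of the `headless()` helper:

```javascript
// Pure helper: pair each data row with the header names.
function rowsToObjects(headers, rows) {
  return rows.map((cells) =>
    Object.fromEntries(headers.map((name, i) => [name, cells[i] ?? null]))
  );
}

// Hypothetical flow: url, credentials and selectors are placeholders.
async function scrapeTable({ url, user, password }) {
  // Required lazily so the helper above works without Selenium installed.
  const { Builder, By, until } = require('selenium-webdriver');
  const chrome = require('selenium-webdriver/chrome');

  const options = new chrome.Options().addArguments('--headless');
  const driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();

  try {
    await driver.get(url);

    // Assumed login form: adjust the locators to the real page.
    await driver.findElement(By.name('username')).sendKeys(user);
    await driver.findElement(By.name('password')).sendKeys(password);
    await driver.findElement(By.css('button[type="submit"]')).click();

    // Wait until the table exists before reading it.
    await driver.wait(until.elementLocated(By.css('table')), 10000);

    // Read every row as an array of trimmed cell texts, inside the page.
    const [headers = [], ...rows] = await driver.executeScript(`
      return Array.from(document.querySelectorAll('table tr'), (tr) =>
        Array.from(tr.querySelectorAll('th, td'), (cell) => cell.textContent.trim()));
    `);
    return rowsToObjects(headers, rows);
  } finally {
    // Always close the browser, even if a step above fails.
    await driver.quit();
  }
}
```

Calling `scrapeTable({ url, user, password })` should resolve to an array with one object per table row, keyed by the header cells, which can then be passed to `JSON.stringify`.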

  • 1

    Perfect, thanks. Based on your answer I found two sites that help me a lot: http://www.codeatest.com/chrome-headless-selenium-webdriver/ and https://gist.github.com/anandsunderraman/e351485319a8a0e7df7e

0


There is a Node library called Puppeteer for automation purposes.


Puppeteer

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.
  • Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
  • Capture a timeline trace of your site to help diagnose performance issues.
  • Test Chrome Extensions.
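
For the case in the question, a rough sketch with Puppeteer could look like the following. The URL, the `table tr` selector and the function names are assumptions, and it presumes the first row of the table holds the column names:

```javascript
// Runs inside the page: receives the matched <tr> elements and returns
// one plain object per data row, keyed by the texts of the first row.
function tableRowsToObjects(trs) {
  const [headers, ...rows] = trs.map((tr) =>
    Array.from(tr.querySelectorAll('th, td'), (cell) => cell.textContent.trim())
  );
  return rows.map((cells) =>
    Object.fromEntries(headers.map((name, i) => [name, cells[i]]))
  );
}

// Hypothetical entry point: url is a placeholder for the real page.
async function scrapeWithPuppeteer(url) {
  // Required lazily so the function above can be tried without Puppeteer.
  const puppeteer = require('puppeteer');

  // launch() starts a headless browser by default, so no window opens.
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });

    // $$eval serializes the function and runs it in the page context,
    // so it must not rely on variables from this file.
    return await page.$$eval('table tr', tableRowsToObjects);
  } finally {
    await browser.close();
  }
}
```

A browser still runs in the background here, so the memory cost per request is closer to Selenium's than to that of a plain HTTP request.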

It is also possible to make the HTTP requests directly from Node and use a parser such as cheerio to traverse the result, bearing in mind that this way the page is not interpreted as it would be in a browser.


Cheerio is not a web browser

Cheerio parses markup and provides an API for traversing/manipulating the resulting data structure. It does not interpret the result as a web browser does. Specifically, it does not produce a visual rendering, apply CSS, load external resources, or run JavaScript.
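
As an illustration of the request-plus-parser route, here is a minimal sketch. It assumes Node 18 or later (for the global `fetch`), that the table is already present in the HTML the server returns, and that you already hold a session cookie from the login; `fetchTableAsJson`, `tableToJson` and the `cookie` parameter are hypothetical names:

```javascript
// Pure helper: first row is taken as the header, the rest as data rows.
function tableToJson(table) {
  const [headers = [], ...rows] = table;
  const objects = rows.map((cells) =>
    Object.fromEntries(headers.map((name, i) => [name, cells[i] ?? null]))
  );
  return JSON.stringify(objects);
}

// Hypothetical entry point: url and cookie are placeholders.
async function fetchTableAsJson(url, cookie = '') {
  // Required lazily so the helper above works without cheerio installed.
  const cheerio = require('cheerio');

  // Send the session cookie only if one was provided.
  const response = await fetch(url, { headers: cookie ? { cookie } : {} });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  // Parse the returned markup; no scripts are executed at this point.
  const $ = cheerio.load(await response.text());
  const table = $('table tr')
    .toArray()
    .map((tr) =>
      $(tr)
        .find('th, td')
        .toArray()
        .map((cell) => $(cell).text().trim())
    );
  return tableToJson(table);
}
```

If the table is built by JavaScript after the page loads, this route will come back empty, and a headless browser is the option that remains.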

