I need to access a site from my work network, which sits behind a proxy.
Some sites can be reached with the httr and rvest packages, others cannot. Logging in to a site, for example, does not work. Example:
pro <- use_proxy("minha.proxy", porta, "meuusuario", "minhasenha")
my_session <- html_session(url, pro)
I usually use this proxy function to reach the URL I want through the proxy.
But on certain sites, when a login is required, this approach does not work; or rather, I cannot log in.
The alternative I found was to use a remote driver, for example with the function rsDriver(browser=c("chrome")). On my personal PC I can run all the code through the remote driver of the RSelenium package.
On the work network, however, I cannot get it to work.
The best options I found researching were:
1)
cprof <- list(chromeOptions = list(
args = c('--proxy-server=http://minha.proxy:porta',
'--proxy-auth=usuario:senha')))
driver<- rsDriver(browser=c("chrome"), extraCapabilities = cprof)
2)
cprof <- list(chromeOptions = list(
args = c('--proxy-server=http://ip:porta',
'--proxy-auth=usuario:senha')))
driver<- rsDriver(browser=c("chrome"), extraCapabilities = cprof)
Both are meant to get through the proxy, but every attempt returns:
checking Selenium Server versions:
BEGIN: PREDOWNLOAD
Error in open.connection(con, "rb") :
Timeout was reached: Connection timed out after 10000 milliseconds
I think this is the error that usually appears when the request does not go through the proxy.
So, is there any way to get through the proxy and open my remote driver? If you have anything to contribute, I'd be grateful!
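A note on the timeout above, as a sketch rather than a confirmed fix: the PREDOWNLOAD step is R itself fetching the Selenium and driver binaries before any browser starts, so the chromeOptions arguments cannot influence it. libcurl, which R uses for these downloads, normally honors the http_proxy and https_proxy environment variables. The helper set_proxy_env below is hypothetical, the port 8080 and the credentials are placeholders, and it assumes the proxy accepts basic credentials embedded in the URL.

```r
# Hypothetical helper (not part of RSelenium): builds a proxy URL and
# exports it as environment variables for the current R session.
set_proxy_env <- function(host, port, user = NULL, password = NULL) {
  credentials <- ""
  if (!is.null(user) && !is.null(password)) {
    # Percent-encode the credentials so characters such as "@" or ":"
    # in a password do not break the URL.
    credentials <- paste0(
      utils::URLencode(user, reserved = TRUE), ":",
      utils::URLencode(password, reserved = TRUE), "@"
    )
  }
  proxy_url <- paste0("http://", credentials, host, ":", port)
  Sys.setenv(http_proxy = proxy_url, https_proxy = proxy_url)
  invisible(proxy_url)
}

# Placeholder values; replace with the real proxy host, port and login.
set_proxy_env("minha.proxy", 8080, "meuusuario", "minhasenha")

# With the variables set, the driver download can be retried:
# library(RSelenium)
# driver <- rsDriver(browser = "chrome")
```

The variables only last for the current session. Whether this is enough depends on the proxy: one that requires NTLM or another non-basic authentication scheme will probably still refuse the connection.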
Have you already tried with another browser?
– Daniel Falbel
Yes, Daniel, firefox and phantomjs too. But I get the same error.
– Pablo Dias Vieira
Have you tried something like this: https://stackoverflow.com/a/29663818/3297472
– Daniel Falbel
This kind of thing is very hard to debug. I don't have any proxy server to test with!
– Daniel Falbel
I tried that too; the problem is that the function phantom no longer seems to work: "Error: phantom is now defunct. Users can drive PhantomJS via selenium using the RSelenium::rsDriver function or directly using wdman::phantomjs". I tried to switch to wdman::phantomjs, but I was unsuccessful because I didn't fully understand it.
– Pablo Dias Vieira
I agree that it is very difficult. I think each proxy server has its own particularities, which makes it even harder.
– Pablo Dias Vieira
Try using the Node library puppeteer.js with R; maybe you can manage with this API. I wrote something about getting started with puppeteer.js here. The puppeteer.js API is here.
– JdeMello
Thanks @Jdemello. With this method I managed, but before running it I had to change the environment variables. Would it be appropriate to post the solution I found as an answer to my question?
– Pablo Dias Vieira
Yes, please do. I find it a pertinent question. Thank you
– JdeMello
Can you use Docker on the work server? I think the containerized version of Selenium would solve the problem.
– José