How to automatically download files from a site that generates the link with JavaScript?

Viewed 1,767 times

I need to write a program that automatically downloads PDFs from various websites every day. This is easy to do with the C# WebClient class, but on some websites the download URL cannot be found anywhere in the page. When the download button is clicked, the site's code calls a JavaScript function that generates the link on the fly. I tried sending a web request that included the session cookies (identified with Fiddler) in an attempt to get the PDF from the server response, but without success.

To reproduce, click "Search Diaries" in the left corner of the site.

Using the WatiN library, which automates a web browser, I can simulate the click on the button, but I cannot handle Internet Explorer's "Want to save or open the file" dialog.

Is there any way to download files from sites like this?

1 answer


Although the form is submitted by JavaScript, at some point an HTTP request is sent with the data the server needs to return the right file. I suggest you analyze the HTTP request headers and see which parameters are sent in the form. I ran a test here, and the Form Data arrives with some parameters containing random numbers. Based on that, I think you can work out which parameter is the main one for the PDF request.

For example, in this case the form is submitted via POST on line 5287 of the Common1_2_13.js file.
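
To make the idea concrete, here is a minimal sketch of replaying that form POST while keeping the session cookies. It is written in Python with only the standard library rather than C#, and it is not the site's actual protocol: the URLs, the field names and the helper names (build_opener_with_cookies, build_form_post, download_pdf) are all hypothetical. You would have to copy the real URL and Form Data from what Fiddler shows, and any random parameters may need to be read from the page first.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_opener_with_cookies():
    """Return an opener that stores session cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_form_post(url, fields):
    """Build a POST request whose body is URL-encoded form data."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def download_pdf(opener, landing_url, post_url, fields, out_path):
    """Open the landing page to obtain a session, then replay the form POST."""
    opener.open(landing_url).close()  # the server sets its session cookies here
    with opener.open(build_form_post(post_url, fields)) as response:
        data = response.read()
    if not data.startswith(b"%PDF"):
        # Probably an HTML error page or an expired session.
        raise ValueError("response does not look like a PDF")
    with open(out_path, "wb") as handle:
        handle.write(data)
    return len(data)


# Hypothetical usage; every URL and field name below is a placeholder:
# opener, _ = build_opener_with_cookies()
# download_pdf(opener, "https://example.invalid/", "https://example.invalid/download",
#              {"documentId": "123"}, "diary.pdf")
```

The same opener is used for both requests so that the cookies set by the first response are sent with the POST. In C#, I believe the equivalent would be a CookieContainer shared between the requests, but I have not tested that against this site.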

Writing a crawler for this site is quite complicated, because it has a lot of automatically generated code and requires a session that expires in 30 minutes.

As for your alternative of using a simulated web browser, I have used Selenium before; if I'm not mistaken, it has drivers for multiple browsers (http://docs.seleniumhq.org/docs/03_webdriver.jsp).
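
With Selenium, one way to avoid the save/open dialog is to start the browser with preferences that save PDFs straight to disk. The sketch below is in Python and assumes Firefox rather than Internet Explorer, plus the third-party selenium package and a working geckodriver. The helper names pdf_download_prefs and make_driver are made up for illustration; the preference keys are standard Firefox settings.

```python
import os


def pdf_download_prefs(download_dir):
    """Firefox preferences that save PDFs to download_dir without prompting."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.dir": os.path.abspath(download_dir),
        "browser.helperApps.neverAsk.saveToDisk": "application/pdf",
        "pdfjs.disabled": True,  # do not open PDFs in the built-in viewer
    }


def make_driver(download_dir):
    """Start Firefox with the preferences above (needs selenium installed)."""
    from selenium import webdriver  # imported lazily; third-party package

    options = webdriver.FirefoxOptions()
    for name, value in pdf_download_prefs(download_dir).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

After make_driver("downloads"), clicking the download button through the driver should write the PDF into that folder without any dialog, provided the server sends the file with the application/pdf content type.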

  • I am using Selenium now. With it, downloads can be set to happen automatically, which skips the IE download window. Thank you very much for the reply. Cheers!
