How can I automatically download a PDF from a web page?

-2

Is there any way to download a PDF embedded in a web page? I am currently using WebBrowser.ShowSaveAsDialog() (Save As), but I need to do it automatically, without the dialog, in C# Windows Forms.

My code:

private void button1_Click_1(object sender, EventArgs e)
{
    browserPlus1.Navigate("https://www3.webiss.com.br/aracajuse/FormRelNFSe.aspx?tipo=emitirrelatorio&MostrarRel=false&idRec=verificarnfse&IdNotaEletronica=17926183&Expiration=10032016055357&Verificador=566");
    browserPlus1.ShowSaveAsDialog();

}

In my case, though, the URL does not have a .pdf extension.

  • What do you mean by "embedded PDF"? Try to describe your situation better.

  • An embedded PDF is a PDF file that opens in the browser and whose URL has no .pdf extension, usually served by .aspx pages. Unfortunately, I cannot post the link because it seems that is not allowed.

  • Show an example. Show what you did.

  • It is in the body of the question, which shows what I am doing: "WebBrowser.ShowSaveAsDialog();"

  • http://meta.pt.stackoverflow.com/a/1911/101

4 answers

2


I got it working as shown below, using DllImport:

    // Requires: using System; using System.Runtime.InteropServices;
    /// <summary>
    /// The URLMON library contains this function, URLDownloadToFile, which is a way
    /// to download files without user prompts.  The ExecWB( _SAVEAS ) function always
    /// prompts the user, even if _DONTPROMPTUSER parameter is specified, for "internet
    /// security reasons".  This function gets around those reasons.
    /// </summary>
    /// <param name="pCaller">Pointer to caller object (AX).</param>
    /// <param name="szURL">String of the URL.</param>
    /// <param name="szFileName">String of the destination filename/path.</param>
    /// <param name="dwReserved">[reserved].</param>
    /// <param name="lpfnCB">A callback function to monitor progress or abort.</param>
    /// <remarks>Because PreserveSig = false, a failing HRESULT is thrown as an exception instead of being returned.</remarks>
    [DllImport("urlmon.dll", CharSet = CharSet.Auto, PreserveSig = false)]
    private static extern void URLDownloadToFile(
        [MarshalAs(UnmanagedType.IUnknown)] object pCaller,
        [MarshalAs(UnmanagedType.LPTStr)] string szURL,
        [MarshalAs(UnmanagedType.LPTStr)] string szFileName,
        Int32 dwReserved,
        IntPtr lpfnCB);

// Build the report URL and the destination path, then download without prompting.
string url = cidade_municipio + @"FormRelNFSe.aspx?tipo=emitirrelatorio&MostrarRel=false&idRec=verificarnfse&IdNotaEletronica=" + nfs + "&Expiration=23032015031453&Verificador=" + j;
string filePath = txtSalvar.Text + "\\" + nfs.ToString() + "_" + j.ToString() + ".pdf";
URLDownloadToFile(null, url, filePath, 0, IntPtr.Zero);
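
The comments on this answer report that the saved file contains HTML rather than a PDF. One way to detect that case is to check the file signature after the download: a PDF normally begins with the ASCII bytes "%PDF-". The sketch below is only an illustration. The class and method names (PdfCheck, LooksLikePdf, FileLooksLikePdf) are hypothetical and do not come from the answer.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfCheck
{
    // A PDF file normally starts with the ASCII signature "%PDF-".
    private static readonly byte[] Signature = Encoding.ASCII.GetBytes("%PDF-");

    // Returns true when the buffer begins with the PDF signature.
    public static bool LooksLikePdf(byte[] data)
    {
        if (data == null || data.Length < Signature.Length)
            return false;
        for (int i = 0; i < Signature.Length; i++)
        {
            if (data[i] != Signature[i])
                return false;
        }
        return true;
    }

    // Reads only the first bytes of a file on disk and checks them.
    public static bool FileLooksLikePdf(string path)
    {
        byte[] head = new byte[Signature.Length];
        using (FileStream stream = File.OpenRead(path))
        {
            int read = stream.Read(head, 0, head.Length);
            if (read < head.Length)
                return false; // too short to be a PDF
        }
        return LooksLikePdf(head);
    }
}
```

Calling PdfCheck.FileLooksLikePdf with the destination path passed to URLDownloadToFile would return false when the server answered with an HTML page (for example a CAPTCHA form) instead of the document.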

2

  • I have used it and it doesn't work: it downloads HTML. I have also tried MemoryStream and WebClient, and neither has any effect.

  • 1

    So the example you posted is not the way you are actually using it. If you have tried other approaches, you should say so in the question. Put there what you really did. Show your code. Show what really happens with the file.

  • I have already posted the code and the URL I use for testing.

  • Just remember that if you are going to test it, you have to select Public Access and fill in the CAPTCHA.

  • 2

    According to the comments on the other answer, there is a CAPTCHA to be filled in before the download. There is no way around it, since a CAPTCHA is used precisely to prevent a program from downloading the contents of the PDF.

  • 2

    @Renan seems to be right; this is an XY problem.

  • http://meta.pt.stackoverflow.com/questions/499/o-que-%C3%A9-o-problema-xy

0

Just copy and paste the code below and see if it helps!

using (WebClient webclient = new WebClient()) // requires: using System.Net;
    webclient.DownloadFile(url_link, @"C:\meudocumentoem.pdf"); // destination path; must be writable

-2

Based on Shadow Wizard's answer on Stack Overflow in English, and assuming the server sends the Content-Disposition header:

// Requires: using System; using System.IO; using System.Net;
using (WebClient client = new WebClient())
{
    using (Stream rawStream = client.OpenRead(url))
    {
        string fileName = string.Empty;
        string contentDisposition = client.ResponseHeaders["content-disposition"];
        if (!string.IsNullOrEmpty(contentDisposition))
        {
            string lookFor = "filename=";
            int index = contentDisposition.IndexOf(lookFor, StringComparison.OrdinalIgnoreCase);
            if (index >= 0)
                // Take the text after "filename=" and strip optional quotes.
                fileName = contentDisposition.Substring(index + lookFor.Length).Trim('"', ' ');
        }
        if (fileName.Length > 0)
        {
            // Copy the raw bytes: reading a PDF as text would corrupt it.
            // Server.MapPath exists only in ASP.NET; in Windows Forms build the path with Path.Combine.
            using (FileStream output = File.Create(Server.MapPath(fileName)))
            {
                rawStream.CopyTo(output);
            }
        }
    }
}

If you need me to explain the answer, leave a comment and I will edit it.
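
As an alternative to searching the header for "filename=" by hand, the .NET class System.Net.Mime.ContentDisposition can parse the value. The sketch below is only an illustration under that assumption. The helper name DispositionHelper.GetFileName and its fallback parameter are hypothetical and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Net.Mime;

public static class DispositionHelper
{
    // Returns the file name from a Content-Disposition header value,
    // or the fallback when the header is missing or cannot be parsed.
    public static string GetFileName(string headerValue, string fallback)
    {
        if (string.IsNullOrEmpty(headerValue))
            return fallback;
        try
        {
            string name = new ContentDisposition(headerValue).FileName;
            // Path.GetFileName drops any directory part the server might send.
            return string.IsNullOrEmpty(name) ? fallback : Path.GetFileName(name);
        }
        catch (FormatException)
        {
            // The header did not follow the expected syntax.
            return fallback;
        }
    }
}
```

It could be called as DispositionHelper.GetFileName(client.ResponseHeaders["content-disposition"], "download.pdf"), where "download.pdf" is just a placeholder default.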

  • I have used this too and it didn't work. It behaves like WebClient and only returns HTML, not the PDF.

  • Is it because of the CAPTCHA that comes before it?

  • So there is a CAPTCHA before it? If so, at download time, is the page you get the CAPTCHA page or a blank page?

  • 1

    From what I have noticed, there really is a CAPTCHA, which makes the situation MUCH more complex.

  • The page that comes back is the CAPTCHA itself. Is there some way to make WebBrowser.ShowSaveAsDialog() choose a location by itself, or run automatically?

  • Then it is not possible; the CAPTCHA was made to prevent exactly this.

  • Thank you all, but unfortunately, because of the CAPTCHA I have to get through first, there is no possibility of automating this without using WebBrowser.ShowSaveAsDialog().
