Downloading with a session from WebClient or WebBrowser (.NET)

Time: 2017-05-16 21:15:02

Tags: c# vb.net pdf

I have a problem downloading a PDF file from a page such as this one:

https://publicaccess.solihull.gov.uk/online-applications/applicationDetails.do?activeTab=documents&keyVal=OPQ691OEKGD00

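On that page you can see an "Application Form" row whose last column contains a PDF link. I can already parse the PDF link using HtmlAgilityPack, but the problem appears when I do the following for that link:

WebBrowser1.Navigate(docUrl)
While (WebBrowser1.ReadyState <> WebBrowserReadyState.Complete)
    System.Windows.Forms.Application.DoEvents()
End While
Dim client As New WebClient
client.Headers.Add(HttpRequestHeader.Cookie, WebBrowser1.Document.Cookie)
client.DownloadFile(New Uri(pdfLink), "appForm.pdf")

The download just returns a 404. The PDF link itself does not change, so the page is probably session-based. As for the headers, WebBrowser1.Document.Cookie returns null even right after the page has finished loading. Is there any way to make this work?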

By the way, this is the PDF link. You can try opening it directly, without first clicking through the page, to see the problem:

https://publicaccess.solihull.gov.uk/online-applications/files/B30089CDFBE1BBCC3D0E7A598DEFEA61/pdf/PL_2017_01205_PPRM-APPLICATION_FORM_NO_PERSONAL_DATA-711302.pdf

1 answer:

Answer 0 (score: 0)

Here is a cookie-aware WebClient. Credit goes to Pavel Savara.

using System;
using System.Net;

// A WebClient that keeps cookies across requests via a shared CookieContainer.
public class WebClientEx : WebClient
{
    private CookieContainer container;

    public WebClientEx() // Added to original code
        : this(new CookieContainer())
    {
    }

    public WebClientEx(CookieContainer container)
    {
        this.container = container;
    }

    public CookieContainer CookieContainer
    {
        get { return container; }
        set { container = value; }
    }

    // Attach the shared cookie container to every outgoing HTTP request.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest r = base.GetWebRequest(address);
        var request = r as HttpWebRequest;
        if (request != null)
        {
            request.CookieContainer = container;
        }
        return r;
    }

    protected override WebResponse GetWebResponse(WebRequest request, IAsyncResult result)
    {
        WebResponse response = base.GetWebResponse(request, result);
        ReadCookies(response);
        return response;
    }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        WebResponse response = base.GetWebResponse(request);
        ReadCookies(response);
        return response;
    }

    // Copy any cookies from an HTTP response into the shared container.
    private void ReadCookies(WebResponse r)
    {
        var response = r as HttpWebResponse;
        if (response != null)
        {
            CookieCollection cookies = response.Cookies;
            container.Add(cookies);
        }
    }
}
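
To connect this back to the question, here is a minimal sketch of how such a client might be used. It assumes the 404 is caused by a missing session cookie, which the question only suspects. It also assumes that requesting the documents page once is enough for the server to issue that cookie. The SessionDownload class and its Fetch method are hypothetical names introduced for illustration; they are not part of the original answer or of .NET.

```csharp
using System;
using System.Net;

public static class SessionDownload
{
    // Hypothetical helper: requests pageUrl first so the server can issue
    // its session cookie, then downloads fileUrl with the same client.
    // Cookies are only carried over if the client is cookie-aware
    // (for example, the WebClientEx class above); a plain WebClient
    // would perform both requests without sharing any cookies.
    public static void Fetch(WebClient client, Uri pageUrl, Uri fileUrl, string savePath)
    {
        // The page body is discarded; the request is made only so the
        // client can collect cookies from the response.
        client.DownloadString(pageUrl);

        // Second request on the same client instance, so any cookies
        // stored by the first request are sent along.
        client.DownloadFile(fileUrl, savePath);
    }
}
```

With the class from this answer compiled into the same project, a call could look like SessionDownload.Fetch(new WebClientEx(), new Uri(docUrl), new Uri(pdfLink), "appForm.pdf"), where docUrl and pdfLink are the variables from the question. This removes the WebBrowser control from the download path, so it does not depend on WebBrowser1.Document.Cookie. That property cannot expose cookies marked HttpOnly, which may be why it came back null here.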