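File download code hits an unexpected 404 error

Date: 2019-07-14 12:51:24

Tags: vb.net file download http-status-code-404

My code downloads files in a loop, but after the last file has been downloaded it keeps trying to download files that don't exist. The website responds with a redirect and a 404 error.

I'm new to Visual Basic, so I'm asking for help here.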

My.Computer.Network.DownloadFile(strFullUrlDownload, strFullSavePath, False, 1000)

Public Class Form1

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

    Dim strMainUrl As String = "http://jixxer.com/123/"
    Dim dt As DateTime = DateTime.Now
    Dim dtDate As String = dt.ToString("yyyy-MM-dd")
    Dim strSlash As String = "/"
    Dim strPdf As String = "pdf"
    Dim strDot As String = "."
    Dim strPage As String = "page"
    Dim strPageNbr As String = 1
    Dim intCounter As Integer = 1
    Dim strPageCounter As String = String.Format("{0:000}", intCounter)

    Dim strSavePath As String = "D:\dls\title1\"

    Dim strFullSavePath As String = strSavePath & strPageCounter & strDot & strPdf
    Dim strFullUrlDownload As String = strMainUrl & dtDate & strSlash & strPdf & strSlash & strPage & strPageNbr & strDot & strPdf

    Do Until strPageCounter = 200

        ' Downloads the resource with the specified URI to a local file.

        My.Computer.Network.DownloadFile(strFullUrlDownload, strFullSavePath, False, 1000)
        intCounter = intCounter + 1
        strPageNbr = strPageNbr + 1
        strPageCounter = String.Format("{0:000}", intCounter)

        strFullSavePath = strSavePath & strPageCounter & strDot & strPdf
        strFullUrlDownload = strMainUrl & dtDate & strSlash & strPdf & strSlash & strPage & strPageNbr & strDot & strPdf

    Loop

End Sub
End Class

Try

    'TRY to download the file using https first...
    My.Computer.Network.DownloadFile(New Uri("https://" & ServerAddress & WebLogoPath & Convert.ToString(RowArray(0)) & ".png"), Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) & "\" & AppDataFolder & PCLogoPath & Convert.ToString(RowArray(0)) & ".png", "", "", False, 500, True)
Catch ex_https As Exception
    'Unable to locate file or write file
    'If the operation timed out...
    If (ex_https.Message = "The operation has timed out") Then
        'Re-TRY to download the file using http instead, as a time out error may indicate that HTTPS is not supported.
        Try
            My.Computer.Network.DownloadFile(New Uri("http://" & ServerAddress & WebLogoPath & Convert.ToString(RowArray(0)) & ".png"), Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) & "\" & AppDataFolder & PCLogoPath & Convert.ToString(RowArray(0)) & ".png", "", "", False, 500, True)
        Catch ex_http As Exception
            'Most likely, the file doesn't exist on the server. Either way, we cannot obtain the file so we need to perform the same action, 
            'which is handled outside of this Try block.
        End Try
    Else
        'This is most likely a 404 error. Either way, we cannot obtain the file (and the connection is not timing out) - so
        'we need to perform the same action, which is handled outside of this Try block.
    End If
End Try

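I only set the counter to 200 to test the loop and make sure it works. I know I need a way to exit when an error occurs, but I'm not sure yet how to code it. Any help is appreciated.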

1 answer:

Answer 0: (score: 0)

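If you don't know how many documents are stored in that remote directory, you have to handle the exception that is raised when a page isn't found.
When you request resources from a site there is always a chance of receiving a WebException, so you should handle that case anyway.

I suggest using the WebClient class directly instead of Network.DownloadFile(). The latter can be handy if you want to show a predefined progress UI, but with WebClient you can perform the downloads asynchronously, if needed, using the async/await pattern and the WebClient.DownloadFileTaskAsync() method.

Another suggestion: put the download code in a method, so you can call it from anywhere in your code. You can store such methods in a class or module so they don't clutter the UI code, and you can easily reuse them in other projects by including the files that contain them.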

Your code could be modified as follows (synchronous version):

You need to pass the remote base address (http://jixxer.com/123) and the path where the files are stored (filesPath) to the DownloadPdfPages method.
The third and fourth parameters are optional:
- If you don't specify resourceName, Date.Now.ToString("yyyy-MM-dd") is assumed.
- If you don't specify startPage, it defaults to 1, which translates to page1.pdf (the example here asks for pages starting from page 3).

Note: I'm using string interpolation here: $"page{startPage + pageCount}.pdf"
If your VB.Net version doesn't support it, use String.Format() instead, e.g. String.Format("page{0}.pdf", startPage + pageCount).

' At the top of the file:
Imports System.IO
Imports System.Net

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim numberOfPages = DownloadPdfPages("http://jixxer.com/123", "D:\dls\title1", "", 3)
    If numberOfPages > 0 Then
        MessageBox.Show($"Download completed. Number of pages: {numberOfPages}")
    Else
        MessageBox.Show("Download failed")
    End If
End Sub

Private Function DownloadPdfPages(baseAddress As String, filesPath As String, Optional resourceName As String = "", Optional startPage As Integer = 1) As Integer
    If String.IsNullOrEmpty(resourceName) Then resourceName = Date.Now.ToString("yyyy-MM-dd")

    ' Join URL segments with "/" (Path.Combine is intended for file-system paths).
    Dim resourceAddr = baseAddress.TrimEnd("/"c) & "/" & resourceName & "/pdf"
    Dim pageCount = 0
    Dim client = New WebClient()

    Try
        ' Keep downloading consecutive pages until the server reports an error.
        Do
            Dim documentName = $"page{startPage + pageCount}.pdf"
            Dim resourceUri = New Uri(resourceAddr & "/" & documentName, UriKind.Absolute)
            Dim fileName = Path.Combine(filesPath, documentName)
            client.DownloadFile(resourceUri, fileName)
            pageCount += 1
        Loop
    Catch ex As WebException
        If ex.Response IsNot Nothing Then
            Dim statusCode = DirectCast(ex.Response, HttpWebResponse).StatusCode
            ' A 404 means there are no more pages: report how many were downloaded.
            If statusCode = HttpStatusCode.NotFound Then
                Return pageCount
            End If
        ElseIf ex.Status = WebExceptionStatus.ProtocolError AndAlso ex.Message.Contains("404") Then
            Return pageCount
        Else
            ' Log and/or ...
            Throw
        End If
        ' Any other HTTP error is reported as a failed download.
        Return 0
    Finally
        client.Dispose()
    End Try
End Function

Asynchronous version, using the WebClient.DownloadFileTaskAsync() method.
Only a few changes are needed: note the Async keyword added to both the Button.Click handler and the DownloadPdfPagesAsync() method.
The Await keyword is then used to wait for the method to complete without blocking the UI.
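The asynchronous version described above could look roughly like the sketch below. It is an illustration rather than the answer's exact code: it assumes .NET Framework 4.5 or later, a form with a Button1 as in the question, and the same parameters as the synchronous DownloadPdfPages method. The error handling is simplified (a 404 ends the loop, any other HTTP error returns 0, anything else is rethrown), and the URL segments are joined with "/" by hand, which is my own choice.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Threading.Tasks

Public Class Form1

    ' Async Sub is acceptable here because this is an event handler.
    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim numberOfPages = Await DownloadPdfPagesAsync("http://jixxer.com/123", "D:\dls\title1", "", 3)
        If numberOfPages > 0 Then
            MessageBox.Show($"Download completed. Number of pages: {numberOfPages}")
        Else
            MessageBox.Show("Download failed")
        End If
    End Sub

    Private Async Function DownloadPdfPagesAsync(baseAddress As String,
                                                 filesPath As String,
                                                 Optional resourceName As String = "",
                                                 Optional startPage As Integer = 1) As Task(Of Integer)
        If String.IsNullOrEmpty(resourceName) Then resourceName = Date.Now.ToString("yyyy-MM-dd")

        ' Assumption: URL segments are joined with "/" by hand.
        Dim root = baseAddress.TrimEnd("/"c)
        Dim resourceAddr = $"{root}/{resourceName}/pdf"
        Dim pageCount = 0

        Using client As New WebClient()
            Try
                Do
                    Dim documentName = $"page{startPage + pageCount}.pdf"
                    Dim resourceUri = New Uri($"{resourceAddr}/{documentName}", UriKind.Absolute)
                    Dim fileName = Path.Combine(filesPath, documentName)
                    ' Control returns to the UI thread's message loop while the file downloads.
                    Await client.DownloadFileTaskAsync(resourceUri, fileName)
                    pageCount += 1
                Loop
            Catch ex As WebException
                Dim response = TryCast(ex.Response, HttpWebResponse)
                If response Is Nothing Then
                    ' No HTTP response at all (DNS failure, timeout, ...): let the caller decide.
                    Throw
                End If
                If response.StatusCode = HttpStatusCode.NotFound Then
                    ' First missing page: report how many pages were downloaded.
                    Return pageCount
                End If
                ' Any other HTTP error is reported as a failed download.
                Return 0
            End Try
        End Using
    End Function

End Class
```

The Using block takes the place of the explicit Finally/Dispose pair, and the loop still ends only when the server reports that a page is missing.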