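My code downloads files in a loop, but after the last file has been downloaded it keeps trying to download files that don't exist. The website responds with a redirect and a 404 error.
I'm new to Visual Basic, so I'm asking for help here.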
Public Class Form1
    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim strMainUrl As String = "http://jixxer.com/123/"
        Dim dt As DateTime = DateTime.Now
        Dim dtDate As String = dt.ToString("yyyy-MM-dd")
        Dim strSlash As String = "/"
        Dim strPdf As String = "pdf"
        Dim strDot As String = "."
        Dim strPage As String = "page"
        Dim strPageNbr As String = 1
        Dim intCounter As Integer = 1
        Dim strPageCounter As String = String.Format("{0:000}", intCounter)
        Dim strSavePath As String = "D:\dls\title1\"
        Dim strFullSavePath As String = strSavePath & strPageCounter & strDot & strPdf
        Dim strFullUrlDownload As String = strMainUrl & dtDate & strSlash & strPdf & strSlash & strPage & strPageNbr & strDot & strPdf
        Do Until strPageCounter = 200
            ' Downloads the resource with the specified URI to a local file.
            My.Computer.Network.DownloadFile(strFullUrlDownload, strFullSavePath, False, 1000)
            intCounter = intCounter + 1
            strPageNbr = strPageNbr + 1
            strPageCounter = String.Format("{0:000}", intCounter)
            strFullSavePath = strSavePath & strPageCounter & strDot & strPdf
            strFullUrlDownload = strMainUrl & dtDate & strSlash & strPdf & strSlash & strPage & strPageNbr & strDot & strPdf
        Loop
    End Sub
End Class
Try
    'TRY to download the file using https first...
    My.Computer.Network.DownloadFile(New Uri("https://" & ServerAddress & WebLogoPath & Convert.ToString(RowArray(0)) & ".png"), Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) & "\" & AppDataFolder & PCLogoPath & Convert.ToString(RowArray(0)) & ".png", "", "", False, 500, True)
Catch ex_https As Exception
    'Unable to locate file or write file
    'If the operation timed out...
    If (ex_https.Message = "The operation has timed out") Then
        'Re-TRY to download the file using http instead, as a time out error may indicate that HTTPS is not supported.
        Try
            My.Computer.Network.DownloadFile(New Uri("http://" & ServerAddress & WebLogoPath & Convert.ToString(RowArray(0)) & ".png"), Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData) & "\" & AppDataFolder & PCLogoPath & Convert.ToString(RowArray(0)) & ".png", "", "", False, 500, True)
        Catch ex_http As Exception
            'Most likely, the file doesn't exist on the server. Either way, we cannot obtain the file so we need to perform the same action,
            'which is handled outside of this Try block.
        End Try
    Else
        'This is most likely a 404 error. Either way, we cannot obtain the file (and the connection is not timing out) - so
        'we need to perform the same action, which is handled outside of this Try block.
    End If
End Try
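I only set the counter to 200 for testing, to make sure the loop works. I know I need a way to exit when an error occurs, but I'm not sure how to code it yet. Any help is appreciated.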
Answer 0 (score: 0)
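If you don't know how many documents are stored in that remote directory, you have to handle the exception raised when a page is not found.
A WebException is always possible when you request a resource from a site, so you should handle this case anyway.
I suggest using the WebClient class directly instead of Network.DownloadFile(). The latter can be handy if you want to show its predefined progress UI, but with WebClient you can also perform the download asynchronously, if needed, using the async/await pattern and the WebClient.DownloadFileTaskAsync() method.
Another suggestion: put the download logic in a method, so you can call it from anywhere in your code. You can store such methods in a class or a module so they don't clutter the UI code, and you can reuse them in other projects just by including the file that contains them.
Your code could be modified as follows (synchronous version):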
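You need to pass to the DownloadPdfPages method the remote base address (http://jixxer.com/123) and the path where the files are to be stored (filesPath).
The third and fourth parameters are optional:
- If you don't specify a resourceName, Date.Now.ToString("yyyy-MM-dd") is assumed.
- If you don't specify a startPage, it defaults to 1, which translates to page1.pdf (the example here asks to start from page 3).
Note: I'm using String Interpolation here: $"page{startPage + pageCount}.pdf".
If your VB.Net version doesn't support it, use String.Format() instead.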
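For compilers without string interpolation, the file name can be built with String.Format(). The following is only a minimal sketch: the module name FormatExample and the sample values are made up for illustration, and only the format pattern mirrors the interpolated string from the note above.

```vbnet
Imports System

' Hypothetical console module, used only to illustrate the String.Format() alternative.
Module FormatExample
    Sub Main()
        Dim startPage As Integer = 3   ' sample value: start from page 3
        Dim pageCount As Integer = 0   ' no pages downloaded yet

        ' Equivalent to $"page{startPage + pageCount}.pdf"
        Dim documentName As String = String.Format("page{0}.pdf", startPage + pageCount)

        Console.WriteLine(documentName) ' prints page3.pdf
    End Sub
End Module
```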
' At the top of the file:
Imports System.IO
Imports System.Net

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    ' An empty resourceName means "today's date"; downloading starts from page 3.
    Dim numberOfPages = DownloadPdfPages("http://jixxer.com/123", "D:\dls\title1", "", 3)
    If numberOfPages > 0 Then
        MessageBox.Show($"Download completed. Number of pages: {numberOfPages}")
    Else
        MessageBox.Show("Download failed")
    End If
End Sub

' Downloads page{N}.pdf files in sequence until the server answers 404.
' Returns the number of pages downloaded.
Private Function DownloadPdfPages(baseAddress As String, filesPath As String, Optional resourceName As String = "", Optional startPage As Integer = 1) As Integer
    If String.IsNullOrEmpty(resourceName) Then resourceName = Date.Now.ToString("yyyy-MM-dd")
    ' Join URL segments with "/"; Path.Combine is meant for file system paths.
    Dim resourceAddr = baseAddress.TrimEnd("/"c) & "/" & resourceName & "/pdf"
    Dim pageCount = 0
    Dim client = New WebClient()
    Try
        Do
            Dim documentName = $"page{startPage + pageCount}.pdf"
            Dim resourceUri = New Uri(resourceAddr & "/" & documentName, UriKind.Absolute)
            Dim fileName = Path.Combine(filesPath, documentName)
            client.DownloadFile(resourceUri, fileName)
            pageCount += 1
        Loop
    Catch ex As WebException
        If ex.Response IsNot Nothing Then
            Dim statusCode = DirectCast(ex.Response, HttpWebResponse).StatusCode
            If statusCode = HttpStatusCode.NotFound Then
                ' 404: there are no more pages, report how many were downloaded.
                Return pageCount
            End If
        ElseIf ex.Status = WebExceptionStatus.ProtocolError AndAlso ex.Message.Contains("404") Then
            Return pageCount
        Else
            ' Log and/or ...
            Throw
        End If
        ' Any other HTTP status is reported as a failed download.
        Return 0
    Finally
        client.Dispose()
    End Try
End Function
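Asynchronous version, using the WebClient.DownloadFileTaskAsync() method.
Only a few changes are needed: note the Async keyword added to both the Button.Click handler and the DownloadPdfPagesAsync() method.
The Await keyword is then used to wait for the method to complete without blocking the UI: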
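The asynchronous listing itself is not reproduced in this thread, so the following is only a minimal sketch of what it could look like. It assumes that DownloadPdfPagesAsync() takes the same parameters as the synchronous method and that the handler lives in the Form1 class from the question; the URL is joined with plain string concatenation, which is my own choice rather than something the answer specifies.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Threading.Tasks
Imports System.Windows.Forms

Public Class Form1
    ' Async handler: the UI stays responsive while the downloads run.
    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim numberOfPages = Await DownloadPdfPagesAsync("http://jixxer.com/123", "D:\dls\title1", "", 3)
        If numberOfPages > 0 Then
            MessageBox.Show($"Download completed. Number of pages: {numberOfPages}")
        Else
            MessageBox.Show("Download failed")
        End If
    End Sub

    ' Assumed signature: mirrors the synchronous method, but returns Task(Of Integer).
    Private Async Function DownloadPdfPagesAsync(baseAddress As String, filesPath As String,
                                                 Optional resourceName As String = "",
                                                 Optional startPage As Integer = 1) As Task(Of Integer)
        If String.IsNullOrEmpty(resourceName) Then resourceName = Date.Now.ToString("yyyy-MM-dd")
        Dim root = baseAddress.TrimEnd("/"c)
        Dim resourceAddr = $"{root}/{resourceName}/pdf"
        Dim pageCount = 0

        Using client As New WebClient()
            Try
                Do
                    Dim documentName = $"page{startPage + pageCount}.pdf"
                    Dim resourceUri = New Uri($"{resourceAddr}/{documentName}", UriKind.Absolute)
                    Dim fileName = Path.Combine(filesPath, documentName)
                    ' Await yields to the caller until this file has been downloaded.
                    Await client.DownloadFileTaskAsync(resourceUri, fileName)
                    pageCount += 1
                Loop
            Catch ex As WebException
                Dim response = TryCast(ex.Response, HttpWebResponse)
                If response IsNot Nothing Then
                    ' 404 means there are no more pages; any other HTTP status counts as a failure.
                    If response.StatusCode = HttpStatusCode.NotFound Then Return pageCount
                    Return 0
                End If
                ' No HTTP response at all (DNS failure, timeout, ...): log and/or rethrow.
                Throw
            End Try
        End Using
    End Function
End Class
```

The loop has no fixed upper bound, as in the synchronous version: it ends when the server answers 404 for the next page, and the number of pages downloaded so far is returned.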