WebClient 404 redirect error when downloading zip files

Time: 2015-02-23 03:36:57

Tags: .net vb.net download zip webclient

I want to download 5 zip files from a website.

  

    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part1_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part2_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part3_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part4_5.zip
    http://download.companieshouse.gov.uk/BasicCompanyData-2015-02-01-part5_5.zip

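However, when I use the code below I get a 404 error. I think this is because the http:// is removed when I navigate to the page in a browser, but not when I use my code.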

    Try
        Dim reg As String = """.*zip"""
        Dim list As New List(Of String)()
        Dim list2 As New List(Of String)()
        Dim myRegex As New Regex(reg, RegexOptions.None)
        TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower
        For Each myMatch As Match In myRegex.Matches(TextBox1.Text) 
            list.Add(myMatch.Value)
        Next
        Dim temp As String
        For Each i In list
            temp = i.Remove(0, 1)
            temp = temp.Remove(temp.Length - 1, 1)
            list2.Add(temp)
        Next
        Dim x As Integer = 1
        For Each i In list2
            Dim address As String = "http://download.companieshouse.gov.uk/" + i
            Dim des As String = Application.StartupPath + "\" + x.ToString + ".zip"
            Dim client As New System.Net.WebClient()
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")
            client.DownloadFile(address, des)
            x = x + 1
        Next

        ' x ends up one past the last downloaded file, so stop at x - 1 and use the loop variable i.
        For i As Integer = 1 To x - 1
            Shell(Application.StartupPath & "\7za.exe e " & Application.StartupPath & "\" & i.ToString & ".zip")
        Next
        list.Clear()
    Catch ex As Exception
        MsgBox(ex.ToString)
    End Try

Any ideas?

*Update: I have included the full code instead of a snippet.

2 answers:

Answer 0: (score: 3)

It may be the way you store or build the file name data. There are also one or two other issues in your code:

Private filList As New List(Of String) From {"BasicCompanyData-2015-02-01-part1_5.zip",
                                         "BasicCompanyData-2015-02-01-part2_5.zip",
                                         "BasicCompanyData-2015-02-01-part3_5.zip",
                                         "BasicCompanyData-2015-02-01-part4_5.zip",
                                         "BasicCompanyData-2015-02-01-part5_5.zip"}

Then elsewhere, in a button click:

Dim destPath As String = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments)
Dim destFile As String
Dim baseURL As String = "http://download.companieshouse.gov.uk/"
Dim thisURL As String

Using wc As New WebClient
    wc.Headers.Add("user-agent", 
            "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)")

    For Each f As String In filList
        thisURL = baseURL & f
        destFile = Path.Combine(destPath, f)
        wc.DownloadFile(thisURL, destFile)
    Next

End Using
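
  1. The Using block ensures the WebClient is closed, disposed, and its resources released.
  2. Application.StartupPath works when running from VS, but a deployed app may fail once it is installed to Program Files... because your app may not be allowed to write there. Use Environment.GetFolderPath to get MyDocuments or a similar folder.
  3. This version keeps the original file names, so the files will not overwrite each other when you download other sets later (another issue that can arise when using App StartupPath).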

Answer 1: (score: 0)

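The problem was in the way I found the links on the page. I was reading the web page into a text box and applying .ToLower to it, so the URLs were wrong by the time I extracted them: the file names contain capital letters (BasicCompanyData-...), and the lowercased versions evidently do not exist on the server, hence the 404.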

The problem line:

TextBox1.Text = New System.Net.WebClient().DownloadString("http://download.companieshouse.gov.uk/en_output.html").ToLower
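
One way to avoid the lowercasing altogether is to leave the page text untouched and make the regular expression itself case-insensitive. The following is only a minimal sketch under a few assumptions: the module name, the ExtractZipLinks helper, and the sample HTML fragment are all made up for illustration, and the real page markup may differ. The pattern is also slightly tighter than the original ".*zip", using a capture group so the surrounding quotes do not have to be trimmed afterwards.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module LinkExtraction
    ' Hypothetical helper: returns the quoted .zip links found in html,
    ' keeping their original casing.
    Function ExtractZipLinks(html As String) As List(Of String)
        Dim links As New List(Of String)()
        ' IgnoreCase makes the match case-insensitive without changing the text.
        ' The capture group holds the link without the surrounding quotes.
        Dim zipLink As New Regex("""([^""]*\.zip)""", RegexOptions.IgnoreCase)
        For Each m As Match In zipLink.Matches(html)
            links.Add(m.Groups(1).Value)
        Next
        Return links
    End Function

    Sub Main()
        ' Assumed fragment standing in for the downloaded page.
        Dim sample As String = "<a href=""BasicCompanyData-2015-02-01-part1_5.zip"">part 1</a>"
        For Each link As String In ExtractZipLinks(sample)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

Each returned link could then be appended to the base URL and passed to DownloadFile as in the first answer, with the capital letters in the file names preserved.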