VB.NET - Download a zip into memory and extract files from memory to disk

Time: 2011-07-07 15:31:39

Tags: vb.net httpwebresponse dotnetzip

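Despite finding examples, I'm having some trouble. I think it may be an encoding problem, but I'm not sure. I'm trying to download a file from an https server using cookies (hence I'm using HttpWebRequest). I'm debug-printing the capacity of the streams to check them, but the output [raw] file looks different. I've tried other encodings to no avail.

Code: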

    Sub downloadzip(strURL As String, strDestDir As String)

    Dim request As HttpWebRequest
    Dim response As HttpWebResponse

    request = Net.HttpWebRequest.Create(strURL)
    request.UserAgent = strUserAgent
    request.Method = "GET"
    request.CookieContainer = cookieJar
    response = request.GetResponse()

    If response.ContentType = "application/zip" Then
        Debug.WriteLine("Is Zip")
    Else
        Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
        Exit Sub
    End If

    Dim intLen As Int64 = response.ContentLength
    Debug.WriteLine("response length: " + intLen.ToString)

    Using srStreamRemote As StreamReader = New StreamReader(response.GetResponseStream(), Encoding.Default)
        'Using ms As New MemoryStream(intLen)
        Dim fullfile As String = srStreamRemote.ReadToEnd

        Dim memstream As MemoryStream = New MemoryStream(New UnicodeEncoding().GetBytes(fullfile))

        'test write out to file
        Dim data As Byte() = memstream.ToArray()
        Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
            filestrm.Write(data, 0, data.Length)
        End Using

        Debug.WriteLine("Memstream capacity " + memstream.Capacity.ToString)
        'Dim strData As String = srStreamRemote.ReadToEnd
        memstream.Seek(0, 0)
        Dim buffer As Byte() = New Byte(2048) {}
        Using zip As New ZipInputStream(memstream)
            Debug.WriteLine("zip stream cap " + zip.Length.ToString)
            zip.Seek(0, 0)
            Dim e As ZipEntry

            Dim flag As Boolean = True
            Do While flag ' daft, but won't assign e=zip... tries to evaluate
                e = zip.GetNextEntry
                If IsNothing(e) Then
                    flag = False
                    Exit Do
                Else
                    e.UseUnicodeAsNecessary = True
                End If

                If Not e.IsDirectory Then
                    Debug.WriteLine("Writing out " + e.FileName)
                    '    e.Extract(strDestDir)

                    Using output As FileStream = File.Open(Path.Combine(strDestDir, e.FileName), _
                                                          FileMode.Create, FileAccess.ReadWrite)
                        Dim n As Integer
                        Do While (n = zip.Read(buffer, 0, buffer.Length) > 0)
                            output.Write(buffer, 0, n)
                        Loop
                    End Using

                End If
            Loop
        End Using
        'End Using
    End Using 'srStreamRemote.Close()
    response.Close()
End Sub

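So I download a file of the right size, but DotNetZip doesn't recognize it, and the files copied out are incomplete/invalid zips. I've spent most of today on this and am ready to give up.

3 Answers:

Answer 0 (score: 4)

I think the answer will be to break the problem down, and perhaps to change a few aspects of your code.

For example, let's get rid of converting the response stream to a string: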

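Dim memStream As MemoryStream
Using rdr As System.IO.Stream = response.GetResponseStream()
    ' Assumes the server sent a Content-Length header (ContentLength is -1 otherwise).
    Dim count = Convert.ToInt32(response.ContentLength)
    ' VB array bounds are inclusive: New Byte(count - 1) holds exactly count bytes.
    Dim buffer = New Byte(count - 1) {}
    Dim bytesRead As Integer = 0
    Do While bytesRead < count
        Dim n As Integer = rdr.Read(buffer, bytesRead, count - bytesRead)
        If n = 0 Then Exit Do ' stream ended early; avoid looping forever
        bytesRead += n
    Loop
    ' Wrap only the bytes that were actually read.
    memStream = New MemoryStream(buffer, 0, bytesRead)
End Using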
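
To illustrate why the string round trip in the question is risky, here is a minimal sketch. It is not the asker's code: the sample bytes and the module name are made up for illustration, and `Encoding.UTF8` stands in for `Encoding.Default` (whose meaning varies by platform). It decodes binary data as text, re-encodes it as UTF-16 the way `New UnicodeEncoding().GetBytes` does, and compares the result with the input.

```vbnet
Imports System
Imports System.Linq
Imports System.Text

Module EncodingRoundTripDemo
    Sub Main()
        ' Hypothetical sample: the zip local-file-header signature followed by
        ' two bytes that do not form valid UTF-8 text.
        Dim original As Byte() = {&H50, &H4B, &H3, &H4, &HFF, &H80}

        ' In effect what the question does: decode the bytes as text with one
        ' encoding, then re-encode the string with a different one (UTF-16).
        Dim asText As String = Encoding.UTF8.GetString(original)
        Dim roundTripped As Byte() = New UnicodeEncoding().GetBytes(asText)

        Console.WriteLine("original length:      {0}", original.Length)
        Console.WriteLine("round-tripped length: {0}", roundTripped.Length)
        Console.WriteLine("identical bytes:      {0}", original.SequenceEqual(roundTripped))
    End Sub
End Module
```

Because UTF-16 uses at least two bytes per character, the re-encoded array cannot match the input byte for byte, which is why keeping the data as bytes from start to finish is the safer route.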

Next, there's a simpler way to write the contents of the memory stream to a file. Consider that your code

Dim data As Byte() = memstream.ToArray()
Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    filestrm.Write(data, 0, data.Length)
End Using

can be replaced with

Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    memstream.WriteTo(filestrm)
End Using

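This eliminates the need to copy the memory stream into another byte array and then push that byte array down the stream. The memory stream can write its data directly to the file (via the file stream), which saves the intermediate buffer.

I admit I haven't used the zip/compression library you are using, but with the modifications above you've removed the unnecessary transfers between streams, byte arrays, strings, and so on, and hopefully eliminated the encoding problem you were having.

Give that a try and let us know how you get on. Consider trying to open the file you saved ("c:\temp\debug.zip") to see whether it is reported as corrupt. If it isn't, then you at least know that the code up to that point is working.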
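
As a quick sanity check before opening the saved file by hand, one option is to look at its first two bytes. This is only a sketch: the function name `LooksLikeZip` is made up for illustration, and the test only confirms that the file starts with the "PK" signature that zip archives begin with, not that the archive is intact.

```vbnet
Imports System
Imports System.IO

Module ZipSignatureCheck
    ' Hypothetical helper: True when the file starts with the bytes "PK" (&H50, &H4B).
    Function LooksLikeZip(filePath As String) As Boolean
        Using fs As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            ' ReadByte returns -1 at end of file, which fails both comparisons.
            Dim first As Integer = fs.ReadByte()
            Dim second As Integer = fs.ReadByte()
            Return first = &H50 AndAlso second = &H4B
        End Using
    End Function

    Sub Main()
        ' Path taken from the debug output in the question.
        Console.WriteLine(LooksLikeZip("c:\temp\debug.zip"))
    End Sub
End Module
```

If this prints False, the bytes were most likely altered before they reached the file, which points back at the download step rather than at the zip library.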

Answer 1 (score: 2)

I thought I'd post my own full working solution to my own question. It combines the two excellent replies I received. Thanks, guys.

Sub downloadzip(strURL As String, strDestDir As String)
    Try

        Dim request As HttpWebRequest
        Dim response As HttpWebResponse

        request = Net.HttpWebRequest.Create(strURL)
        request.UserAgent = strUserAgent
        request.Method = "GET"
        request.CookieContainer = cookieJar
        response = request.GetResponse()

        If response.ContentType = "application/zip" Then
            Debug.WriteLine("Is Zip")
        Else
            Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
            Exit Sub
        End If

        Dim intLen As Int32 = response.ContentLength
        Debug.WriteLine("response length: " + intLen.ToString)

        Dim memStream As MemoryStream
        Using stmResponse As IO.Stream = response.GetResponseStream()
            'Using ms As New MemoryStream(intLen)

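            ' VB array bounds are inclusive: New Byte(intLen - 1) holds exactly intLen bytes.
            Dim buffer = New Byte(intLen - 1) {}
            'Dim memstream As MemoryStream = New MemoryStream(buffer)

            Dim bytesRead As Integer
            Do While bytesRead < intLen
                Dim n As Integer = stmResponse.Read(buffer, bytesRead, intLen - bytesRead)
                If n = 0 Then Exit Do ' stream ended before Content-Length bytes arrived
                bytesRead += n
            Loop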

            memStream = New MemoryStream(buffer)

            Dim res As Boolean = False
            res = ZipExtracttoFile(memStream, strDestDir)

        End Using 'srStreamRemote.Close()
        response.Close()



    Catch ex As Exception
        'to do :)
    End Try
End Sub


Function ZipExtracttoFile(strm As MemoryStream, strDestDir As String) As Boolean

    Try
        Using zip As ZipFile = ZipFile.Read(strm)
            For Each e As ZipEntry In zip

                e.Extract(strDestDir)

            Next
        End Using
    Catch ex As Exception
        Return False
    End Try

    Return True

End Function

Answer 2 (score: 1)

You can download into a MemoryStream and then inspect it:

Public Sub Download(url as String)
    Dim req As HttpWebRequest = System.Net.WebRequest.Create(url)
    req.Method = "GET"
    Dim resp As HttpWebResponse = req.GetResponse()
    If resp.ContentType = "application/zip" Then
        Console.Error.Write("The result is a zip file.")
        Dim length As Int64 = resp.ContentLength
        If length = -1 Then
            Console.Error.WriteLine("... length unspecified")
            length = 16 * 1024
        Else
            Console.Error.WriteLine("... has length {0}", length)
        End If
        Dim ms As New MemoryStream
        CopyStream(resp.GetResponseStream(), ms)  '' **see note below!!!!
        '' list contents of the zip file
        ms.Seek(0,SeekOrigin.Begin)
        Using zip As ZipFile = ZipFile.Read (ms)
            Dim e As ZipEntry
            Console.Error.WriteLine("Entries:")
            Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                    "Name", "compressed", "uncompressed")
            Console.Error.WriteLine("----------------------------------------------------")
            For Each e In zip
                Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                        e.FileName, _
                                        e.CompressedSize, _
                                        e.UncompressedSize)
            Next
        End Using
    Else
        Console.Error.WriteLine("The result is Not a zip file.")
        CopyStream(resp.GetResponseStream(), Console.OpenStandardOutput)
    End If
End Sub


Private Shared Sub CopyStream(input As Stream, output As Stream)
    Dim buffer(32768 - 1) As Byte
    Dim n As Int32
    Do
        n = input.Read(buffer, 0, buffer.Length)
        If n = 0 Then Exit Do
        output.Write(buffer, 0, n)
    Loop
End Sub

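Edit

Just one note: I wouldn't recommend this code (this approach) if the zip file is very large. How large is "very large"? That depends, of course. The code I suggested above downloads the file into a memory stream, which means the entire contents of the zip file are held in memory. If it's a 28 KB zip file, that's no problem. But if it's a 2 GB zip file, you could have a big problem.

In that case, you'd want to stream it to a temporary file on disk rather than to a MemoryStream. I'll leave that as an exercise for the reader.

The above works for "reasonably sized" zip files, where "reasonable" depends on your machine configuration and application scenario.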
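
For what it's worth, the temporary-file approach mentioned in the edit could be sketched roughly as follows. This is an illustration under assumptions, not part of the original answer: the module and method names are made up, the cookie and user-agent setup from the question is omitted, `Stream.CopyTo` requires .NET 4 or later, and DotNetZip's `ZipFile.Read(String)` and `ExtractAll(String)` are used in place of the entry-by-entry loop.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports Ionic.Zip

Module TempFileZipDownload
    ' Hypothetical helper: streams the response body to a temp file, then extracts it.
    Sub DownloadAndExtract(url As String, destDir As String)
        Dim tempPath As String = Path.GetTempFileName()
        Try
            Dim req As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
            Using resp As HttpWebResponse = CType(req.GetResponse(), HttpWebResponse)
                Using body As Stream = resp.GetResponseStream()
                    Using tempFile As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
                        ' Copies in chunks, so the whole archive is never held in memory.
                        body.CopyTo(tempFile)
                    End Using
                End Using
            End Using

            ' Open the archive from disk and extract every entry under destDir.
            Using zip As ZipFile = ZipFile.Read(tempPath)
                zip.ExtractAll(destDir)
            End Using
        Finally
            ' Remove the temp file whether or not extraction succeeded.
            File.Delete(tempPath)
        End Try
    End Sub
End Module
```

The trade-off is extra disk I/O in exchange for memory use that stays flat regardless of the archive size.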