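I'd like to know whether there is a way to download only part of a .rar or .zip file without downloading the whole thing. Say a zip file contains files A, B, C and D, and I only need A. Can I somehow tweak the download so that only A is fetched or, if possible, extract the file on the server and get only A?
Answer 0 (score: 11)
The trick is to do what Sergio suggests without doing it by hand. This is easy if you mount the zip file through an HTTP-backed virtual filesystem and then run the standard unzip command on it. That way the unzip utility's I/O calls are translated into HTTP range GETs, so only the relevant chunks of the zip are transferred over the network.
Here is an example for Linux using HTTPFS, a very lightweight virtual filesystem (it uses FUSE). Similar tools exist for Windows.
Get/build httpfs: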
$ wget http://sourceforge.net/projects/httpfs/files/httpfs/1.06.07.02
$ tar -xjf httpfs_1.06.07.10.tar.bz2
$ rm httpfs
$ ./make_httpfs
Mount the remote zip file and extract one file from it:
$ mkdir mount_pt
$ sudo ./httpfs http://server.com/zipfile.zip mount_pt
$ sudo ls mount_pt
zipfile.zip
$ sudo unzip -p mount_pt/zipfile.zip the_file_I_want.txt > the_file_I_want.txt
$ sudo umount mount_pt
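Of course, you can also use any other tool besides the command-line ones. (I needed sudo because that appears to be how FUSE is set up on my machine; you shouldn't need it.)
I know this is an old question; this is for others who come across it.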
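For readers who cannot mount a FUSE filesystem, the same idea can be sketched in Python: wrap the remote file in a seekable file object whose reads become range requests, then hand it to the standard `zipfile` module. This is only a sketch under assumptions: `RangeReader`, `http_fetch`, `remote_size` and `read_remote_member` are hypothetical names introduced here, the server is assumed to honour `Range` and to report `Content-Length` on `HEAD`, and nothing below is how httpfs itself is implemented.

```python
import io
import urllib.request
import zipfile


class RangeReader(io.RawIOBase):
    """Read-only, seekable file object that loads bytes on demand.

    `fetch(start, end)` is a caller-supplied callback returning the bytes
    start..end (inclusive), for example through an HTTP Range request.
    """

    def __init__(self, size, fetch):
        super().__init__()
        self._size = size
        self._fetch = fetch
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            self._pos = offset
        elif whence == io.SEEK_CUR:
            self._pos += offset
        elif whence == io.SEEK_END:
            self._pos = self._size + offset
        else:
            raise ValueError("invalid whence")
        return self._pos

    def readinto(self, buffer):
        # Never ask for bytes past the end of the remote file.
        n = min(len(buffer), self._size - self._pos)
        if n <= 0:
            return 0
        data = self._fetch(self._pos, self._pos + n - 1)
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)


def http_fetch(url):
    """Build a fetch(start, end) callback that issues HTTP Range requests."""
    def fetch(start, end):
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            if resp.status != 206:  # 206 Partial Content
                raise OSError("server ignored the Range header")
            return resp.read()
    return fetch


def remote_size(url):
    """Ask the server for the archive size with a HEAD request."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        return int(resp.headers["Content-Length"])


def read_remote_member(url, member):
    """Return the bytes of one archive member, fetching only what is needed."""
    raw = RangeReader(remote_size(url), http_fetch(url))
    # The buffer keeps zipfile's many small reads from becoming many requests.
    with zipfile.ZipFile(io.BufferedReader(raw, 64 * 1024)) as zf:
        return zf.read(member)
```

With the URL from the example above, `read_remote_member("http://server.com/zipfile.zip", "the_file_I_want.txt")` would play the role of the `unzip -p` call. Keeping the transport in a callback means the reader can be exercised against an in-memory archive before pointing it at a real server.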
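Answer 1 (score: 7)
To a degree, yes, you can.
The ZIP file format has a "central directory". Basically, it is a table that lists the files in the archive and the offsets at which they are stored.
So, using an HTTP Range request (the server answers with a Content-Range header), you can download the tail of the file (the central directory is the last thing in a zip file) and try to identify the central directory in it. If you succeed, you know the file list and the offsets, so you can go on to fetch those chunks individually and decompress them yourself.
This approach is quite error-prone and is not guaranteed to work. But then, the same is true of hacks in general :-)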
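As a rough illustration of the "identify the central directory" step, the sketch below searches a downloaded tail for the end-of-central-directory record and reads the directory's offset and size from it. The function name `find_central_directory` is an assumption made for this example, ZIP64 archives are not handled, and the layout used is the standard 22-byte record followed by an optional comment of up to 65535 bytes, which is why a tail of 65557 bytes is always enough for a non-ZIP64 archive.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
# signature, disk number, disk holding the directory, entries on this disk,
# total entries, directory size, directory offset, comment length
EOCD_FMT = "<4sHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FMT)  # 22 bytes
MAX_TAIL = EOCD_SIZE + 0xFFFF          # record plus the longest possible comment


def find_central_directory(tail):
    """Locate the end-of-central-directory record inside `tail`.

    `tail` holds the last bytes of the archive, e.g. the body returned for
    the request header "Range: bytes=-65557". Returns a tuple
    (cd_offset, cd_size, entry_count); cd_offset counts from the start of
    the whole archive, not from the start of `tail`.
    """
    pos = tail.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_SIZE <= len(tail):
            fields = struct.unpack_from(EOCD_FMT, tail, pos)
            entry_count, cd_size, cd_offset, comment_len = fields[4:8]
            # A genuine record is followed by exactly its comment, then EOF.
            if pos + EOCD_SIZE + comment_len == len(tail):
                return cd_offset, cd_size, entry_count
        # The signature may also occur by chance inside data; keep looking.
        pos = tail.rfind(EOCD_SIG, 0, pos)
    raise ValueError("no end-of-central-directory record found in tail")
```

A second range request for `bytes=cd_offset-(cd_offset + cd_size - 1)` would then return the directory itself, unless the tail already happens to contain it.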
Another possible approach is to build a custom server for this (see @pst's answer for details).
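Answer 2 (score: 3)
There are several ways for an ordinary user to download a single file from a compressed ZIP archive; unfortunately, they are not widely known. There are some open-source tools and online web services, including: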
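Answer 3 (score: 1)
You can use FDM (Free Download Manager), which supports partial download of zip files: it lets you download only the parts of a zip file that you need.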
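Answer 4 (score: 0)
I think Sergio Tulentsev's idea is great.
However, if you control the server (that is, you can deploy custom code), then it is a fairly simple operation (in the scheme of things :) to map/handle a request, extract the relevant part of the ZIP archive, and send the data back in the HTTP stream.
The request might look like this: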
http://foo.bar/myfile.zip_a.jpeg
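This would mean: extract "a.jpeg" from "myfile.zip" and return it.
(I deliberately chose this silly format so that browsers would likely pick "myfile.zip_a.jpeg" as the name in the download dialog when it appears.)
Of course, how this is implemented depends on the server/language/framework, and there may already be existing solutions that support a similar operation (but I don't know of any).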
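To make the idea concrete, here is one possible shape for such a handler, using only Python's standard library. It is a sketch under assumptions rather than a recommended server: the names `split_request`, `extract_member` and `ZipMemberHandler` are invented for this example, the archives are assumed to live in a single directory `root`, and the `.zip_` separator simply follows the URL format shown above.

```python
import zipfile
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from urllib.parse import unquote, urlsplit


def split_request(path):
    """Split '/myfile.zip_a.jpeg' into ('myfile.zip', 'a.jpeg')."""
    archive, sep, member = path.lstrip("/").partition(".zip_")
    if not sep or not archive or not member:
        raise ValueError("expected <archive>.zip_<member>")
    return archive + ".zip", member


def extract_member(root, request_path):
    """Return the bytes of the requested member from an archive under root."""
    archive, member = split_request(request_path)
    base = Path(root).resolve()
    zip_path = (base / archive).resolve()
    # Refuse paths such as '../secret.zip_x' that escape the archive directory.
    if base not in zip_path.parents:
        raise PermissionError("archive outside of root")
    with zipfile.ZipFile(zip_path) as zf:
        return zf.read(member)


class ZipMemberHandler(BaseHTTPRequestHandler):
    root = "."  # directory holding the zip files (assumption for this sketch)

    def do_GET(self):
        try:
            path = unquote(urlsplit(self.path).path)
            body = extract_member(self.root, path)
        except (ValueError, KeyError, OSError, zipfile.BadZipFile):
            # Bad URL format, unknown member, missing or broken archive.
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), ZipMemberHandler).serve_forever()
```

With this running, a request for `/myfile.zip_a.jpeg` would return only `a.jpeg`. A real deployment would also want caching and a proper content type, which are left out here.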
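Happy coding.
Answer 5 (score: 0)
Alternatively, use the Google Docs viewer. Go to this link, https://docs.google.com/viewer?url=http://file.zip, and change the address to that of your zip file. It can open both zip and rar files.
Answer 6 (score: 0)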
Can you arrange for your file to appear in the back of the zip?
Download 100k:
$ curl -r -100000 https://www.keepassx.org/releases/2.0.2/KeePassX-2.0.2.zip -o tail.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 97k 100 97k 0 0 84739 0 0:00:01 0:00:01 --:--:-- 84817
Check what files we did get:
$ unzip -t tail.zip
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
error [tail.zip]: attempt to seek before beginning of zipfile
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
error [tail.zip]: attempt to seek before beginning of zipfile
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
error [tail.zip]: attempt to seek before beginning of zipfile
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
error [tail.zip]: attempt to seek before beginning of zipfile
(please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)
testing: KeePassX-2.0.2/share/translations/keepassx_uk.qm OK
testing: KeePassX-2.0.2/share/translations/keepassx_zh_CN.qm OK
testing: KeePassX-2.0.2/share/translations/keepassx_zh_TW.qm OK
testing: KeePassX-2.0.2/zlib1.dll OK
At least one error was detected in tail.zip.
Then extract the last file:
$ unzip tail.zip KeePassX-2.0.2/zlib1.dll
Archive: tail.zip
error [tail.zip]: missing 7751495 bytes in zipfile
(attempting to process anyway)
inflating: KeePassX-2.0.2/zlib1.dll
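The same tail trick can be sketched in Python for anyone without the `unzip` tool. The helper name `members_in_tail` is an assumption for this example; it pads the missing front of the archive with zeros so that the offsets stored in the central directory stay valid, then extracts every member whose local header lies inside the downloaded tail. Note that the padding allocates a buffer as large as the whole archive, which is fine for a few megabytes but wasteful for very large files.

```python
import io
import zipfile


def members_in_tail(tail, total_size):
    """Return {name: content} for each member stored entirely inside `tail`.

    `tail` is the last len(tail) bytes of an archive that is `total_size`
    bytes long. The tail must contain the whole central directory.
    """
    missing = total_size - len(tail)
    # Zero padding stands in for the part of the archive we never fetched.
    padded = io.BytesIO(bytes(missing) + tail)
    found = {}
    with zipfile.ZipFile(padded) as zf:
        for info in zf.infolist():
            # Members starting before the tail are incomplete; skip them.
            if info.header_offset >= missing and not info.is_dir():
                found[info.filename] = zf.read(info)
    return found
```

For the download above, `total_size` would be the archive's full length as reported by the server, and `tail` the contents of `tail.zip`.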
Answer 7 (score: 0)
Based on the good input here, I wrote a code snippet in PowerShell to show how it works:
# demo code downloading a single DLL file from an online ZIP archive
# and extracting the DLL into memory to mount it finally to the main process.
cls
Remove-Variable * -ea 0
# definition for the ZIP archive, the file to be extracted and the checksum:
$url = 'https://github.com/sshnet/SSH.NET/releases/download/2020.0.1/SSH.NET-2020.0.1-bin.zip'
$sub = 'net40/Renci.SshNet.dll'
$md5 = '5B1AF51340F333CD8A49376B13AFCF9C'
# prepare HTTP client:
Add-Type -AssemblyName System.Net.Http
$handler = [System.Net.Http.HttpClientHandler]::new()
$client = [System.Net.Http.HttpClient]::new($handler)
# get the length of the ZIP archive:
$req = [System.Net.HttpWebRequest]::Create($url)
$req.Method = 'HEAD'
$length = $req.GetResponse().ContentLength
$zip = [byte[]]::new($length)
# get the last 10k:
# how to get the correct length of the central ZIP directory here?
$start = $length-10kb
$end = $length-1
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$last10kb = $result.content.ReadAsByteArrayAsync().Result
$last10kb.CopyTo($zip, $start)
# get the block containing the DLL file:
# how to get the exact file-offset from the ZIP directory?
$start = $length-3537kb
$end = $length-3201kb
$client.DefaultRequestHeaders.Clear()
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$block = $result.content.ReadAsByteArrayAsync().Result
$block.CopyTo($zip, $start)
# extract the DLL file from archive:
Add-Type -AssemblyName System.IO.Compression
$stream = [System.IO.Memorystream]::new()
$stream.Write($zip,0,$zip.Length)
$archive = [System.IO.Compression.ZipArchive]::new($stream)
$entry = $archive.GetEntry($sub)
# Stream.Read may return fewer bytes than requested, so copy the whole entry:
$out = [System.IO.MemoryStream]::new()
$entry.Open().CopyTo($out)
$bytes = $out.ToArray()
# check MD5:
$prov = [Security.Cryptography.MD5CryptoServiceProvider]::new().ComputeHash($bytes)
$hash = [string]::Concat($prov.foreach{$_.ToString("x2")})
if ($hash -ne $md5) {write-host 'dll has wrong checksum.' -f y ;break}
# load the DLL:
[void][System.Reflection.Assembly]::Load($bytes)
# use the single demo-call from the DLL:
$test = [Renci.SshNet.NoneAuthenticationMethod]::new('test')
'done.'
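The two open questions in the script's comments (how long the central directory is, and where exactly a file's block starts and ends) can both be answered from the archive itself instead of hard-coded sizes. The Python sketch below covers the second one; `member_ranges` is a name made up for this example, ZIP64 is not handled, and it assumes the members are stored back to back with the central directory directly after the last one. The inputs `cd_bytes` and `cd_offset` come from the end-of-central-directory record: in an archive without a trailing comment, the directory size and offset are the two little-endian 32-bit values in the 8 bytes that start 10 bytes before the end of the file.

```python
import struct

CD_SIG = b"PK\x01\x02"
# Fixed 46-byte part of a central directory header: signature, six 16-bit
# fields, CRC and both sizes, five 16-bit fields, external attributes and
# finally the offset of the member's local header.
CD_FMT = "<4s6H3I5H2I"
CD_SIZE = struct.calcsize(CD_FMT)


def member_ranges(cd_bytes, cd_offset):
    """Map each member name to the inclusive byte range (start, end) that
    holds its local header and compressed data.

    `cd_bytes` is the raw central directory and `cd_offset` its position
    in the archive.
    """
    entries = []
    pos = 0
    while pos + CD_SIZE <= len(cd_bytes) and cd_bytes[pos:pos + 4] == CD_SIG:
        fields = struct.unpack_from(CD_FMT, cd_bytes, pos)
        flags = fields[3]
        name_len, extra_len, comment_len = fields[10:13]
        header_offset = fields[16]
        raw_name = cd_bytes[pos + CD_SIZE:pos + CD_SIZE + name_len]
        # Bit 11 of the flags marks UTF-8 names; the legacy default is CP437.
        name = raw_name.decode("utf-8" if flags & 0x800 else "cp437")
        entries.append((header_offset, name))
        pos += CD_SIZE + name_len + extra_len + comment_len
    entries.sort()
    ranges = {}
    for i, (start, name) in enumerate(entries):
        # A member ends where the next one starts, or at the directory.
        next_start = entries[i + 1][0] if i + 1 < len(entries) else cd_offset
        ranges[name] = (start, next_start - 1)
    return ranges
```

The pair returned for `'net40/Renci.SshNet.dll'` could then replace the hard-coded `$start` and `$end` of the second request, sent as `bytes=start-end`.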