Ruby httpclient gem is very slow when downloading files

Date: 2011-10-27 00:46:25

Tags: ruby-on-rails ruby jruby httpclient

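I am using the Ruby httpclient gem to download large files. Ideally, I would like to be able to download 2 GB files. For obvious reasons I don't want to load the file contents into memory while downloading, so I use HTTPClient's get_content API as shown below and write each chunk passed to the block to a file: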

HTTPClient.new.get_content(url) do |chunk|
  puts "Downloaded chunk of size #{chunk.size}"
  file.write(chunk)
end

This is very slow. A 10 MB file can take 30 seconds.
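
To put numbers on "very slow", the loop can be wrapped in a small timing helper. The following is only a sketch: `timed_download` is a made-up name, and `client` is assumed to be any object whose `get_content(url)` yields body chunks to a block, the way `HTTPClient#get_content` is used in this question. It opens the target in binary mode and omits the per-chunk `puts`, so console output does not skew the timing.

```ruby
# Hypothetical helper (not part of httpclient). `client` only needs a
# get_content(url) method that yields each body chunk to the given block.
def timed_download(client, url, path)
  bytes = 0
  chunks = 0
  started = Time.now
  File.open(path, 'wb') do |file|      # binary mode: no newline translation
    client.get_content(url) do |chunk|
      file.write(chunk)
      bytes += chunk.bytesize
      chunks += 1
    end
  end
  { bytes: bytes, chunks: chunks, seconds: Time.now - started }
end

# Possible usage, assuming the httpclient gem is installed:
#   stats = timed_download(HTTPClient.new, url, 'download.bin')
#   puts stats.inspect
```

Comparing `bytes / seconds` between runs gives a firmer basis than eyeballing the console output.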

The puts in my get_content block keeps printing chunk sizes like these:

Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276
Downloaded chunk of size 4108
Downloaded chunk of size 12276

Looking at the chunk sizes of 4108 and 12276, I suspect they are the problem. The chunks are very small, and I can't figure out how to make them larger.
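
One way to test the small-chunk hypothesis without touching httpclient itself is to coalesce chunks before writing. Note that 12276 + 4108 = 16384, so each pair of chunks adds up to exactly 16 KiB. The sketch below is an assumption-laden illustration, not a known fix: `CoalescingWriter` is a hypothetical class, and since Ruby's `File` normally buffers writes already, it may change little if the time is actually spent on the network side.

```ruby
# Hypothetical helper, not part of httpclient: collects small chunks and
# writes them to `io` only once at least `threshold` bytes are pending.
class CoalescingWriter
  attr_reader :writes               # number of write calls issued to `io`

  def initialize(io, threshold = 1024 * 1024)
    @io = io
    @threshold = threshold
    @pending = String.new           # empty binary buffer
    @writes = 0
  end

  # Append a chunk; flush once the pending data reaches the threshold.
  def <<(chunk)
    @pending << chunk.b             # .b gives a binary copy of the chunk
    flush if @pending.bytesize >= @threshold
    self
  end

  # Write any pending bytes. Call once more after the download finishes.
  def flush
    return if @pending.empty?
    @io.write(@pending)
    @writes += 1
    @pending = String.new
  end
end

# Possible usage inside the question's loop (assumes `file` is open for writing):
#   writer = CoalescingWriter.new(file)
#   HTTPClient.new.get_content(url) { |chunk| writer << chunk }
#   writer.flush
```

If throughput stays the same with far fewer writes, the chunk size seen by the block is probably not the bottleneck.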

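I have used Patron, which is based on libcurl, and downloads with it are very fast, but I'm not keen on introducing a dependency on libcurl right now.

How can I make HTTPClient download faster?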

Update 10/27/2011

I tried both of @NaHi's suggestions, and this is what I found.

When I set the transparent_gzip_decompression option to true, I get the following exception:

  Zlib::StreamError: stream error: invalid window bits
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient/session.rb:652:in `get_body'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:1062:in `do_get_block'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:866:in `do_request'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:953:in `protect_keep_alive_disconnected'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:865:in `do_request'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:938:in `follow_redirect'
    from /Users/<user>/.rvm/gems/jruby-1.6.2@share/gems/httpclient-2.2.0.2/lib/httpclient.rb:577:in `get_content'
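
The message suggests that a zlib stream was created with a window-bits value the underlying zlib implementation did not accept. For comparison, here is a minimal sketch of chunk-by-chunk gzip decompression using only Ruby's standard zlib library. It is not taken from httpclient's source; it uses `Zlib::MAX_WBITS + 16` to select gzip framing, which works on current MRI, and whether the zlib bundled with JRuby 1.6.2 accepts that value is not established here. The helper names `gzip_bytes` and `inflate_in_chunks` are made up for illustration.

```ruby
require 'zlib'
require 'stringio'

# Build a gzip-compressed payload in memory (stands in for a response body).
def gzip_bytes(data)
  io = StringIO.new(String.new)     # binary backing string
  gz = Zlib::GzipWriter.new(io)
  gz.write(data)
  gz.close                          # writes the gzip trailer
  io.string
end

# Feed the compressed bytes to a streaming inflater a few bytes at a time,
# the way a download loop would hand over network chunks.
def inflate_in_chunks(compressed, chunk_size)
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16)   # +16: expect gzip framing
  output = String.new
  offset = 0
  while offset < compressed.bytesize
    output << inflater.inflate(compressed.byteslice(offset, chunk_size))
    offset += chunk_size
  end
  output << inflater.finish
  inflater.close
  output
end
```

Running this snippet on the same JRuby version would show whether that runtime's zlib rejects the window-bits value on its own, independently of httpclient.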

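When I set the header 'Accept-Encoding' => 'gzip,deflate', I do see a performance improvement. A 7296502-byte file that used to take 24 seconds now takes 16 seconds, which helps. In comparison, patron downloads the same file in 1.5 seconds. So I am still far from getting patron's performance out of httpclient.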
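
Expressed as throughput, those three timings come to roughly 0.30, 0.46 and 4.86 MB/s, so patron is still more than ten times faster than the improved httpclient run. The arithmetic, using only the figures above (the method and constant names are arbitrary):

```ruby
# Figures reported above: one 7,296,502-byte file, three measured durations.
FILE_BYTES = 7_296_502

# Throughput in megabytes (10^6 bytes) per second.
def megabytes_per_second(bytes, seconds)
  bytes / seconds.to_f / 1_000_000
end

TIMINGS = {
  'httpclient, default'         => 24.0,
  'httpclient, Accept-Encoding' => 16.0,
  'patron'                      => 1.5
}

TIMINGS.each do |label, seconds|
  printf("%-30s %6.2f MB/s\n", label, megabytes_per_second(FILE_BYTES, seconds))
end
```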

0 answers