Mechanize and Ruby multipart/form-data - Content-Transfer-Encoding

Date: 2014-05-13 10:58:55

Tags: ruby mechanize mechanize-ruby

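I'm trying to use mechanize 2.7.3 to send a multipart/form-data POST request to a remote server, in order to automate some interactions with it. Unfortunately, there is no <form> available, so I have to issue the POST directly. Fortunately, Mechanize recognizes what I'm aiming for anyway, but the server does not accept the request: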

c:/RailsInstaller/Ruby1.9.3/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize/http/agent.rb:308: in `fetch': 400 => Net::HTTPBadRequest for http://REDACTED/api/resources -- unhandled response (Mechanize::ResponseCodeError)
    from c:/RailsInstaller/Ruby1.9.3/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize.rb:1281:in `post_form'
    from c:/RailsInstaller/Ruby1.9.3/lib/ruby/gems/1.9.1/gems/mechanize-2.7.3/lib/mechanize.rb:502:in `post'
    from thing.rb:38:in `upload'

Here is what Mechanize logs during the request:

D, [2014-05-13T12:40:34.580906 #3456] DEBUG -- : query: "--cNPsKCeBSrPGUwxyjMze\r\nContent-Disposition: form-data; name=\"force_create\"\r\n\r\ntrue\r\n--cNPsKCeBSrPGUwxyjMze\r\nContent-Disposition: form-data; name=\"file\"; filename=\"test.wgt\"\r\nContent-Transfer-Encoding: binary\r\n\r\nREDACTED\r\n--cNPsKCeBSrPGUwxyjMze--\r\n"
I, [2014-05-13T12:40:34.581906 #3456]  INFO -- : Net::HTTP::Post: /api/resources
D, [2014-05-13T12:40:34.581906 #3456] DEBUG -- : request-header: accept => */*
D, [2014-05-13T12:40:34.581906 #3456] DEBUG -- : request-header: user-agent => Mechanize/2.7.3 Ruby/1.9.3p392 (http://github.com/sparklemotion/mechanize/)
D, [2014-05-13T12:40:34.581906 #3456] DEBUG -- : request-header: accept-encoding => gzip,deflate,identity
D, [2014-05-13T12:40:34.581906 #3456] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: accept-language => en-us,en;q=0.5
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: cookie => csrftoken=G4dZv6fxRMq0Y2h5yntDsJTFBeEKVFaU; sessionid=9oqufupdt6r620eyep4l4jcl6i2cxda5
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: host => REDACTED
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: referer => REDACTED/login
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: content-type => multipart/form-data; boundary=cNPsKCeBSrPGUwxyjMze
D, [2014-05-13T12:40:34.582906 #3456] DEBUG -- : request-header: content-length => 409

I, [2014-05-13T12:40:34.613906 #3456]  INFO -- : status: Net::HTTPBadRequest 1.1 400 BAD REQUEST
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: date => Tue, 13 May 2014 10:33:44 GMT
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: server => Apache/2.2.15 (CentOS)
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: vary => Accept-Language,Cookie,Accept-Encoding
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: content-language => en
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: content-type => text/plain; charset=utf-8
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: content-encoding => gzip
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: content-length => 37
D, [2014-05-13T12:40:34.613906 #3456] DEBUG -- : response-header: connection => close
D, [2014-05-13T12:40:34.614906 #3456] DEBUG -- : Read 37 bytes (37 total)
D, [2014-05-13T12:40:34.614906 #3456] DEBUG -- : gzip response

This is how I'm doing it in Ruby:

def upload(file)
    agent.post(base_uri + RESOURCES_PATH, {
        :force_create => true,
        :file => File.new(file)
    }) do … end
end

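Every other interaction with the site works fine (Mechanize actually logs in first, hence the cookies). The upload also works fine in a real browser; the only discernible difference is the content transfer encoding (octet-stream instead of binary). Am I perhaps missing something here? Something crucial?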

Also, I can't say much about the remote server in question. I only know that the site runs some Python-based framework (Django, iirc).

Thanks in advance,
Manuel

1 Answer:

Answer 0: (score: 0)

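I've managed to track down the culprit. Since my development machine is Windows-based, this appears to be an issue with Mechanize (or one of its dependencies) on Windows. Specifying the b (binary) flag in the second argument of File.new makes the problem go away. tl;dr: here is what the working snippet looks like now: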

agent.post(base_uri + RESOURCES_PATH, {
    :force_create => true,
    :file => File.new(file, 'rb')  # <-- changed
})
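
To illustrate why the mode flag can matter, here is a minimal sketch using only the standard library. It assumes the likely cause is Windows text-mode reading, where line endings may be translated and a 0x1A byte may be treated as end-of-file, so the bytes handed to Mechanize could differ from what is on disk. The payload and the sample.wgt file name are hypothetical, not taken from the original upload.

```ruby
require 'tempfile'

# Hypothetical payload containing CRLF and 0x1A, bytes that
# Windows text mode is known to handle specially.
payload = "PK\x03\x04\r\n\x1A\x00data".b

# Write the bytes verbatim to a temporary file.
tmp = Tempfile.new(['sample', '.wgt'])
tmp.binmode
tmp.write(payload)
tmp.close

# Read the file back once in binary mode and once in text mode.
binary_bytes = File.open(tmp.path, 'rb') { |f| f.read }
text_bytes   = File.open(tmp.path, 'r')  { |f| f.read }
on_disk_size = File.size(tmp.path)
tmp.unlink

puts "on disk:     #{on_disk_size} bytes"
puts "binary mode: #{binary_bytes.bytesize} bytes (#{binary_bytes.encoding})"
puts "text mode:   #{text_bytes.bytesize} bytes (#{text_bytes.encoding})"
```

On Linux or macOS both reads should report the same byte count. On Windows the text-mode count may come out smaller, which would be consistent with the server rejecting a truncated or altered upload. Binary mode returns the on-disk bytes unchanged on every platform.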