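I'm trying to communicate with a poorly designed web server, and I have to deal with it regardless. The problem is that when I submit the login form, the server embeds a message in the URI, which makes the URI library choke.
The server redirects me to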
/path/ConvolutedNameForMenuPage.menu?name=bmenu.P_MainMnu&msg=WELCOME+<b>Welcome,+Jonathan+Allard,+to+our+poorly+designed+Administrative+Systems!<%2Fb>Dec+07,+201102%3A27+PM
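That's right: it tries to pass unescaped HTML in the redirect URI, which I am then supposed to request so the message can be served back to me. Sheesh, standards!
Now the URI library, evidently upset by this bad practice, exclaims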
URI::InvalidURIError: bad URI(is not URI?): /path/ConvolutedNameForMenuPage.menu?name=bmenu.P_MainMnu&msg=WELCOME+<b>Welcome,+Jonathan+Allard,+to+our+poorly+designed+Administrative+Systems!<%2Fb>Dec+07,+201102%3A27+PM
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/uri/generic.rb:1202:in `rescue in merge'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/1.9.1/uri/generic.rb:1199:in `merge'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/mechanize-2.0.1/lib/mechanize/page/meta_refresh.rb:32:in `parse'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/mechanize-2.0.1/lib/mechanize/page/meta_refresh.rb:41:in `from_node'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/mechanize-2.0.1/lib/mechanize/page.rb:282:in `block in meta_refresh'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0/lib/nokogiri/xml/node_set.rb:239:in `block in each'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0/lib/nokogiri/xml/node_set.rb:238:in `upto'
from /home/jon/.rbenv/versions/1.9.3-p0/lib/ruby/gems/1.9.1/gems/nokogiri-1.5.0/lib/nokogiri/xml/node_set.rb:238:in `each'
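I feel your pain, URI lib.
Now, how can I catch this, parse the URI properly (or just drop the message entirely), and submit it back as if nothing had happened? Or is this a bug between URI and Mechanize?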
Answer 0 (score: 0)
After some digging through the code, I found the source of the problem.
As I explained in #177:
In /lib/mechanize/page/meta_refresh.rb:40
    class Mechanize::Page::MetaRefresh
      def self.parse content, base_uri
        return unless content =~ CONTENT_REGEXP

        delay, refresh_uri = $1, $3

        dest = base_uri
        dest += refresh_uri if refresh_uri # Oops!

        return delay, dest
      end
    end
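The marked line raises URI::InvalidURIError if refresh_uri contains illegal characters (such as <). I'm not sure exactly where, but the value should be sanitized before it is merged.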
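One possible shape for such sanitization, as a rough sketch only and not the fix Mechanize itself adopted: percent-encode every character that may not appear literally in a URI before merging it into the base. The helper name `sanitize_refresh_uri`, the host `example.com`, and the shortened path are all made up for illustration. Existing `%XX` escapes are left alone, and square brackets are encoded too, which is fine for a relative redirect but would not suit an IPv6 host.

```ruby
require 'uri'

# Characters NOT in this set get percent-encoded. The set is the RFC 3986
# unreserved and reserved characters (minus square brackets) plus "%",
# so escapes already present in the input are preserved.
UNSAFE_URI_CHARS = %r{[^A-Za-z0-9\-._~:/?@!$&'()*+,;=%\#]}

# Hypothetical helper: escape illegal characters byte by byte.
def sanitize_refresh_uri(raw)
  raw.gsub(UNSAFE_URI_CHARS) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

# Placeholder base and a shortened version of the offending redirect target.
base = URI.parse('http://example.com/path/login')
raw  = '/path/Menu.menu?name=bmenu.P_MainMnu&msg=WELCOME+<b>Welcome!<%2Fb>'

dest = base + sanitize_refresh_uri(raw)
puts dest
```

With the raw `<` and `>` escaped as `%3C` and `%3E`, the merge goes through and the server still receives the message it expects. A stray `%` that is not followed by two hex digits is not handled here.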