Parsing a complex URL in Ruby

Time: 2010-10-13 19:56:14

Tags: ruby-on-rails ruby ruby-on-rails-3 uri

I want to retrieve the value of "q" from this URL:

http://www.google.com/url?sa=X&q=http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013&ct=ga&cad=:s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:&cd=yQoOdKUFTLo&usg=AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA

If I use this:

uri = URI.parse("http://www.google.com/url?sa=X&q=http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013&ct=ga&cad=:s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:&cd=yQoOdKUFTLo&usg=AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA")

uri_params = CGI.parse(uri.query)

uri_params['q']

I get this error:

URI::InvalidURIError: bad URI(is not URI?)

Thanks!

1 answer:

Answer 0 (score: 3)

It seems to work for me on ruby-1.8.7-p249:

ruby-1.8.7-p249 > require 'uri'
 => true 
ruby-1.8.7-p249 > require 'cgi'
 => true 
ruby-1.8.7-p249 > uri = URI.parse("http://www.google.com/url?sa=X&q=http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013&ct=ga&cad=:s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:&cd=yQoOdKUFTLo&usg=AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA")
 => #<URI::HTTP:0x10127b288 URL:http://www.google.com/url?sa=X&q=http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013&ct=ga&cad=:s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:&cd=yQoOdKUFTLo&usg=AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA> 
ruby-1.8.7-p249 > uri_params = CGI.parse(uri.query)
 => {"cd"=>["yQoOdKUFTLo"], "sa"=>["X"], "cad"=>[":s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:"], "ct"=>["ga"], "q"=>["http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013"], "usg"=>["AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA"]} 
ruby-1.8.7-p249 > uri_params['q']
 => ["http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013"]
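Since the URL parses cleanly in the session above, the error may come from something in the actual input that is not visible in the question, such as stray whitespace or an unescaped character. That is only a guess. The sketch below shows one defensive way to pull out "q" under that assumption. The helper name query_param is hypothetical, and it uses URI.decode_www_form (a standard-library alternative to the CGI.parse call above) to split the query string.

```ruby
require 'uri'

# Hypothetical helper (not from the original post): returns the first value of
# the query parameter `name`, or nil when the parameter is absent.
def query_param(url, name)
  # Assumption: surrounding whitespace/newlines are noise, so drop them.
  cleaned = url.strip

  query = begin
    URI.parse(cleaned).query
  rescue URI::InvalidURIError
    # Fallback for strings URI.parse rejects: take the text between the
    # first "?" and any "#" fragment.
    cleaned[/\?([^#]*)/, 1]
  end
  return nil if query.nil?

  # decode_www_form returns [[key, value], ...]; assoc finds the first pair
  # whose key matches.
  pair = URI.decode_www_form(query).assoc(name)
  pair && pair.last
end

url = "http://www.google.com/url?sa=X&q=http://nashville.broadwayworld.com/article/Just_in_time_for_Halloween_Circle_Players_does_JEKYLL_HYDE_20101013&ct=ga&cad=:s7:f1:v1:d2:i0:lt:e0:p0:t1286988171:&cd=yQoOdKUFTLo&usg=AFQjCNEg2inHF8hXGEvG-TxMQyMx7YGHkA"
puts query_param(url, "q")
```

Unlike uri_params['q'] above, which returns an array, this helper returns a single string. The fallback branch is deliberately crude: it does not validate the URL, it only keeps a malformed one from raising.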