Why does Nokogiri's to_html escape URLs differently from URI.escape?

Time: 2014-07-22 04:08:26

Tags: ruby uri nokogiri

Example:

URI.escape 'http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883' 
=> "http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883" 

Nokogiri::HTML.fragment('<a href="http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883">test</a>').to_html
=> "<a href=\"http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_%245.3bn_deal_585883\">test</a>" 

As you can see, Nokogiri encodes "$" as "%24", whereas URI.escape does not.

1 answer:

Answer 0: (score: 0)

Since to_xhtml does what you need, why not use it?

require 'nokogiri'

Nokogiri::HTML.fragment('<a href="http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883">test</a>').to_html
# => "<a href=\"http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_%245.3bn_deal_585883\">test</a>"

Nokogiri::HTML.fragment('<a href="http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883">test</a>').to_xhtml
# => "<a href=\"http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883\">test</a>"
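
As for why the two outputs differ in the first place, the likely cause is that each escaper keeps a different set of "safe" characters. The sketch below is only an illustration of that idea and does not call Nokogiri or URI.escape: `percent_encode`, `URI_ESCAPE_SAFE`, `LIBXML2_SAFE`, and `EXAMPLE_URL` are hypothetical names, and the two character sets are assumptions (the RFC 2396 reserved characters for URI.escape, and the list I believe libxml2's HTML serializer passes when escaping URI attributes such as href). Check them against the versions you actually use.

```ruby
# Characters both escapers leave alone: alphanumerics plus the RFC 2396 "mark" characters.
UNRESERVED = [*'a'..'z', *'A'..'Z', *'0'..'9'].join + "-_.!~*'()"

# Assumption: reserved characters that URI.escape treated as safe (note the "$").
URI_ESCAPE_SAFE = ';/?:@&=+$,[]'

# Assumption: extra characters libxml2's HTML serializer keeps in URI attributes (no "$").
LIBXML2_SAFE = '@/:=?;#%&,+'

EXAMPLE_URL = 'http://www.pmlive.com/pharma_news/mylan_buys_abbotts_non-us_generics_in_$5.3bn_deal_585883'

# Hypothetical helper: percent-encode every character that is neither
# unreserved nor listed in extra_safe.
def percent_encode(str, extra_safe)
  safe = UNRESERVED + extra_safe
  str.each_char.map do |ch|
    if safe.include?(ch)
      ch
    else
      # Encode each byte of the character as %XX (uppercase hex).
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
  end.join
end

puts percent_encode(EXAMPLE_URL, URI_ESCAPE_SAFE) # "$" is in the safe set, so it is kept
puts percent_encode(EXAMPLE_URL, LIBXML2_SAFE)    # "$" is not in the safe set, so it becomes %24
```

If these assumed sets are right, that would explain why "$" survives URI.escape but not to_html, and why to_xhtml, which goes through a different serializer, leaves the attribute untouched.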