How to search within the page body and force an encoding conversion

Time: 2017-04-11 10:24:31

Tags: ruby encoding utf-8 mechanize

My code is very simple:

require 'rubygems'
require 'mechanize'

URL = 'http://yandex.ru'

agent = Mechanize.new
page = agent.get(URL)

# page.encoding => UTF-8
# page.body.encoding => ASCII-8BIT

page.body.include?("Карты")

On the last line of this code, Ruby raises an error:

in `include?': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)

The solution from "How to get Mechanize to auto-convert body to UTF8?" did not help. What can I do to fix this?

1 answer:

Answer 0 (score: 1)

You can use the force_encoding method, which relabels the string's bytes as UTF-8 without changing them:

agent.page.body.force_encoding('utf-8')
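
To see why this works without making a network request, here is a minimal sketch that simulates the situation from the question. It assumes the response bytes really are UTF-8 and only carry the wrong encoding label; raw_body stands in for page.body, and as_utf8 is a hypothetical helper, not part of Mechanize.

```ruby
# Simulated response body: the bytes are UTF-8, but the string is tagged
# ASCII-8BIT (binary), like page.body in the question.
raw_body = "<a href=\"/maps\">Карты</a>".b

# Hypothetical helper (not a Mechanize API): returns a UTF-8 tagged copy
# and leaves the original string untouched.
def as_utf8(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  # force_encoding only relabels the bytes; scrub replaces invalid
  # sequences in case the data was not actually UTF-8.
  text.valid_encoding? ? text : text.scrub
end

# Searching the binary string for a UTF-8 literal reproduces the error.
begin
  raw_body.include?("Карты")
rescue Encoding::CompatibilityError => e
  puts "before: #{e.class}"
end

# After relabeling, both strings are UTF-8 and the search succeeds.
body = as_utf8(raw_body)
puts "after: #{body.encoding}, found: #{body.include?("Карты")}"
```

Note that force_encoding on its own modifies the receiver in place, which is why the one-liner in the answer is enough for a later page.body.include? call, provided page.body returns the same string object each time. The helper works on a copy instead so the raw bytes stay available if the page turns out to use a different encoding.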