I am trying to download a page with httpclient and parse it with oga (https://github.com/YorickPeterse/oga).
My program is as follows:
require 'httpclient'
require 'oga'
url = 'http://stackoverflow.com/questions/1496096/is-there-a-limit-to-the-length-of-html-attributes'
c = HTTPClient.new
content = c.get_content(url)
document = Oga.parse_html(content)
I get this error:
LL::ParserError: Unexpected end of input, expected element closing tag instead on line 431
parser_error at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:255
each_token at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:231
parse at org/libll/Driver.java:303
parse at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/xml/parser.rb:262
parse_html at /home/binaryplease/.rvm/gems/jruby-1.7.19/gems/oga-0.3.1-java/lib/oga/oga.rb:25
(root) at test.rb:12
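I have confirmed that httpclient downloads the page correctly and that the file is not truncated. I have also tried other links; some work, but most give me this error. In general, smaller pages seem to work fine.
Is there a problem with the library, or am I doing something wrong?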
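One way to check whether the failures really track page size is to run the parser over several already-downloaded pages and record the byte size and outcome of each. The following is only a rough sketch: `parse_report` is a hypothetical helper (it is not part of oga or httpclient), it takes the parser as a callable so it can be tried without either gem, and it assumes the parser's error derives from StandardError.

```ruby
# Hypothetical helper (not part of oga or httpclient): run a parser over
# several already-downloaded pages and record the size and outcome of each.
#
# pages  - Hash mapping a label (e.g. the URL) to the downloaded HTML string
# parser - any object responding to #call that raises on a parse failure
def parse_report(pages, parser)
  pages.each_with_object({}) do |(label, html), report|
    begin
      parser.call(html)
      report[label] = { bytes: html.bytesize, error: nil }
    rescue StandardError => e
      # Assumption: the parser's error class derives from StandardError.
      report[label] = { bytes: html.bytesize, error: "#{e.class}: #{e.message}" }
    end
  end
end
```

With the program above, the pages could be built from `c.get_content(url)` for each link tried, and the parser would be `->(html) { Oga.parse_html(html) }`. Sorting the resulting report by `:bytes` should show whether only the larger pages fail.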