Question

I'm trying to parse some HTML content with Ruby. I'm using the following code:

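require 'open-uri'

# Fetch the page and save the raw response body to a file.
# On Ruby 3.0+ use URI.open, since Kernel#open no longer accepts URLs.
url = 'http://www.fooducate.com/appo#!page=browse&nav=0'
html = URI.open(url)
IO.copy_stream(html, 'test.html')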

But all I get back is the content div with nothing inside it:

<div id="page-content" class="content group">
</div>

Is this a bug in the parser? How can I fix it?

Answer 1

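If you look at the comment above that div, you'll see that the rest of the content is loaded via JavaScript. To retrieve it, you need to run the page's scripts the way a browser does, or otherwise simulate the second fetch yourself.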

<!-- hook for any page content - JS Navigation object expects that -->
<div id="page-content" class="content group">
</div>

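You can see this behavior when you load the page in a browser. Notice that the navigation and layout load first, but you see a "Loading" message for a few seconds before the content is filled in.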
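
To make this concrete, here is a minimal Ruby sketch of why the saved file is empty. It assumes nothing about the site beyond the URL and the div shown in the question. The helper names `split_hashbang` and `placeholder_empty?` are made up for illustration. The part of the URL after `#` is a fragment, which is never sent to the server, so open-uri only ever requests the bare page; the `page` and `nav` values are read later by the page's JavaScript.

```ruby
require 'uri'

# Split a hash-bang URL into the part the server actually receives and the
# client-side navigation parameters that only the page's JavaScript reads.
# (Hypothetical helper, not part of any library.)
def split_hashbang(url)
  uri = URI.parse(url)
  fragment = uri.fragment.to_s.delete_prefix('!')
  params = URI.decode_www_form(fragment).to_h
  uri.fragment = nil
  [uri.to_s, params]
end

# True when the static HTML still contains the empty placeholder div,
# i.e. the content has not been filled in by JavaScript.
# (Hypothetical helper; the regex only targets the div from the question.)
def placeholder_empty?(html)
  html.match?(%r{<div id="page-content"[^>]*>\s*</div>})
end

server_url, nav_params = split_hashbang('http://www.fooducate.com/appo#!page=browse&nav=0')
puts server_url  # the URL that open-uri really requests
p nav_params     # values the page's JavaScript would use for the second fetch
```

To get the real content you would either drive a real or headless browser so the scripts run, or watch the network tab in your browser's developer tools to find the request the page makes after loading and repeat that request from Ruby. The exact endpoint is not shown in the page source above, so it has to be discovered that way rather than guessed.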