Parsing with Nokogiri

Time: 2013-07-13 20:51:04

Tags: ruby nokogiri

I am parsing HTML with Nokogiri and need to get elements of this kind:

<li data-item='{"title":"where is title","slug":"about some",
    "has_many_images":false,"show_image":"abbxb","created_at":1373737401,
    "show_attr":{"value":"150"},
    "location":"Alabama",
    "category":"Table",
    "is_business":false}'>

    <!-- many more elements here -->
</li>

Now I want to get the value of this data-item attribute. I am using:

 page.css("li[data-item]")[0]

What I get back is something like this:

#<Nokogiri::XML::Element:0x14fc250 name="li" attributes=[#<Nokogiri::XML::Attr:0x14fc178 name="class" value="item">, etc. ...

But I want this:

"{"title":"where is title","slug":"about some",
        "has_many_images":false,"show_image":"abbxb","created_at":1373737401,
        "show_attr":{"value":"150"},
        "location":"Alabama",
        "category":"Table",
        "is_business":false}"

Any suggestions?

1 answer:

Answer 0: (score: 2)

You can get the attribute's value with:

page.at_xpath("//li[1]/@data-item").content

Edit

A more complete demonstration, at @Priti's request:

require 'nokogiri'

body = %Q{
  <body>
    <li data-item='{"title":"where is title","slug":"about some",
      "has_many_images":false,"show_image":"abbxb","created_at":1373737401,
      "show_attr":{"value":"150"},
      "location":"Alabama",
      "category":"Table",
      "is_business":false}'>
    </li>
  </body>
}
page = Nokogiri::XML(body)
result = page.at_xpath("//li[1]/@data-item").content
# "{\"title\":\"where is title\",\"slug\":\"about some\",         \"has_many_images\":false,\"show_image\":\"abbxb\",\"created_at\":1373737401,         \"show_attr\":{\"value\":\"150\"},         \"location\":\"Alabama\",         \"category\":\"Table\",         \"is_business\":false}"
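Since the attribute value is a JSON document, you will probably want to turn the string into a Hash next. The sketch below uses Ruby's standard json library; the string literal is a shortened, hypothetical stand-in for the value returned by the lookup above, not the exact text.

```ruby
require 'json'

# Shortened stand-in for the data-item string (an assumption for illustration).
result = '{"title":"where is title","created_at":1373737401,' \
         '"show_attr":{"value":"150"},"location":"Alabama","is_business":false}'

# JSON.parse returns a Hash with String keys by default.
item = JSON.parse(result)

puts item['location']
puts item['show_attr']['value']
puts item['is_business']
```

JSON.parse raises JSON::ParserError on malformed input, so wrap the call in a rescue if the attribute might not always hold valid JSON.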