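I'm writing an internal application to monitor the products we list on Amazon, and I'm sticking with the Amazon Product Advertising API (amazon-ecs) Ruby gem. I want to display the browse nodes on screen like this:
"Root category" -> all subcategories -> final category (the actual category the item is in)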
<BrowseNode>
<BrowseNodeId>770071031</BrowseNodeId>
<Name>Robotic Vacuums</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>125698031</BrowseNodeId>
<Name>Vacuums</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3147711</BrowseNodeId>
<Name>Vacuums &amp; Floor Care</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3576359031</BrowseNodeId>
<Name>Vacuuming, Cleaning &amp; Ironing</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>391784011</BrowseNodeId>
<Name>Kitchen &amp; Home Appliances</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3147411</BrowseNodeId>
<Name>Categories</Name>
<IsCategoryRoot>1</IsCategoryRoot>
<Ancestors>
<BrowseNode>
<BrowseNodeId>11052681</BrowseNodeId>
<Name>Kitchen &amp; Home</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>11052591</BrowseNodeId>
<Name>Home &amp; Garden</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3146281</BrowseNodeId>
<Name>Home &amp; Garden</Name>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
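So the above should be displayed as:
Home & Garden -> Kitchen & Home -> Kitchen & Home Appliances -> Vacuuming, Cleaning & Ironing -> Vacuums & Floor Care -> Vacuums
I tried get_array and get_hash, but both return the values as one long string.
Is there an easy way to do what I want with the amazon-ecs gem, or should I treat the string as XML and loop through it accordingly?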
Answer 0 (score: 1)
Here's a simple way to do it. Since you give no criteria for deciding which <Name> nodes are acceptable, this returns all of them:
require 'nokogiri'
xml = <<EOT
<BrowseNode>
<BrowseNodeId>770071031</BrowseNodeId>
<Name>Robotic Vacuums</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>125698031</BrowseNodeId>
<Name>Vacuums</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3147711</BrowseNodeId>
<Name>Vacuums &amp; Floor Care</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3576359031</BrowseNodeId>
<Name>Vacuuming, Cleaning &amp; Ironing</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>391784011</BrowseNodeId>
<Name>Kitchen &amp; Home Appliances</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3147411</BrowseNodeId>
<Name>Categories</Name>
<IsCategoryRoot>1</IsCategoryRoot>
<Ancestors>
<BrowseNode>
<BrowseNodeId>11052681</BrowseNodeId>
<Name>Kitchen &amp; Home</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>11052591</BrowseNodeId>
<Name>Home &amp; Garden</Name>
<Ancestors>
<BrowseNode>
<BrowseNodeId>3146281</BrowseNodeId>
<Name>Home &amp; Garden</Name>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
</Ancestors>
</BrowseNode>
EOT
doc = Nokogiri::XML(xml)
Here's the code that finds the nodes:
doc.search('Name').map(&:text).reverse.uniq.join(' -> ')
# => "Home & Garden -> Kitchen & Home -> Categories -> Kitchen & Home Appliances -> Vacuuming, Cleaning & Ironing -> Vacuums & Floor Care -> Vacuums -> Robotic Vacuums"
Some of the entries are duplicated, so uniq cleans them up.
See also "How to avoid joining all text from Nodes when scraping".