Question

I wrote a SAX parser in Ruby using Nokogiri, and I am parsing a very large XML file. Strangely, using

data.each do |node|
    if node.name == "product" && node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
        #puts p.inspect
        p = Nokogiri::XML.parse(node.outer_xml)

        #puts p.xpath("//xmlns:step-quantity").text

        images = []
        sizes = []
        colors = []

        # If product is a master
        if p.xpath('//xmlns:image-group[@view-type="re_detail"]/xmlns:image/@path').count != 0
            # Product ID
            product_id = p.xpath('//@product-id').text # <--- THIS

returns

hbeu5010140440462983591054046298359112404629835912940462983591364046298359143404629835915040462983591674046298359174

for the XML

  <product product-id="hbeu50101404">
    <ean/>
    <upc/>
    <unit/>
    <min-order-quantity>1</min-order-quantity>
    <step-quantity>1</step-quantity>

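I don't understand why. I could cut the string to a fixed length, but some IDs are longer, and I find that kind of dirty. Can anybody help me? :-)

Thanks in advance

Benjamin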

Answer 1

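Sorry, the answer turned out to be simple: other elements inside the product, for example <variation>, also carry a product-id attribute like product-id="hbeu50101404". The XPath //@product-id matches every one of them, and .text joins all the matches, which is why everything ends up collected in one string -.-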

Ruby Nokogiri SAX parsing: string too long
