Nokogiri grabbing the wrong node data in XML?

Time: 2014-02-25 13:25:56

Tags: ruby nokogiri ruby-2.0

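I'm using Nokogiri to parse an XML file, but it isn't working correctly.

When I try to fetch from nodes three levels up, it pulls data from the first node of that type. I've debugged it, and the node it is on is the correct node for the data I need, but it still pulls the data from the first node of that type.

Items that are not in higher-level nodes are written to the file correctly, but as soon as I start moving up the tree, the wrong data is written to the file.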

require 'nokogiri'

f = File.new("grammystext.txt", "w+")
x = File.open("items.xml", "r")
doc = Nokogiri::XML(x)
x.close

doc.xpath('//CWItemExport//ItemExportData//CWItem//ProductID//ItemColor//ItemSize').each_with_index do |item, i|
  f << item.parent.parent.parent.at_xpath('//CWVendor//VendorCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//CWVendor//VendorName').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemStyle').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDescription').content + ", "
  f << item.parent.parent.parent.at_xpath('//TaxID//TaxIDCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDepartment//ItemDeptCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDepartment//ItemDeptName').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemClass//ItemClassCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemClass//ItemClassName').content + ", "

  f << item.attr("MainSize") + ", "
  f << item.at_xpath('Sku').content + ", "
  f << item.at_xpath('//ReplacementCost').content + ", "
  f << item.at_xpath('//CurrentRetail').content + "\n"

  puts item.parent.parent.parent if i == 6

  break if i == 7
end

f.close

XML:

<CWItem action="New">
  <CWVendor>
   <VendorCode>5TH</VendorCode>
   <VendorName>5TH SUN</VendorName>
    <VendorAddress />
    <VendorAddress2 />
    <VendorCity />
    <VendorZip />
    <VendorPhone />
  </CWVendor>
 <ItemStyle>AMM024-B105</ItemStyle>
 <ItemDescription>CALVERY</ItemDescription>
  <ItemBoolPLU>N</ItemBoolPLU>
  <TaxID>
    <TaxIDCode TaxStore="1" TaxIDType="Normal">0</TaxIDCode>
    <ComponentTax TxID="0" TxType="Normal" TxStartAmt="0.00" TxEndAmt="100000000.00" TxGlPayAcct=" ">0.000</ComponentTax>
  </TaxID>
  <ItemDepartment>
   <ItemDeptCode>APPAR</ItemDeptCode>
   <ItemDeptName>APPAR</ItemDeptName>
  </ItemDepartment>
  <ItemClass>
    <ItemClassCode>TEE</ItemClassCode>
   <ItemClassName>TEE-SHIRTS</ItemClassName>
  </ItemClass>
  <ItemSizeRun SizeRunCode="RUN" SizeRunName="">
    <SizeDef SizeLabel="">
      <Size SizeLabel="XS" Sequence="0">XS</Size>
      <Size SizeLabel="S" Sequence="1">S</Size>
      <Size SizeLabel="M" Sequence="2">M</Size>
      <Size SizeLabel="L" Sequence="3">L</Size>
      <Size SizeLabel="XL" Sequence="4">XL</Size>
      <Size SizeLabel="XXL" Sequence="5">XXL</Size>
    </SizeDef>
  </ItemSizeRun>
<ProductID PID="">
  <ItemColor ColorCode="N/A" ColorName="">
   <ItemSize MainSize="L">
      <Sku>400100018477</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="M">
      <Sku>400100018460</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="S">
      <Sku>400100018453</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XL">
      <Sku>400100018484</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XS">
      <Sku>400100031704</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>0.00</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XXL">
      <Sku>400100035801</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>0.00</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
  </ItemColor>
</ProductID>
</CWItem>
<CWItem action="New">
  <CWVendor>
    <VendorCode>5TH</VendorCode>
    <VendorName>5TH SUN</VendorName>
    <VendorAddress />
    <VendorAddress2 />
    <VendorCity />
    <VendorZip />
    <VendorPhone />
  </CWVendor>
  <ItemStyle>AMM025-B105</ItemStyle>
  <ItemDescription>WINGMAN</ItemDescription>
  <ItemBoolPLU>N</ItemBoolPLU>
  <TaxID>
    <TaxIDCode TaxStore="1" TaxIDType="Normal">0</TaxIDCode>
    <ComponentTax TxID="0" TxType="Normal" TxStartAmt="0.00" TxEndAmt="100000000.00" TxGlPayAcct=" ">0.000</ComponentTax>
  </TaxID>
  <ItemDepartment>
    <ItemDeptCode>APPAR</ItemDeptCode>
    <ItemDeptName>APPAR</ItemDeptName>
  </ItemDepartment>
  <ItemClass>
    <ItemClassCode>TEE</ItemClassCode>
    <ItemClassName>TEE-SHIRTS</ItemClassName>
  </ItemClass>
  <ItemSizeRun SizeRunCode="RUN" SizeRunName="">
    <SizeDef SizeLabel="">
      <Size SizeLabel="XS" Sequence="0">XS</Size>
      <Size SizeLabel="S" Sequence="1">S</Size>
      <Size SizeLabel="M" Sequence="2">M</Size>
      <Size SizeLabel="L" Sequence="3">L</Size>
      <Size SizeLabel="XL" Sequence="4">XL</Size>
      <Size SizeLabel="XXL" Sequence="5">XXL</Size>
    </SizeDef>
  </ItemSizeRun>
<ProductID PID="">
  <ItemColor ColorCode="N/A" ColorName="">
    <ItemSize MainSize="L">
      <Sku>400100018514</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="M">
      <Sku>400100018507</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="S">
      <Sku>400100018491</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XL">
      <Sku>400100018521</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>44.80</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XS">
      <Sku>400100031711</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>0.00</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
    <ItemSize MainSize="XXL">
      <Sku>400100035818</Sku>
      <Pricing Currency="USD">
        <ReplacementCost>44.80</ReplacementCost>
        <AverageCost>0.00</AverageCost>
        <LandedCost>0.00</LandedCost>
        <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
      </Pricing>
    </ItemSize>
  </ItemColor>
</ProductID>
</CWItem>

This is my first time using Nokogiri, so I may well be doing something wrong here.

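3 Answers:

Answer 0: (score: 1)

Problem

The problem is starting the XPath with //. An XPath that begins with // locates the node anywhere in the document, regardless of which node you call it on.

In the simplified example below, you can see that using // returns the same subitem every time (rather than the subitem of the item being iterated).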

require 'nokogiri'

xml = %Q{
<root>
  <item>
    <subitem>1</subitem>
  </item>
  <item>
    <subitem>2</subitem>
  </item>  
</root>
}

doc = Nokogiri::XML(xml)
doc.xpath('//root//item').each_with_index do |item, i|
    puts item.at_xpath('//subitem').content
end
#=> 1
#=> 1

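If you want to look anywhere within a specific node, you need to start the XPath with a period, i.e. .//. Applying this to the simplified example, you can see that we get the expected subitem results: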

doc = Nokogiri::XML(xml)
doc.xpath('//root//item').each_with_index do |item, i|
    puts item.at_xpath('.//subitem').content
end
#=> 1
#=> 2

Solution

For your specific problem, you should change the XPaths inside the item iteration to start with a period. For example, the line:

f << item.parent.parent.parent.at_xpath('//CWVendor//VendorCode').content + ", "

would change to:

    f << item.parent.parent.parent.at_xpath('.//CWVendor//VendorCode').content + ", "

Altogether, this gives you:

doc.xpath('.//CWItemExport//ItemExportData//CWItem//ProductID//ItemColor//ItemSize').each_with_index do |item, i|
  f << item.parent.parent.parent.at_xpath('.//CWVendor//VendorCode').content + ", "
  f << item.parent.parent.parent.at_xpath('.//CWVendor//VendorName').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemStyle').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemDescription').content + ", "
  f << item.parent.parent.parent.at_xpath('.//TaxID//TaxIDCode').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemDepartment//ItemDeptCode').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemDepartment//ItemDeptName').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemClass//ItemClassCode').content + ", "
  f << item.parent.parent.parent.at_xpath('.//ItemClass//ItemClassName').content + ", "

  f << item.attr("MainSize") + ", "
  f << item.at_xpath('Sku').content + ", "
  f << item.at_xpath('.//ReplacementCost').content + ", "
  f << item.at_xpath('.//CurrentRetail').content + "\n"

  puts item.parent.parent.parent if i == 6

  break if i == 7
end

Result:

5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, L, 400100018477, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, M, 400100018460, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, S, 400100018453, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XL, 400100018484, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XS, 400100031704, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XXL, 400100035801, 44.80, 199.00
5TH, 5TH SUN, AMM025-B105, WINGMAN, 0, APPAR, APPAR, TEE, TEE-SHIRTS, L, 400100018514, 44.80, 199.00

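Note: Unless the structure is unknown, I would also suggest using a single / rather than //. A single / checks only direct children, which makes debugging easier if you get unexpected results.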

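Answer 1: (score: 1)

This is more of a code review, but hopefully it will help you with future problems.

  1. If you are creating a CSV, I highly recommend using Ruby's CSV library. It guarantees that you won't create a broken CSV (which your code will do as soon as one of the fields contains a comma). It also lets you use more descriptive variable names.

  2. As others have mentioned, don't use // if you can use /. Besides not being what you actually want in some cases, it also carries a performance penalty.

  3. The proliferation of item.parent.parent.parent tells me you are going too far down the path. Take advantage of XPath predicates (inside []) to make sure you are at the right level.

  4. Also, as long as we are leveraging XPath, you don't need the index or the break if you tell XPath not to give you more than the first 7. I didn't implement this because, based on your data, I wasn't sure you needed it.

  5. Example: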

    require 'csv'

    CSV.open("grammystext.csv", "wb") do |csv|
      # The predicate keeps only CWItem elements that actually contain an ItemSize.
      items = doc.xpath('/CWItemExport/ItemExportData/CWItem[ProductID/ItemColor/ItemSize]')
      items.each do |item|
        # Paths are relative to the current CWItem, so no leading slash is needed.
        vendor_code = item.at_xpath('CWVendor/VendorCode').text
        vendor_name = item.at_xpath('CWVendor/VendorName').text
        item_style  = item.at_xpath('ItemStyle').text
        ...
        main_size = item.at_xpath('ProductID/ItemColor/ItemSize')['MainSize']
        sku       = item.at_xpath('ProductID/ItemColor/ItemSize/Sku').text

        csv << [vendor_code, vendor_name, item_style, ... , main_size, sku]
      end
    end
    

Answer 2: (score: 0)

Your XML is missing the root element CWItemExport and its child element ItemExportData. If you change the XML to:

<CWItemExport>
    <ItemExportData>


        <CWItem action="New">
            <CWVendor>
                <VendorCode>5TH</VendorCode>

                <VendorName>5TH SUN</VendorName>
                <VendorAddress />
                <VendorAddress2 />
                <VendorCity />
                <VendorZip />
                <VendorPhone />
            </CWVendor>
            <ItemStyle>AMM024-B105</ItemStyle>
            <ItemDescription>CALVERY</ItemDescription>
            <ItemBoolPLU>N</ItemBoolPLU>
            <TaxID>
                <TaxIDCode TaxStore="1" TaxIDType="Normal">0</TaxIDCode>
                <ComponentTax TxID="0" TxType="Normal" TxStartAmt="0.00" TxEndAmt="100000000.00" TxGlPayAcct="
                    ">0.000</ComponentTax>
            </TaxID>
            <ItemDepartment>
                <ItemDeptCode>APPAR</ItemDeptCode>
                <ItemDeptName>APPAR</ItemDeptName>
            </ItemDepartment>
            <ItemClass>
                <ItemClassCode>TEE</ItemClassCode>
                <ItemClassName>TEE-SHIRTS</ItemClassName>
            </ItemClass>
            <ItemSizeRun SizeRunCode="RUN" SizeRunName="">
                <SizeDef SizeLabel="">
                    <Size SizeLabel="XS" Sequence="0">XS</Size>
                    <Size SizeLabel="S" Sequence="1">S</Size>
                    <Size SizeLabel="M" Sequence="2">M</Size>
                    <Size SizeLabel="L" Sequence="3">L</Size>
                    <Size SizeLabel="XL" Sequence="4">XL</Size>
                    <Size SizeLabel="XXL" Sequence="5">XXL</Size>
                </SizeDef>
            </ItemSizeRun>
            <ProductID PID="">
                <ItemColor ColorCode="N/A" ColorName="">
                    <ItemSize MainSize="L">
                        <Sku>400100018477</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="M">
                        <Sku>400100018460</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="S">
                        <Sku>400100018453</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XL">
                        <Sku>400100018484</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XS">
                        <Sku>400100031704</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>0.00</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XXL">
                        <Sku>400100035801</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>0.00</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                </ItemColor>
            </ProductID>
        </CWItem>
        <CWItem action="New">
            <CWVendor>
                <VendorCode>5TH</VendorCode>
                <VendorName>5TH SUN</VendorName>
                <VendorAddress />
                <VendorAddress2 />
                <VendorCity />
                <VendorZip />
                <VendorPhone />
            </CWVendor>
            <ItemStyle>AMM025-B105</ItemStyle>
            <ItemDescription>WINGMAN</ItemDescription>
            <ItemBoolPLU>N</ItemBoolPLU>
            <TaxID>
                <TaxIDCode TaxStore="1" TaxIDType="Normal">0</TaxIDCode>
                <ComponentTax TxID="0" TxType="Normal" TxStartAmt="0.00" TxEndAmt="100000000.00" TxGlPayAcct="
                    ">0.000</ComponentTax>
            </TaxID>
            <ItemDepartment>
                <ItemDeptCode>APPAR</ItemDeptCode>
                <ItemDeptName>APPAR</ItemDeptName>
            </ItemDepartment>
            <ItemClass>
                <ItemClassCode>TEE</ItemClassCode>
                <ItemClassName>TEE-SHIRTS</ItemClassName>
            </ItemClass>
            <ItemSizeRun SizeRunCode="RUN" SizeRunName="">
                <SizeDef SizeLabel="">
                    <Size SizeLabel="XS" Sequence="0">XS</Size>
                    <Size SizeLabel="S" Sequence="1">S</Size>
                    <Size SizeLabel="M" Sequence="2">M</Size>
                    <Size SizeLabel="L" Sequence="3">L</Size>
                    <Size SizeLabel="XL" Sequence="4">XL</Size>
                    <Size SizeLabel="XXL" Sequence="5">XXL</Size>
                </SizeDef>
            </ItemSizeRun>
            <ProductID PID="">
                <ItemColor ColorCode="N/A" ColorName="">
                    <ItemSize MainSize="L">
                        <Sku>400100018514</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="M">
                        <Sku>400100018507</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="S">
                        <Sku>400100018491</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XL">
                        <Sku>400100018521</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>44.80</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XS">
                        <Sku>400100031711</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>0.00</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                    <ItemSize MainSize="XXL">
                        <Sku>400100035818</Sku>
                        <Pricing Currency="USD">
                            <ReplacementCost>44.80</ReplacementCost>
                            <AverageCost>0.00</AverageCost>
                            <LandedCost>0.00</LandedCost>
                            <CurrentRetail MarkDowns=" ">199.00</CurrentRetail>
                        </Pricing>
                    </ItemSize>
                </ItemColor>
            </ProductID>
        </CWItem>
    </ItemExportData>
</CWItemExport>

you get the following in the grammystext.txt file:

5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, L, 400100018477, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, M, 400100018460, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, S, 400100018453, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XL, 400100018484, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XS, 400100031704, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, XXL, 400100035801, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, L, 400100018514, 44.80, 199.00
5TH, 5TH SUN, AMM024-B105, CALVERY, 0, APPAR, APPAR, TEE, TEE-SHIRTS, M, 400100018507, 44.80, 199.00

Alternatively, you can add the required elements after reading the file:

require 'nokogiri'

f = File.new("grammystext.txt", "w+")
x = File.open("items.xml", "r")
xml = "<CWItemExport><ItemExportData>#{x.read}</ItemExportData></CWItemExport>"
doc = Nokogiri::XML(xml)
x.close

doc.xpath('//CWItemExport//ItemExportData//CWItem//ProductID//ItemColor//ItemSize').each_with_index do |item, i|
  f << item.parent.parent.parent.at_xpath('//CWVendor//VendorCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//CWVendor//VendorName').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemStyle').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDescription').content + ", "
  f << item.parent.parent.parent.at_xpath('//TaxID//TaxIDCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDepartment//ItemDeptCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemDepartment//ItemDeptName').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemClass//ItemClassCode').content + ", "
  f << item.parent.parent.parent.at_xpath('//ItemClass//ItemClassName').content + ", "

  f << item.attr("MainSize") + ", "
  f << item.at_xpath('Sku').content + ", "
  f << item.at_xpath('//ReplacementCost').content + ", "
  f << item.at_xpath('//CurrentRetail').content + "\n"

  puts item.parent.parent.parent if i == 6

  break if i == 7
end

f.close