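Nokogiri XPath does not find certain nodes

Time: 2012-12-02 15:08:35

Tags: ruby xml nokogiri

I am using Nokogiri to modify an existing XML document, but I am having trouble selecting certain nodes.

Here is the relevant excerpt of the XML: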

<ProductCatalog>
  <ProductLineItem>
    <updi:ProductIdentification>
      <updi:ProductName>800-22283-03</updi:ProductName>

I can find the following two nodes:

doc.xpath("//updi:ProductIdentification") => #<Nokogiri::XML...
doc.xpath("//updi:ProductName") => #<Nokogiri::XML...

However, if I try to select one of the nodes higher up:

doc.xpath("//ProductLineItem") => []

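I get back an empty array. It seems to be related to the prefix: I can find any element that has a prefix, but none of the elements without one.

Update: here are the (rather lengthy) namespace declarations: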

xsi:schemaLocation="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00 ..\..\XML\Interchange\ProductCatalogInformationDistribution_01_00.xsd"
xmlns:dplcs="urn:rosettanet:specification:domain:Design:ProductLifeCycleStatusCode:xsd:codelist:01.03"
xmlns:rrt="urn:rosettanet:specification:domain:Shared:RateType:xsd:codelist:01.01" 
xmlns:dl="urn:rosettanet:specification:domain:Logistics:xsd:schema:02.15" 
xmlns:ictc="urn:rosettanet:specification:domain:Design:CatalogType:xsd:codelist:01.00" 
xmlns:updi="urn:rosettanet:specification:universal:ProductIdentification:xsd:schema:01.04" 
xmlns:dddt="urn:rosettanet:specification:domain:Design:DateType:xsd:codelist:01.00" 
xmlns:dsdc="urn:rosettanet:specification:domain:Logistics:ShipDateCode:xsd:codelist:01.03" 
xmlns:ucr="urn:rosettanet:specification:universal:Currency:xsd:codelist:01.02" 
xmlns:dpiac="urn:rosettanet:specification:domain:Logistics:PortIdentifierAuthorityCode:xsd:codelist:01.03" 
xmlns:rptc="urn:rosettanet:specification:domain:Shared:PricingTypeCode:xsd:codelist:01.03" 
xmlns:dit="urn:rosettanet:specification:domain:Procurement:InventoryType:xsd:codelist:01.03" 
xmlns:dtt="urn:rosettanet:specification:domain:Procurement:TransactionType:xsd:codelist:01.04" 
xmlns:upd="urn:rosettanet:specification:universal:PhysicalDimension:xsd:schema:01.05" 
xmlns:dcst="urn:rosettanet:specification:domain:Logistics:CustomsType:xsd:codelist:01.03" 
xmlns:dsd="urn:rosettanet:specification:domain:Logistics:ShippingDocument:xsd:codelist:01.02" 
xmlns:uci="urn:rosettanet:specification:universal:ContactInformation:xsd:schema:01.03" 
xmlns:dpcm="urn:rosettanet:specification:domain:Procurement:PurchaseMethod:xsd:codelist:01.03" 
xmlns:rpsc="urn:rosettanet:specification:domain:Shared:ProductStatusCode:xsd:codelist:01.01" 
xmlns:dgrc="urn:rosettanet:specification:domain:Marketing:GeographicRegionCode:xsd:codelist:01.02" 
xmlns:dtrt="urn:rosettanet:specification:domain:Logistics:TrackingReferenceType:xsd:codelist:01.06" 
xmlns:umtq="urn:rosettanet:specification:universal:MimeTypeQualifier:xsd:codelist:01.02" 
xmlns:dcrt="urn:rosettanet:specification:domain:Procurement:CustomerType:xsd:codelist:01.03" 
xmlns:dscd="urn:rosettanet:specification:domain:Logistics:ShipmentChangeDisposition:xsd:codelist:01.03" 
xmlns:uc="urn:rosettanet:specification:universal:Country:xsd:codelist:01.02" 
xmlns="urn:rosettanet:specification:interchange:ProductCatalogInformationDistribution:xsd:schema:01.00" 
xmlns:dpc="urn:rosettanet:specification:domain:Procurement:PaymentCondition:xsd:codelist:01.03" 
xmlns:rpmt="urn:rosettanet:specification:domain:Shared:PaymentType:xsd:codelist:01.01" 
xmlns:dft="urn:rosettanet:specification:domain:Procurement:FinanceTerms:xsd:codelist:01.03" 
xmlns:dtq="urn:rosettanet:specification:domain:Procurement:TotalQualifier:xsd:codelist:01.03" 
xmlns:ume="urn:rosettanet:specification:universal:MonetaryExpression:xsd:schema:01.04" 
xmlns:dcp="urn:rosettanet:specification:domain:Design:Compliant:xsd:codelist:01.02" 
xmlns:drsc="urn:rosettanet:specification:domain:Marketing:RegistrationStatusCode:xsd:codelist:01.03" 
xmlns:uat="urn:rosettanet:specification:universal:AbstractType:xsd:schema:01.02" 
xmlns:dp="urn:rosettanet:specification:domain:Procurement:xsd:schema:02.17" 
xmlns:rpm="urn:rosettanet:specification:domain:Shared:PaymentMethod:xsd:codelist:01.02" 
xmlns:dfrt="urn:rosettanet:specification:domain:Procurement:ForecastReferenceType:xsd:codelist:01.03" 
xmlns:dtec="urn:rosettanet:specification:domain:Procurement:TaxExemptionCode:xsd:codelist:01.03" 
xmlns:ulc="urn:rosettanet:specification:universal:Locations:xsd:schema:01.04" 
xmlns:dccc="urn:rosettanet:specification:domain:Procurement:CreditCardClassification:xsd:codelist:01.03" 
xmlns:drlc="urn:rosettanet:specification:domain:Logistics:ReturnLabelCode:xsd:codelist:01.03" 
xmlns:st="http://www.ascc.net/xml/schematron" 
xmlns:dnecc="urn:rosettanet:specification:domain:Logistics:NationalExportControlClassification:xsd:codelist:01.03" 
xmlns:rpktc="urn:rosettanet:specification:domain:Shared:PackageTypeCode:xsd:codelist:01.01" 
xmlns:uwt="urn:rosettanet:specification:universal:WeightType:xsd:codelist:01.01" 
xmlns:dfpt="urn:rosettanet:specification:domain:Logistics:FreightPaymentTerms:xsd:codelist:01.03" 
xmlns:dte="urn:rosettanet:specification:domain:Procurement:TransportEvent:xsd:codelist:01.03" 
xmlns:ul="urn:rosettanet:specification:universal:Language:xsd:codelist:01.02" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:dbpq="urn:rosettanet:specification:domain:Procurement:BookPriceQualifier:xsd:codelist:01.04" 
xmlns:drl="urn:rosettanet:specification:domain:Logistics:RouteLocation:xsd:codelist:01.03" 
xmlns:ssdh="urn:rosettanet:specification:system:StandardDocumentHeader:xsd:schema:01.16" 
xmlns:dmk="urn:rosettanet:specification:domain:Marketing:xsd:schema:02.12" 
xmlns:rmat="urn:rosettanet:specification:domain:Shared:MonetaryAmountType:xsd:codelist:01.01" 
xmlns:uuom="urn:rosettanet:specification:universal:UnitOfMeasure:xsd:codelist:01.03" 
xmlns:dfe="urn:rosettanet:specification:domain:Procurement:ForecastEvent:xsd:codelist:01.03" 
xmlns:dst="urn:rosettanet:specification:domain:Procurement:ShipmentTerms:xsd:codelist:01.03" 
xmlns:udt="urn:rosettanet:specification:universal:DataType:xsd:schema:01.04" 
xmlns:dacc="urn:rosettanet:specification:domain:Procurement:AccountClassification:xsd:codelist:01.03" 
xmlns:dptt="urn:rosettanet:specification:domain:Logistics:PortType:xsd:codelist:01.03" 
xmlns:sha="urn:rosettanet:specification:domain:Shared:xsd:schema:01.10" 
xmlns:dlv="urn:rosettanet:specification:domain:Design:Level:xsd:codelist:01.02" 
xmlns:rict="urn:rosettanet:specification:domain:Shared:InvoiceChargeType:xsd:codelist:01.02" 
xmlns:utt="urn:rosettanet:specification:universal:TaxType:xsd:codelist:01.02" 
xmlns:ddwsr="urn:rosettanet:specification:domain:Marketing:DesignWinStatusReason:xsd:codelist:01.03" 
xmlns:dsm="urn:rosettanet:specification:domain:Logistics:ShipmentMode:xsd:codelist:01.05" 
xmlns:udct="urn:rosettanet:specification:universal:DocumentType:xsd:codelist:01.09" 
xmlns:dac="urn:rosettanet:specification:domain:Design:ActionCode:xsd:codelist:01.03" 
xmlns:dpsr="urn:rosettanet:specification:domain:Procurement:ProductSubstitutionReason:xsd:codelist:01.03" 
xmlns:sft="urn:rosettanet:specification:system:TPIRFileType:xsd:codelist:01.01" 
xmlns:dltcc="urn:rosettanet:specification:domain:Procurement:LeadTimeClassificationCode:xsd:codelist:01.03" 
xmlns:ri="urn:rosettanet:specification:domain:Shared:Interval:xsd:codelist:01.01" 
xmlns:urss="urn:rosettanet:specification:system:xml:1.0" 
xmlns:dds="urn:rosettanet:specification:domain:Design:xsd:schema:02.15" 
xmlns:dslt="urn:rosettanet:specification:domain:Procurement:SaleType:xsd:codelist:01.04" 
xmlns:udc="urn:rosettanet:specification:universal:Document:xsd:schema:01.08" 
xmlns:dabcc="urn:rosettanet:specification:domain:Design:ABCCode:xsd:codelist:01.02" 
xmlns:dppt="urn:rosettanet:specification:domain:Procurement:ProductProcurementType:xsd:codelist:01.03" 
xmlns:rwtc="urn:rosettanet:specification:domain:Shared:WarrantyType:xsd:codelist:01.01" 
xmlns:dlit="urn:rosettanet:specification:domain:Logistics:InstructionType:xsd:codelist:01.00" 
xmlns:rfob="urn:rosettanet:specification:domain:Shared:FreeOnBoard:xsd:codelist:01.01" 
xmlns:upri="urn:rosettanet:specification:universal:ProcessRoleIdentifier:xsd:codelist:01.08" 
xmlns:ddrn="urn:rosettanet:specification:domain:Marketing:DesignRegistrationNotification:xsd:codelist:01.02" 
xmlns:dsh="urn:rosettanet:specification:domain:Procurement:SpecialHandling:xsd:codelist:01.04" 
xmlns:ud="urn:rosettanet:specification:universal:Dates:xsd:schema:01.03" 
xmlns:dpms="urn:rosettanet:specification:domain:Marketing:ProjectMarketSegment:xsd:codelist:01.02" 
xmlns:rssl="urn:rosettanet:specification:domain:Shared:ShippingServiceLevel:xsd:codelist:01.01" 
xmlns:dldr="urn:rosettanet:specification:domain:Logistics:LotDiscrepancyReason:xsd:codelist:01.03" 
xmlns:rat="urn:rosettanet:specification:domain:Shared:AmountType:xsd:codelist:01.02" 
xmlns:upi="urn:rosettanet:specification:universal:PartnerIdentification:xsd:schema:01.12" 
xmlns:ddp="urn:rosettanet:specification:domain:Marketing:Disposition:xsd:codelist:01.02" 
xmlns:dsfr="urn:rosettanet:specification:domain:Procurement:SpecialFulfillmentRequest:xsd:codelist:01.03" 
xmlns:ucs="urn:rosettanet:specification:universal:CountrySubdivision:xsd:codelist:01.02"

1 answer:

Answer 0 (score: 8)

The simplest quick fix is to remove the namespaces from the document:

require 'nokogiri'
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>"

p xml.xpath('//b').length     #=> 0
p xml.xpath('//bar:b').length #=> 1
p xml.xpath('//a').length     #=> 0
xml.remove_namespaces!
p xml.xpath('//a').length     #=> 1
p xml.xpath('//b').length     #=> 1

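However, if you need to keep the namespaces (for example, because you modify the document and save it again, or because element or attribute names collide across namespaces), the approach above is not a valid solution. If you cannot strip the namespaces, you can make up a prefix and tell Nokogiri which namespace it corresponds to...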

xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'><a/><bar:b /></root>"
p xml.xpath('//x:a','x'=>'foo').length  #=> 1

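...where the string foo is the URI of the document's default namespace (usually declared on the root element), and the string x is any prefix you like, provided it does not conflict with another namespace prefix already declared in your document. Or, more simply, you can use xmlns as the prefix for the default namespace: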

p xml.xpath('//xmlns:a').length  #=> 1

Alternatively, if you need to keep the namespaces and can construct a reasonable CSS-style selector for the nodes you need, you can use the css method:

require 'nokogiri'
xml = Nokogiri.XML "<root xmlns='foo' xmlns:bar='whee'>
  <a/>
  <bar:b />
  <c xmlns='jim'><d/></c>
</root>"

p xml.css('a').length, #=> 1
  xml.css('b').length, #=> 0
  xml.css('c').length, #=> 0
  xml.css('d').length  #=> 0

As the output above shows, note that this only works for nodes in the same namespace as the root element.