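I am working with an unstructured (flat) XML document and need to convert it into a structured one. The unstructured document looks like this: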
<?xml version="1.0" encoding="UTF-8"?>
<CustomerInformation>
<CustomerPurchaseID>String</CustomerPurchaseID>
<MemberAddress>String</MemberAddress>
<MemberID>String</MemberID>
<MemberCity>String</MemberCity>
<MemberName>String</MemberName>
<MemberType>String</MemberType>
<MemberState>String</MemberState>
<MemberSince>String</MemberSince>
<PurchaseDate>String</PurchaseDate>
<CreditCardName></CreditCardName>
<CreditCardExpirration></CreditCardExpirration>
<Orders>
<LineItemCode>String</LineItemCode>
<LineItemID>String</LineItemID>
<LineItemDescription>String</LineItemDescription>
<DiscountCode>String</DiscountCode>
</Orders>
<Orders>
<LineItemCode>String</LineItemCode>
<LineItemID>String</LineItemID>
<LineItemDescription>String</LineItemDescription>
<DiscountCode>String</DiscountCode>
</Orders>
<ShipToAddress>String</ShipToAddress>
<ShipToCity>String</ShipToCity>
<ShipToFirstName>String</ShipToFirstName>
<ShipToLastName>String</ShipToLastName>
<ShipToState>String</ShipToState>
<ShipToZIPCode>String</ShipToZIPCode>
<CustomerAddressLine1>String</CustomerAddressLine1>
<CustomerAddressLine2>String</CustomerAddressLine2>
<CustomerID>String</CustomerID>
<CustomerCity>String</CustomerCity>
<CustomerEmail>String</CustomerEmail>
<CustomerFirstName>String</CustomerFirstName>
<CustomerLastName>String</CustomerLastName>
<CustomerHomePhone>String</CustomerHomePhone>
<CustomerState>String</CustomerState>
<CustomerZIP>String</CustomerZIP>
<Status>String</Status>
<OrderedFromName>String</OrderedFromName>
<CustomerIdentification></CustomerIdentification>
<PrimaryCustomerIndicator>String</PrimaryCustomerIndicator>
<OrderedFromAddressLine1Text>String</OrderedFromAddressLine1Text>
<OrderedFromAddressLine2Text>String</OrderedFromAddressLine2Text>
<OrderedFromCityName>String</OrderedFromCityName>
<OrderedFromStateCode>String</OrderedFromStateCode>
<OrderedFromZip5Code>String</OrderedFromZip5Code>
<OrderedFromZip4Code>String</OrderedFromZip4Code>
</CustomerInformation>
I need to convert it to the following:
<?xml version="1.0" encoding="UTF-8"?>
<xmlns:evt="http://www.metadata..com/Management/">
<Identifier>3442=000-MNNN</Identifier>
<TypeCode>Purchase History</TypeCode>
<TypeDescription>Order Summary</TypeDescription>
<PurposeCode>Invoice</PurposeCode>
<Member>
<Email>String</Email>
<MemberSince>03/23/2000</MemberSince>
<MemberType>
<MemberShipTypeCode>String</MemberShipTypeCode>
<TypeDescription>String</TypeDescription>
</MemberType>
<Address>
<AddressLine1Text>String</AddressLine1Text>
<AddressLine2Text>String</AddressLine2Text>
<CityName>String</CityName>
<StateCode>String</StateCode>
<Zip5Code>String</Zip5Code>
<Zip4Code>String</Zip4Code>
</Address>
<Telephone>
<AreaCode>String</AreaCode>
<TelephoneNumber>String</TelephoneNumber>
</Telephone>
</Member>
<Company>
<CompanyName>String</CompanyName>
<CustomerIdentification>0.0</CustomerIdentification>
<PrimaryCustomerIndicator>String</PrimaryCustomerIndicator>
<CompanyAddress>
<CompanyAddressLine1Text>String</CompanyAddressLine1Text>
<CompanyAddressLine2Text>String</CompanyAddressLine2Text>
<CompanyCityName>String</CompanyCityName>
<CompanyStateCode>String</CompanyStateCode>
<CompanyZip5Code>String</CompanyZip5Code>
<CompanyZip4Code>String</CompanyZip4Code>
</CompanyAddress>
</Company>
<Orders>
<CreditCard>
<CardName>String</CardName>
<CardExpirationDate>1967-08-13</CardExpirationDate>
</CreditCard>
<Order>
<Discount>String</Discount>
<ShippingVendorName>String</ShippingVendorName>
<ShipmentTrackingNumber>String</ShipmentTrackingNumber>
<ShipmentTrackingLinkText>String</ShipmentTrackingLinkText>
<CustomerName>String</CustomerName>
<CustomerEmailAddressText>String</CustomerEmailAddressText>
<Telephone>
<AreaCode>String</AreaCode>
<TelephoneNumber>String</TelephoneNumber>
</Telephone>
<ShippingAddress>
<ShippingAddressLine1Text>String</ShippingAddressLine1Text>
<ShippingAddressLine2Text>String</ShippingAddressLine2Text>
<ShippingCareOfText>String</ShippingCareOfText>
<ShippingCityName>String</ShippingCityName>
<ShippingStateCode>String</ShippingStateCode>
<ShippingZip5Code>String</ShippingZip5Code>
<ShippingZip4Code>String</ShippingZip4Code>
</ShippingAddress>
<LineItem>
<LineItemNumber>String</LineItemNumber>
<LineItemQuantityCount>0</LineItemQuantityCount>
<ItemOrderedIndicator>String</ItemOrderedIndicator>
<Discount>String</Discount>
</LineItem>
</Order>
</Orders>
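I am able to generate the XML by writing out the structured format by hand and pulling in the relevant fields with a plain XSLT value-of instruction, like this: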
<xsl:value-of select=.../>
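However, I feel there may be a better way to do this. I would like to control how the structure is generated while navigating the unstructured, flat document. For example, is there a way to group the elements for all the MemberAddress fields? If I could do that, I could create the Member section of the output, and I could do the same for the other elements. My concern with hard-coding the structured document is that it may change in the future, so I would prefer to control the output if possible. All member information in the source document should map to the Member element in the target document. Source elements whose names start with OrderedFrom should map to the Company fields in the target document. Likewise, the ShipTo elements should map to the shipping information in the Orders section of the target document, and so on. Please help!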
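Answer 0 (score: 1)
"My concern with hard-coding the structured document is that it may change in the future."
An XSLT stylesheet transforms data from one XML schema to another. It is unrealistic to expect that a change in either schema will not require rewriting the stylesheet.
"For example, is there a way to group the elements for all the MemberAddress fields?"
Yes, provided you have a way to identify them. For example, with the CustomerInformation element as the context node, you could do: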
<Member>
<xsl:for-each select="*[starts-with(name(), 'Member')]">
<xsl:element name="{substring-after(name(), 'Member')}">
<xsl:value-of select="." />
</xsl:element>
</xsl:for-each>
</Member>
to get:
<Member>
<Address>String</Address>
<ID>String</ID>
<City>String</City>
<Name>String</Name>
<Type>String</Type>
<State>String</State>
<Since>String</Since>
</Member>
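But this does not match your expected output. By the way, your expected output shows a lot of data that is not present in the input, such as the member's email.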
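To show how the prefix-based grouping above could sit inside a complete stylesheet, here is a minimal XSLT 1.0 sketch. It is only an illustration under a few assumptions: the wrapper element name PurchaseHistory and the named template group-by-prefix are made up for this example, and the ShippingAddress wrapper is just one possible target for the ShipTo elements. It does not produce your expected output either, since the target element names do not follow directly from the source names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- The flat source root; its children are grouped by name prefix. -->
  <xsl:template match="/CustomerInformation">
    <!-- "PurchaseHistory" is a hypothetical wrapper name for this sketch. -->
    <PurchaseHistory>
      <Member>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'Member'"/>
        </xsl:call-template>
      </Member>
      <Company>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'OrderedFrom'"/>
        </xsl:call-template>
      </Company>
      <ShippingAddress>
        <xsl:call-template name="group-by-prefix">
          <xsl:with-param name="prefix" select="'ShipTo'"/>
        </xsl:call-template>
      </ShippingAddress>
    </PurchaseHistory>
  </xsl:template>

  <!-- Hypothetical helper: copies every child of the context node whose
       name starts with $prefix, renaming it to the name minus the prefix.
       call-template does not change the context node, so "*" still selects
       the children of CustomerInformation here. -->
  <xsl:template name="group-by-prefix">
    <xsl:param name="prefix"/>
    <xsl:for-each select="*[starts-with(name(), $prefix)]">
      <xsl:element name="{substring-after(name(), $prefix)}">
        <xsl:value-of select="."/>
      </xsl:element>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the input in the question, the Company group would contain Name, AddressLine1Text, AddressLine2Text, CityName, StateCode, Zip5Code and Zip4Code, in that order. Note that an element named exactly like the prefix (for example a bare Member element) would yield an empty element name and raise an error, so the helper assumes every matching name is longer than its prefix. Anything that needs renaming or nesting beyond stripping a prefix still has to be mapped explicitly.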