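Which approach would you use to parse a large XML file (2 MB to 20 MB or more) that has no schema? The file structure is odd enough that I cannot infer a schema with XSD.exe (see the snippet below).
Options:
1) XML deserialization (but, as noted above, I have no schema and the XSD tool complains about the file contents), 2) LINQ to XML, 3) loading into an XmlDocument, 4) using XmlReader and similar.
Here is a snippet of the XML file: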
<?xml version="1.0" encoding="utf-8"?>
<xmlData date="29.04.2010 12:09:13">
<Table>
<ident>079186</ident>
<stock>0</stock>
<pricewotax>33.94000000</pricewotax>
<discountpercent>0.00000000</discountpercent>
</Table>
<Table>
<ident>079190</ident>
<stock>1</stock>
<pricewotax>10.50000000</pricewotax>
<discountpercent>0.00000000</discountpercent>
<pricebyquantity>
<Table>
<quantity>5</quantity>
<pricewotax>10.00000000</pricewotax>
<discountpercent>0.00000000</discountpercent>
</Table>
<Table>
<quantity>8</quantity>
<pricewotax>9.00000000</pricewotax>
<discountpercent>0.00000000</discountpercent>
</Table>
</pricebyquantity>
</Table>
</xmlData>
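Answer 0 (score: 0)
I would load it into an XmlDocument and then use XPath to process it as needed. LINQ may well be the best option, but I am not very familiar with it, so I cannot say.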
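As a rough illustration of the XmlDocument plus XPath route, here is a minimal sketch. The class name `XPathSketch` and the method `ReadPrices` are made up for this example, and it assumes the element names shown in the question's snippet. `ident` is kept as a string so that leading zeros such as `079186` survive.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

public static class XPathSketch
{
    // Hypothetical helper: maps ident -> pricewotax for every top-level Table.
    public static Dictionary<string, decimal> ReadPrices(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml); // for a file on disk, doc.Load(path) could be used instead

        var prices = new Dictionary<string, decimal>();

        // "/xmlData/Table" matches only direct children of the root, so the
        // nested Table elements under pricebyquantity are not selected.
        foreach (XmlNode table in doc.SelectNodes("/xmlData/Table"))
        {
            string ident = table.SelectSingleNode("ident").InnerText;
            string price = table.SelectSingleNode("pricewotax").InnerText;

            // Invariant culture so "33.94000000" parses the same on any machine.
            prices[ident] = decimal.Parse(price, CultureInfo.InvariantCulture);
        }
        return prices;
    }
}
```

This loads the whole document into memory, which is usually acceptable at 2 MB to 20 MB but grows with file size.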
Answer 1 (score: 0)
Here is the XSD:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="xmlData">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Table">
<xs:complexType>
<xs:sequence>
<xs:element name="ident" type="xs:int" />
<xs:element name="stock" type="xs:int" />
<xs:element name="pricewotax" type="xs:double" />
<xs:element name="discountpercent" type="xs:double" />
<xs:element minOccurs="0" name="pricebyquantity">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Table">
<xs:complexType>
<xs:sequence>
<xs:element name="quantity" type="xs:int" />
<xs:element name="pricewotax" type="xs:double" />
<xs:element name="discountpercent" type="xs:double" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="date" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
</xs:schema>
Here is the serializable class:
//------------------------------------------------------------------------------
// <auto-generated>
// This code was generated by a tool.
// Runtime Version:2.0.50727.3603
//
// Changes to this file may cause incorrect behavior and will be lost if
// the code is regenerated.
// </auto-generated>
//------------------------------------------------------------------------------
//
// This source code was auto-generated by xsd, Version=2.0.50727.1432.
//
namespace StockInfo {
using System.Xml.Serialization;
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.1432")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
[System.Xml.Serialization.XmlRootAttribute(Namespace="", IsNullable=false)]
public partial class xmlData {
private xmlDataTable[] tableField;
private string dateField;
/// <remarks/>
[System.Xml.Serialization.XmlElementAttribute("Table")]
public xmlDataTable[] Table {
get {
return this.tableField;
}
set {
this.tableField = value;
}
}
/// <remarks/>
[System.Xml.Serialization.XmlAttributeAttribute()]
public string date {
get {
return this.dateField;
}
set {
this.dateField = value;
}
}
}
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.1432")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
public partial class xmlDataTable {
private int identField;
private int stockField;
private double pricewotaxField;
private double discountpercentField;
private xmlDataTableTable[] pricebyquantityField;
/// <remarks/>
public int ident {
get {
return this.identField;
}
set {
this.identField = value;
}
}
/// <remarks/>
public int stock {
get {
return this.stockField;
}
set {
this.stockField = value;
}
}
/// <remarks/>
public double pricewotax {
get {
return this.pricewotaxField;
}
set {
this.pricewotaxField = value;
}
}
/// <remarks/>
public double discountpercent {
get {
return this.discountpercentField;
}
set {
this.discountpercentField = value;
}
}
/// <remarks/>
[System.Xml.Serialization.XmlArrayItemAttribute("Table", IsNullable=false)]
public xmlDataTableTable[] pricebyquantity {
get {
return this.pricebyquantityField;
}
set {
this.pricebyquantityField = value;
}
}
}
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.1432")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true)]
public partial class xmlDataTableTable {
private int quantityField;
private double pricewotaxField;
private double discountpercentField;
/// <remarks/>
public int quantity {
get {
return this.quantityField;
}
set {
this.quantityField = value;
}
}
/// <remarks/>
public double pricewotax {
get {
return this.pricewotaxField;
}
set {
this.pricewotaxField = value;
}
}
/// <remarks/>
public double discountpercent {
get {
return this.discountpercentField;
}
set {
this.discountpercentField = value;
}
}
}
}
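To show how classes like the ones above are typically consumed, here is a minimal sketch built on `XmlSerializer`. The helper class `XmlLoader` and its methods are hypothetical names introduced for this example; the helper is generic so that it does not depend on any particular generated type.

```csharp
using System.IO;
using System.Xml.Serialization;

public static class XmlLoader
{
    // Deserializes one XML document from the reader into an instance of T.
    public static T Load<T>(TextReader reader)
    {
        var serializer = new XmlSerializer(typeof(T));
        return (T)serializer.Deserialize(reader);
    }

    // Same as above, but reads from a file. Passing a Stream lets the XML
    // parser honor the encoding named in the XML declaration.
    public static T LoadFile<T>(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            var serializer = new XmlSerializer(typeof(T));
            return (T)serializer.Deserialize(stream);
        }
    }
}
```

Assuming the generated `StockInfo.xmlData` class is compiled into the same project, a call such as `XmlLoader.LoadFile<StockInfo.xmlData>(path)` should return the whole document as an object graph, with the rows available through its `Table` property.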
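One thing to note: deserialization may not be the most efficient way to parse a 20 MB file. XmlReader is probably the fastest approach, but it means doing the work by hand.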
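For comparison, a hand-written XmlReader pass might look like the sketch below. The names `StreamingSketch` and `ReadPrices` are invented for this example, and the depth numbers assume the layout in the question's snippet (root `xmlData`, then `Table`, then its fields). It yields one ident/price pair per top-level `Table` and ignores the nested price tiers.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

public static class StreamingSketch
{
    // Hypothetical helper: streams (ident, pricewotax) pairs without
    // building the whole document in memory.
    public static IEnumerable<KeyValuePair<string, decimal>> ReadPrices(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            string field = null;   // name of the depth-2 element being read
            string ident = null;
            decimal price = 0m;    // stays 0 if a Table has no pricewotax

            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Depth 0 = xmlData, 1 = top-level Table, 2 = its fields.
                        if (reader.Depth == 2) field = reader.Name;
                        break;

                    case XmlNodeType.Text:
                        // Text directly inside a depth-2 element sits at depth 3;
                        // values inside nested Tables are deeper and are skipped.
                        if (reader.Depth == 3)
                        {
                            if (field == "ident")
                                ident = reader.Value;
                            else if (field == "pricewotax")
                                price = decimal.Parse(reader.Value, CultureInfo.InvariantCulture);
                        }
                        break;

                    case XmlNodeType.EndElement:
                        // Closing a top-level Table: emit the record and reset.
                        if (reader.Depth == 1 && reader.Name == "Table")
                        {
                            yield return new KeyValuePair<string, decimal>(ident, price);
                            ident = null;
                            price = 0m;
                        }
                        break;
                }
            }
        }
    }
}
```

The loop only ever calls `Read()`, which avoids the common pitfall of mixing `Read()` with methods that advance the reader themselves and thereby skipping elements when the file has no whitespace between tags.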