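I am parsing XML with SAX and it has worked well so far. However, this XML uses a Microsoft schema called rowset, so the columns are named C0, C1, C2, and so on, while the actual column names are stored in the schema, which sits in the same XML file.
See the following example: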
<?xml version="1.0" encoding="utf-8"?>
<RS xmlns="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:saw-sql="urn:saw-sql" targetNamespace="urn:schemas-microsoft-com:xml-analysis:rowset">
<xsd:complexType name="R">
<xsd:sequence>
<xsd:element name="C0" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Opportunity ID" saw-sql:columnID="c55007aeb4fd2c909"/>
<xsd:element name="C1" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Opportunity Name" saw-sql:columnID="c9e466deb3ad5a949"/>
<xsd:element name="C2" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Closed Flag" saw-sql:columnID="c4f1b71539820e65c"/>
<xsd:element name="C3" type="xsd:dateTime" minOccurs="0" maxOccurs="1" saw-sql:type="timestamp" saw-sql:columnHeading="Actual Close Date" saw-sql:columnID="c007ad0eb3ebae22d"/>
<xsd:element name="C4" type="xsd:dateTime" minOccurs="0" maxOccurs="1" saw-sql:type="timestamp" saw-sql:columnHeading="Creation Date" saw-sql:columnID="c8bf417f59e742c63"/>
<xsd:element name="C5" type="xsd:double" minOccurs="0" maxOccurs="1" saw-sql:type="double" saw-sql:columnHeading="Line Win Probability" saw-sql:columnID="c49c9eb926449bccc"/>
<xsd:element name="C6" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="First Name" saw-sql:columnID="c7c2e5dc4832c9e81"/>
<xsd:element name="C7" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Last Name" saw-sql:columnID="c0fc239ab56275027"/>
<xsd:element name="C8" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Base Resource Name" saw-sql:columnID="c7c53a4daa37f1637"/>
<xsd:element name="C9" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Level - 1 Resource Name" saw-sql:columnID="ccb4a337a024e6838"/>
<xsd:element name="C10" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Level - 2 Resource Name" saw-sql:columnID="cfbde2b815500dd87"/>
<xsd:element name="C11" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Level - 3 Resource Name" saw-sql:columnID="cef1347354936a6a8"/>
<xsd:element name="C12" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Level - 4 Resource Name" saw-sql:columnID="cbdc3fbd438cafb0f"/>
<xsd:element name="C13" type="xsd:string" minOccurs="1" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Current Top-Level Resource Name" saw-sql:columnID="c7d7bc50e00cb94ae"/>
<xsd:element name="C14" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Name" saw-sql:columnID="c08c8bd262c9cfaac"/>
<xsd:element name="C15" type="xsd:int" minOccurs="0" maxOccurs="1" saw-sql:type="integer" saw-sql:columnHeading="# Days in Stage" saw-sql:columnID="c5b0dbd396ffe3a62"/>
<xsd:element name="C16" type="xsd:double" minOccurs="0" maxOccurs="1" saw-sql:type="double" saw-sql:columnHeading="Revenue" saw-sql:columnID="cdc65489635ccdda0"/>
<xsd:element name="C17" type="xsd:dateTime" minOccurs="0" maxOccurs="1" saw-sql:type="timestamp" saw-sql:columnHeading="Close Date" saw-sql:columnID="c4bd9fe19ff33072d"/>
<xsd:element name="C18" type="xsd:string" minOccurs="0" maxOccurs="1" saw-sql:type="varchar" saw-sql:columnHeading="Opportunity Number" saw-sql:columnID="c6429e0148a929b79"/>
</xsd:sequence>
</xsd:complexType>
</xsd:schema>
<R>
<C0>100000009829205</C0>
<C1>Upgrading Network Servers</C1>
<C2>Y</C2>
<C3>2013-01-16T00:00:00</C3>
<C4>2012-05-01T00:00:00</C4>
<C5>100</C5>
<C6>Lisa</C6>
<C7>Jones</C7>
<C8>Pepe papas</C8>
<C9>Pepe papas</C9>
<C10>Pepe papas</C10>
<C11>Pepe papas</C11>
<C12>Pepe papas</C12>
<C13>Perro Hunter</C13>
<C14>Amazing customer (SEATTLE, US)</C14>
<C15></C15>
<C16></C16>
<C17></C17>
<C18>RUP3-Opty-1</C18>
</R>
...
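As you can see, the actual column names are stored in the schema's saw-sql:columnHeading attribute. I can match the name attribute against the column tag name, but I don't know how to access the schema from the SAX handler.
So far I have this:
try {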
    SAXParserFactory factory = SAXParserFactory.newInstance();
    SAXParser saxParser = factory.newSAXParser();
    DefaultHandler handler = new DefaultHandler() {
        // Called once per opening tag, in document order (schema elements included).
        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes attributes) throws SAXException {
            System.out.println("Column: " + qName);
        }

        // May be called several times for one element's text, and also for whitespace.
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            String value = new String(ch, start, length);
            System.out.println("value: " + value);
        }
    };
    // A character stream is already decoded, so no encoding needs to be set on it.
    InputSource is = new InputSource(new StringReader(xmlContent));
    saxParser.parse(is, handler);
} catch (Exception e) {
    e.printStackTrace();
}
Any ideas?
Answer 0 (score: 1)
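The SAX handler reports startElement events in document order. The first is for RS, then xsd:schema, then xsd:complexType, and so on. You need to track each event and write the logic that works out where in the document you currently are. SAX is a low-level parser and does not retain the document structure by default.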
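One way to keep track of where you are is to maintain a stack of open elements. The following is only a minimal sketch: the class name `PathTracker` and the method `elementPaths` are made up for illustration, and it assumes the default (non-namespace-aware) parser, so `qName` carries the prefixed tag name as it appears in the file.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records the path of every element as SAX reports it.
class PathTracker {
    /** Returns one path per startElement event, in document order. */
    static List<String> elementPaths(String xml) {
        List<String> paths = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // currently open elements, innermost first
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                open.push(qName);
                // Walk the stack from outermost to innermost to build "/RS/R/C0".
                StringBuilder path = new StringBuilder();
                open.descendingIterator().forEachRemaining(name -> path.append('/').append(name));
                paths.add(path.toString());
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.pop(); // leaving the innermost element
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return paths;
    }
}
```

With the path available, the handler can tell a column declaration under xsd:schema apart from a data cell under R.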
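I would expect you to keep an indexed local data structure (perhaps a table) for the schema, adding one entry per column. Then, for each row you read, you look up the details in that structure (for example via getColumn("C1")).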
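As a rough sketch of that idea, the handler below fills a map from column name to heading while it passes through the schema, then uses the map when it reaches the rows. The names `RowsetReader` and `readRows` are hypothetical. It assumes the default non-namespace-aware parser and that the file uses the literal prefixes `xsd:` and `saw-sql:` shown in the question; a namespace-aware version would compare URIs and local names instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: returns each R row keyed by its column heading.
class RowsetReader {
    static List<Map<String, String>> readRows(String xml) {
        Map<String, String> headings = new HashMap<>(); // e.g. "C0" -> "Opportunity ID"
        List<Map<String, String>> rows = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Map<String, String> row;  // non-null while inside <R>
            private String column;            // non-null while inside a cell such as <C0>
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals("xsd:element")) {
                    // Schema part: remember the heading declared for this column.
                    String heading = attrs.getValue("saw-sql:columnHeading");
                    if (heading != null) {
                        headings.put(attrs.getValue("name"), heading);
                    }
                } else if (qName.equals("R")) {
                    row = new LinkedHashMap<>();
                } else if (row != null) {
                    column = qName;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text can arrive in several chunks, so accumulate it.
                if (column != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("R")) {
                    rows.add(row);
                    row = null;
                } else if (column != null) {
                    // Fall back to the raw tag name when the schema has no heading.
                    row.put(headings.getOrDefault(column, column), text.toString());
                    column = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return rows;
    }
}
```

This works in a single pass only because the schema appears before the rows in the file, as it does in the example from the question.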
For small problems, you may prefer to read the whole document into a DOM (I use XOM), where XPath makes the references easy to write.
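To illustrate the DOM route, here is a small sketch that uses the DOM and XPath classes bundled with the JDK rather than XOM, purely to keep the example self-contained. The class `RowsetXPath` and its methods are hypothetical names. The expressions match on `local-name()` so that no namespace context has to be registered, and they assume the column name comes from trusted code, since it is concatenated into the expression.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: loads the rowset into a DOM and queries it with XPath.
class RowsetXPath {
    /** Reads the whole document into memory. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for local-name() to drop prefixes
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Returns the heading declared for a column such as "C1", or "" if none. */
    static String heading(Document doc, String column) {
        return evaluate(doc, "//*[local-name()='element'][@name='" + column
                + "']/@*[local-name()='columnHeading']");
    }

    /** Returns the text of one cell; rowNumber starts at 1, as in XPath. */
    static String cell(Document doc, int rowNumber, String column) {
        return evaluate(doc, "(//*[local-name()='R'])[" + rowNumber
                + "]/*[local-name()='" + column + "']");
    }

    private static String evaluate(Document doc, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // The String overload returns the string value of the first match.
            return xpath.evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }
}
```

The trade-off is memory: the whole file is held in a tree, which is convenient for small documents but less suitable for very large rowsets.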