How to unmarshal XML that contains DIFFGR

Date: 2013-01-09 11:20:44

Tags: java xml jaxb unmarshalling

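I am new to JAXB and am trying to unmarshal an XML document. I used the xjc command to generate the NewDataSet and ObjectFactory classes from the following XSD file: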

<xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="NewDataSet">
    <xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:UseCurrentLocale="true">
        <xs:complexType>
            <xs:choice minOccurs="0" maxOccurs="unbounded">
                <xs:element name="Table">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="AUTHOR" type="xs:string" minOccurs="0"/>
                            <xs:element name="TITLE" type="xs:string" minOccurs="0"/>
                            <xs:element name="ISBN" type="xs:string" minOccurs="0"/>
                        </xs:sequence>
                    </xs:complexType>
                </xs:element>
            </xs:choice>
        </xs:complexType>
    </xs:element>
</xs:schema>

The generated NewDataSet class is as follows:

package generated;

import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;


@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
    "table"
})
@XmlRootElement(name = "NewDataSet")
public class NewDataSet {

    @XmlElement(name = "Table")
    protected List<NewDataSet.Table> table;

    public List<NewDataSet.Table> getTable() {
        if (table == null) {
            table = new ArrayList<NewDataSet.Table>();
        }
        return this.table;
    }


    @XmlAccessorType(XmlAccessType.FIELD)
    @XmlType(name = "", propOrder = {
        "author",
        "title",
        "isbn"
    })
    public static class Table {

        @XmlElement(name = "AUTHOR")
        protected String author;
        @XmlElement(name = "TITLE")
        protected String title;
        @XmlElement(name = "ISBN")
        protected String isbn;

        public String getAUTHOR() {
            return author;
        }

        public void setAUTHOR(String value) {
            this.author = value;
        }

        public String getTITLE() {
            return title;
        }
        public void setTITLE(String value) {
            this.title = value;
        }

        public String getISBN() {
            return isbn;
        }

        public void setISBN(String value) {
            this.isbn = value;
        }

    }

}

The ObjectFactory is:

package generated;

import javax.xml.bind.annotation.XmlRegistry;


@XmlRegistry
public class ObjectFactory {

    public ObjectFactory() {
    }

    public NewDataSet createNewDataSet() {
        return new NewDataSet();
    }

    public NewDataSet.Table createNewDataSetTable() {
        return new NewDataSet.Table();
    }

}

The XML file I want to unmarshal is as follows:

<NewDataSet xmlns="">
    <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
        <Table diffgr:id="Table1" msdata:rowOrder="0">
            <AUTHOR>Kubo Tite</AUTHOR>
            <TITLE>Bleach</TITLE>
            <ISBN>1234456</ISBN>
        </Table>
        <Table diffgr:id="Table2" msdata:rowOrder="2">
            <AUTHOR>Masashi Kishimoto</AUTHOR>
            <TITLE>Naruto</TITLE>
            <ISBN>435345</ISBN>
        </Table>
        <Table diffgr:id="Table3" msdata:rowOrder="3">
            <AUTHOR>Eiichiro Oda</AUTHOR>
            <TITLE>One Piece</TITLE>
            <ISBN>56767</ISBN>
        </Table>
    </diffgr:diffgram>
</NewDataSet>

The code that performs the unmarshalling is as follows:

package consume;

import generated.NewDataSet;
import generated.NewDataSet.Table;

import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;

public class UnmarshallingCode {
    public static void main(String[] args) {
        try{
            File file=new File("data.xml");
            JAXBContext jb=JAXBContext.newInstance(NewDataSet.class);

            Unmarshaller unmarshaller=jb.createUnmarshaller();
            NewDataSet newDataSet=(NewDataSet)unmarshaller.unmarshal(file);
            for(Table t: newDataSet.getTable()){
                System.out.println(t.getTITLE());
            }
        }catch(JAXBException e){
            e.printStackTrace();
        }
    }
}

The code above raises no errors, but it also produces no output: the call to newDataSet.getTable() returns an empty list.

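However, if I remove <diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"> and its related attributes from the XML file, the code above works and prints every title in the XML document. That is not an acceptable solution, though, because the XML document is very large and I will most likely not be allowed to make major changes to its structure. How can I unmarshal the XML file above? Please advise.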

1 answer:

Answer 0 (score: 10)

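The XML document you are trying to unmarshal does not conform to your XML schema: the schema declares Table as a direct child of NewDataSet, whereas the document wraps the Table elements in a diffgr:diffgram element. That is why your JAXB (JSR-222) implementation does not populate the data as expected, and why it works once you remove the diffgr element.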
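To make the mismatch concrete, here is a minimal StAX sketch that reports the parent element of every Table in a document. The class name TableParentCheck, the method parentsOf and the one-row SAMPLE string are hypothetical and only for illustration; the sample is a trimmed copy of the XML from the question.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the question or the answer.
class TableParentCheck {

    // Trimmed copy of the question's document (one row only).
    static final String SAMPLE =
        "<NewDataSet xmlns=\"\">"
        + "<diffgr:diffgram xmlns:msdata=\"urn:schemas-microsoft-com:xml-msdata\""
        + " xmlns:diffgr=\"urn:schemas-microsoft-com:xml-diffgram-v1\">"
        + "<Table diffgr:id=\"Table1\" msdata:rowOrder=\"0\">"
        + "<AUTHOR>Kubo Tite</AUTHOR><TITLE>Bleach</TITLE><ISBN>1234456</ISBN>"
        + "</Table>"
        + "</diffgr:diffgram>"
        + "</NewDataSet>";

    // Returns the local name of the parent of each element called childName.
    static List<String> parentsOf(String xml, String childName) {
        List<String> parents = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // currently open elements
        try {
            XMLStreamReader xsr = XMLInputFactory.newFactory()
                    .createXMLStreamReader(new StringReader(xml));
            while (xsr.hasNext()) {
                int event = xsr.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if (childName.equals(xsr.getLocalName())) {
                        parents.add(open.isEmpty() ? "" : open.peek());
                    }
                    open.push(xsr.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    open.pop();
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return parents;
    }

    public static void main(String[] args) {
        System.out.println(parentsOf(SAMPLE, "Table"));
    }
}
```

For the sample, the reported parent is diffgram rather than NewDataSet, which is the structure the schema does not describe.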

Removing the extra element

You can create a filtered XMLStreamReader that drops the extra element, and then unmarshal from it.

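import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.stream.StreamSource;

XMLInputFactory xif = XMLInputFactory.newFactory();
StreamSource xml = new StreamSource("src/forum14234091/input.xml");
XMLStreamReader xsr = xif.createXMLStreamReader(xml);

// Wrap the reader so that start/end tags in the diffgram namespace are skipped.
xsr = xif.createFilteredReader(xsr, new StreamFilter() {
    @Override
    public boolean accept(XMLStreamReader xsr) {
        if(xsr.isStartElement() || xsr.isEndElement()) {
            // Table, AUTHOR, etc. are in no namespace, so they pass through.
            return !"urn:schemas-microsoft-com:xml-diffgram-v1".equals(xsr.getNamespaceURI());
        }
        return true;
    }
});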
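To check what the filter actually lets through without involving JAXB, the sketch below runs the same kind of StreamFilter over an in-memory string and collects the element names the filtered reader still reports. The class name DiffgramFilterDemo, the method elementNames and the one-row SAMPLE string are assumptions made for illustration; only the filter condition is taken from the snippet above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; only the filter condition comes from the answer.
class DiffgramFilterDemo {

    static final String DIFFGR_NS = "urn:schemas-microsoft-com:xml-diffgram-v1";

    // Trimmed copy of the question's document (one row only).
    static final String SAMPLE =
        "<NewDataSet xmlns=\"\">"
        + "<diffgr:diffgram xmlns:msdata=\"urn:schemas-microsoft-com:xml-msdata\""
        + " xmlns:diffgr=\"" + DIFFGR_NS + "\">"
        + "<Table diffgr:id=\"Table1\" msdata:rowOrder=\"0\">"
        + "<AUTHOR>Kubo Tite</AUTHOR><TITLE>Bleach</TITLE><ISBN>1234456</ISBN>"
        + "</Table>"
        + "</diffgr:diffgram>"
        + "</NewDataSet>";

    // Returns the local names of all start tags that survive the filter.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLInputFactory xif = XMLInputFactory.newFactory();
            XMLStreamReader raw = xif.createXMLStreamReader(new StringReader(xml));
            XMLStreamReader xsr = xif.createFilteredReader(raw, new StreamFilter() {
                @Override
                public boolean accept(XMLStreamReader r) {
                    if (r.isStartElement() || r.isEndElement()) {
                        // Reject only tags whose element is in the diffgram namespace.
                        return !DIFFGR_NS.equals(r.getNamespaceURI());
                    }
                    return true; // text, document start/end, etc. pass through
                }
            });
            while (xsr.hasNext()) {
                if (xsr.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(xsr.getLocalName());
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames(SAMPLE));
    }
}
```

With the sample input, the collected names are NewDataSet, Table, AUTHOR, TITLE and ISBN. The diffgram wrapper is gone, while Table is kept even though it carries diffgr-prefixed attributes, because the filter looks only at the namespace of the element itself.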

Creating the JAXBContext

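When you generate a JAXB model from an XML schema, you should create the JAXBContext on the package name of the generated classes rather than on the root class. This ensures that all of the generated artifacts are processed.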

JAXBContext jb=JAXBContext.newInstance("generated");

Unmarshaller unmarshaller=jb.createUnmarshaller();
NewDataSet newDataSet=(NewDataSet)unmarshaller.unmarshal(xsr);
for(Table t: newDataSet.getTable()){
    System.out.println(t.getTITLE());
}

Changing the package name

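By default, the package name of the generated classes is derived from the targetNamespace; if the schema has none, the package is named generated. You can also specify the package name when invoking XJC: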

xjc -p com.example.foo schema.xsd