JAXB - parsing an XML list sequentially (one element at a time)

Date: 2017-08-17 08:48:45

Tags: java xml jaxb

I have to parse an XML document whose root element is a list:

<SomeObjectsCollection>
  <SomeObject><!-- content1 --></SomeObject>
  <SomeObject><!-- content2 --></SomeObject>
  <SomeObject><!-- content3 --></SomeObject>
    <!-- and hundreds more -->
</SomeObjectsCollection>

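For performance reasons, I don't want to parse the whole XML document into memory. I would rather write something like an Iterable<SomeObjectType> that doesn't force the user to hold the entire unmarshalled list in memory and lets them process the elements one at a time.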

So far I have written a class that implements Iterable<SomeObjectType>, together with my own Iterator:

public class SomeObjectsIterableParser implements Iterable<SomeObjectType> {

  private final Unmarshaller jaxbUnmarshaller;
  private final XMLStreamReader xmlReader;

  public SomeObjectsIterableParser(Schema schema, java.io.Reader xmlStringReader) throws ExtractorException {
    try {
      jaxbUnmarshaller = JAXBContext.newInstance(SomeObjectType.class).createUnmarshaller();
      xmlReader = XMLInputFactory.newFactory().createXMLStreamReader(xmlStringReader);
    } catch (JAXBException | XMLStreamException e) {
      throw new ExtractorException("Could not create jaxbUnmarshaller", e);
    }
    jaxbUnmarshaller.setSchema(schema); //turns on schema validation

    //Move reader to first occurrence of SomeObject - really necessary?
    try {
      while (xmlReader.hasNext()) {
        if (!xmlReader.isStartElement() || !xmlReader.getLocalName().equals("SomeObject"))
          xmlReader.next();
        else break;
      }
    } catch (XMLStreamException e) {
      e.printStackTrace();
    }

  }

  @Override
  public Iterator<SomeObjectType> iterator() {
    return new MyIterator();
  }

  class MyIterator implements Iterator<SomeObjectType> {

    @Override
    public boolean hasNext() {
      try {
        return xmlReader.hasNext();
      } catch (XMLStreamException e) {
        throw new RuntimeException(e);
      }
    }

    @Override
    public SomeObjectType next() {
      try {
        return (SomeObjectType) jaxbUnmarshaller.unmarshal(xmlReader);
      } catch (JAXBException e) { // unmarshal(XMLStreamReader) declares only JAXBException
        throw new RuntimeException(e);
      }
    }

    @Override
    public void remove() {
      throw new UnsupportedOperationException("Not supported yet");
    }
  }
}

I get an exception in the next() method with the message: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 17; cvc-elt.1: Cannot find the declaration of element 'SomeObject'.

What am I doing wrong?

1 answer:

Answer 0: (score: 0)

I found the article, which describes what I wanted to do. As a result I wrote the following code:

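import java.io.Reader;
import java.util.Iterator;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBElement;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.validation.Schema;

public class SomeObjectsIterableParser implements Iterable<SomeObjectType> {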

  private final Unmarshaller jaxbUnmarshaller;
  private final XMLStreamReader xmlReader;

  public SomeObjectsIterableParser(Schema schema, Reader SomeObjectResponse) throws ExtractorException {
    try {
      jaxbUnmarshaller = JAXBContext.newInstance(SomeObjectType.class).createUnmarshaller();
      xmlReader = XMLInputFactory.newFactory().createXMLStreamReader(SomeObjectResponse);
    } catch (JAXBException | XMLStreamException e) {
      throw new ExtractorException("Could not create jaxbUnmarshaller", e);
    }
    //jaxbUnmarshaller.setSchema(schema); //schema can handle only root element
    advanceReaderToFirstProfile();
  }

  private void advanceReaderToFirstProfile() {
    try {
      xmlReader.nextTag();
      while(!xmlReader.getLocalName().equals("SomeObject")) {
        xmlReader.nextTag();
      }
    } catch (XMLStreamException e) {
      e.printStackTrace();
    }
  }

  @Override
  public Iterator<SomeObjectType> iterator() {
    return new MyIterator();
  }

  class MyIterator implements Iterator<SomeObjectType> {

    @Override
    public boolean hasNext() {
      try {
        if (xmlReader.isWhiteSpace() && xmlReader.hasNext()) {
          //skip whitespace between elements
          xmlReader.nextTag();
        }
      } catch (XMLStreamException e) {
        throw new RuntimeException(e);
      }
      return xmlReader.isStartElement() 
          && xmlReader.getLocalName().equals("SomeObject");
    }

    @Override
    public SomeObjectType next() {
      try {
        JAXBElement<SomeObjectType> element = jaxbUnmarshaller.unmarshal(xmlReader, SomeObjectType.class);
        return element.getValue();
      } catch (JAXBException e) { // unmarshal(XMLStreamReader, Class) declares only JAXBException
        throw new RuntimeException(e);
      }
    }

    @Override
    public void remove() {
      throw new UnsupportedOperationException("Not supported yet");
    }
  }
}
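
The cursor handling in the iterator above can be tried out without JAXB on the classpath. The following is a minimal sketch under stated assumptions, not the answer's implementation: the class name SomeObjectTextIterator and the XmlIterationException wrapper are hypothetical, and getElementText() stands in for the JAXB unmarshal call so that the code runs on a plain JDK. One difference to keep in mind: getElementText() leaves the cursor on the element's end tag, whereas Unmarshaller.unmarshal(XMLStreamReader, Class) leaves it on the token right after the end tag, which is why the answer's hasNext() has to skip whitespace first.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical StAX-only stand-in for the JAXB-based iterator.
class SomeObjectTextIterator implements Iterator<String> {

  /** Hypothetical unchecked wrapper, so callers can still catch parse failures specifically. */
  static class XmlIterationException extends RuntimeException {
    XmlIterationException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  private final XMLStreamReader reader;

  SomeObjectTextIterator(String xml) {
    try {
      reader = XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
      reader.nextTag(); // START_DOCUMENT -> start tag of the root element
      reader.nextTag(); // -> first child start tag, or the root end tag if the list is empty
    } catch (XMLStreamException e) {
      // Fail fast instead of printing the stack trace and continuing with a broken reader.
      throw new XmlIterationException("Could not position reader on the first element", e);
    }
  }

  @Override
  public boolean hasNext() {
    // No side effects here, so calling hasNext() repeatedly is safe.
    return reader.isStartElement() && "SomeObject".equals(reader.getLocalName());
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException("No more SomeObject elements");
    }
    try {
      String text = reader.getElementText(); // cursor is now on </SomeObject>
      reader.nextTag(); // skips whitespace and comments up to the next start or end tag
      return text;
    } catch (XMLStreamException e) {
      throw new XmlIterationException(
          "Failed to read SomeObject near line " + reader.getLocation().getLineNumber(), e);
    }
  }
}
```

Constructing the iterator over a document with two SomeObject elements and looping with hasNext()/next() yields their text content in document order; a further next() call throws NoSuchElementException, as the Iterator contract expects.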

Note three things:

  1. Schema validation cannot be applied to a non-root element. There is a related question here
  2. Use the syntax:
    JAXBElement<SomeObjectType> element = jaxbUnmarshaller.unmarshal(xmlReader, SomeObjectType.class);
    instead of (SomeObjectType) jaxbUnmarshaller.unmarshal(xmlReader);
    otherwise you will get an exception:

    java.lang.RuntimeException: javax.xml.bind.UnmarshalException
     - with linked exception:
    [com.sun.istack.internal.SAXParseException2; lineNumber: 2; columnNumber: 22; unexpected element (uri:"", local:"SomeObject"). Expected elements are (none)]
  3. Handle exceptions more carefully than the example does. The Iterable and Iterator interfaces do not allow you to throw checked exceptions.
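
On point 1: if validation is still wanted, one possible workaround (my assumption, not something taken from the linked question) is to validate the whole document in a separate pass with javax.xml.validation.Validator before streaming it, at the cost of reading the input twice. The sketch below uses a hypothetical XSD in which only SomeObjectsCollection is declared globally. Under that assumption, validating a bare SomeObject fragment fails in the same way as the cvc-elt.1 error in the question, while the full document passes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates a complete document in its own pass.
class WholeDocumentValidation {

  // Hypothetical schema: SomeObject is a local element, only the root is global.
  static final String XSD =
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='SomeObjectsCollection'>"
          + "<xs:complexType><xs:sequence>"
          + "<xs:element name='SomeObject' type='xs:string'"
          + " minOccurs='0' maxOccurs='unbounded'/>"
          + "</xs:sequence></xs:complexType>"
          + "</xs:element>"
          + "</xs:schema>";

  /** Returns true if the XML is valid against XSD, false if validation reports an error. */
  static boolean isValid(String xml) {
    try {
      Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
          .newSchema(new StreamSource(new StringReader(XSD)));
      Validator validator = schema.newValidator();
      // With no ErrorHandler set, validation errors are thrown as SAXException.
      validator.validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the Validator reads the source as a stream, this pass does not need to hold the whole list in memory either; the trade-off is that the input has to be readable twice (for example a file, rather than a one-shot network stream).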