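How to generate CDATA blocks using JAXB?

Date: 2010-06-28 21:45:40

Tags: java xml jaxb cdata

I am using JAXB to serialize my data to XML. The class code is simple and is shown below. I want to produce XML that contains CDATA blocks for the values of some Args. For example, the current code produces this XML: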

<command>
   <args>
      <arg name="test_id">1234</arg>
      <arg name="source">&lt;html>EMAIL&lt;/html></arg>
   </args>
</command>

I want to wrap the "source" arg in CDATA so that it looks like this:

<command>
   <args>
      <arg name="test_id">1234</arg>
      <arg name="source"><![CDATA[<html>EMAIL</html>]]></arg>
   </args>
</command>

How can I achieve this in the code below?

@XmlRootElement(name="command")
public class Command {

    @XmlElementWrapper(name="args")
    protected List<Arg> arg;
}

@XmlRootElement(name="arg")
public class Arg {

    @XmlAttribute
    public String name;

    @XmlValue
    public String value;

    public Arg() {}

    static Arg make(final String name, final String value) {
        Arg a = new Arg();
        a.name = name;
        a.value = value;
        return a;
    }
}

10 answers:

Answer 0 (score: 27)

Note: I'm the EclipseLink JAXB (MOXy) lead and a member of the JAXB (JSR-222) expert group.

If you are using MOXy as your JAXB provider, you can leverage the @XmlCDATA extension:

package blog.cdata;

import javax.xml.bind.annotation.XmlRootElement;
import org.eclipse.persistence.oxm.annotations.XmlCDATA;

@XmlRootElement(name="c")
public class Customer {

   private String bio;

   @XmlCDATA
   public void setBio(String bio) {
      this.bio = bio;
   }

   public String getBio() {
      return bio;
   }

}


Answer 1 (score: 20)

Use JAXB's Marshaller#marshal(ContentHandler) to marshal into a ContentHandler object. Simply override the characters method of the ContentHandler implementation you are using (e.g. JDOM's SAXHandler, Apache's XMLSerializer, etc.):

public class CDataContentHandler extends (SAXHandler|XMLSerializer|Other...) {
    // see http://www.w3.org/TR/xml/#syntax
    private static final Pattern XML_CHARS = Pattern.compile("[<>&]");

    public void characters(char[] ch, int start, int length) throws SAXException {
        boolean useCData = XML_CHARS.matcher(new String(ch,start,length)).find();
        if (useCData) super.startCDATA();
        super.characters(ch, start, length);
        if (useCData) super.endCDATA();
    }
}

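This is better than using the XMLSerializer.setCDataElements(...) method because you don't have to hard-code a list of elements. CDATA blocks are output automatically, and only when they are needed.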
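The superclass in the snippet above is a placeholder. As a minimal sketch of the same idea using only JDK classes, the handler below wraps a javax.xml.transform.sax.TransformerHandler (which is both a ContentHandler and a LexicalHandler) instead of JDOM's SAXHandler or Apache's XMLSerializer. The class name CDataSaxSketch and the demo method are hypothetical, and the SAX events are fired by hand here; with JAXB one would presumably pass the handler to marshaller.marshal(object, handler) instead.

```java
import java.io.StringWriter;
import java.util.regex.Pattern;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical name: forwards SAX events to a JDK serializer and decides per text run
// whether to emit a CDATA section.
class CDataSaxSketch extends XMLFilterImpl {
    private static final Pattern XML_CHARS = Pattern.compile("[<>&]");
    private final TransformerHandler serializer;

    CDataSaxSketch(TransformerHandler serializer) {
        this.serializer = serializer;
        setContentHandler(serializer); // all ContentHandler events are forwarded
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        boolean useCData = XML_CHARS.matcher(new String(ch, start, length)).find();
        if (useCData) serializer.startCDATA();
        super.characters(ch, start, length);
        if (useCData) serializer.endCDATA();
    }

    // Serializes <arg>text</arg> by firing the SAX events manually (stand-in for a marshaller).
    static String demo(String text) {
        try {
            SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler th = factory.newTransformerHandler();
            th.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            th.setResult(new StreamResult(out));

            CDataSaxSketch handler = new CDataSaxSketch(th);
            handler.startDocument();
            handler.startElement("", "arg", "arg", new AttributesImpl());
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "arg", "arg");
            handler.endDocument();
            return out.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("<html>EMAIL</html>"));
        System.out.println(demo("1234"));
    }
}
```

One caveat of this design: a SAX source may deliver a single text node in several characters calls, in which case each chunk is tested and wrapped on its own.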

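Answer 2 (score: 16)

Review of the solutions:

  • fred's answer is only a workaround. It fails when the Marshaller is linked to a Schema and the content is validated, because you only modify a string literal and do not create a real CDATA section. So if you merely rewrite the string foo as <![CDATA[foo]]>, Xerces sees a string of length 15 instead of 3.
  • The MOXy solution is implementation-specific and does not work with the JDK classes alone.
  • The getSerializer solution references the deprecated XMLSerializer class.
  • The LSSerializer solution is just a pain.

I modified a2ndrade's solution to use an XMLStreamWriter implementation. This solution works very well.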

XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLStreamWriter streamWriter = xof.createXMLStreamWriter( System.out );
CDataXMLStreamWriter cdataStreamWriter = new CDataXMLStreamWriter( streamWriter );
marshaller.marshal( jaxbElement, cdataStreamWriter );
cdataStreamWriter.flush();
cdataStreamWriter.close();

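Here is the CDataXMLStreamWriter implementation. The delegating class (DelegatingXMLStreamWriter) simply delegates all method calls to the given XMLStreamWriter implementation.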

import java.util.regex.Pattern;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/**
 * Implementation which is able to decide to use a CDATA section for a string.
 */
public class CDataXMLStreamWriter extends DelegatingXMLStreamWriter
{
   private static final Pattern XML_CHARS = Pattern.compile( "[&<>]" );

   public CDataXMLStreamWriter( XMLStreamWriter del )
   {
      super( del );
   }

   @Override
   public void writeCharacters( String text ) throws XMLStreamException
   {
      boolean useCData = XML_CHARS.matcher( text ).find();
      if( useCData )
      {
         super.writeCData( text );
      }
      else
      {
         super.writeCharacters( text );
      }
   }
}
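To see the effect of that writeCharacters decision in isolation, here is a small sketch that uses only the JDK's StAX API and no JAXB. The class CDataStaxSketch and its methods are hypothetical; it writes a single arg element by hand and applies the same regular-expression test to choose between writeCData and writeCharacters.

```java
import java.io.StringWriter;
import java.util.regex.Pattern;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class: same CDATA decision as CDataXMLStreamWriter, without a marshaller.
class CDataStaxSketch {
    private static final Pattern XML_CHARS = Pattern.compile("[&<>]");

    // Writes text as a CDATA section only if it contains characters that would be escaped.
    static void writeText(XMLStreamWriter writer, String text) throws XMLStreamException {
        if (XML_CHARS.matcher(text).find()) {
            writer.writeCData(text);
        } else {
            writer.writeCharacters(text);
        }
    }

    // Renders <arg name="...">value</arg> into a string.
    static String render(String name, String value) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("arg");
            writer.writeAttribute("name", name);
            writeText(writer, value);
            writer.writeEndElement();
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("test_id", "1234"));
        System.out.println(render("source", "<html>EMAIL</html>"));
    }
}
```

Note that neither this sketch nor the class above guards against text that itself contains "]]>", which cannot appear inside a single CDATA section.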

Answer 3 (score: 10)

Here is the code sample from the site referenced above:

import java.io.File;
import java.io.StringWriter;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;

public class JaxbCDATASample {

    public static void main(String[] args) throws Exception {
        // unmarshal a doc
        JAXBContext jc = JAXBContext.newInstance("...");
        Unmarshaller u = jc.createUnmarshaller();
        Object o = u.unmarshal(...);

        // create a JAXB marshaller
        Marshaller m = jc.createMarshaller();

        // get an Apache XMLSerializer configured to generate CDATA
        XMLSerializer serializer = getXMLSerializer();

        // marshal using the Apache XMLSerializer
        m.marshal(o, serializer.asContentHandler());
    }

    private static XMLSerializer getXMLSerializer() {
        // configure an OutputFormat to handle CDATA
        OutputFormat of = new OutputFormat();

        // specify which of your elements you want to be handled as CDATA.
        // The use of the '^' between the namespaceURI and the localname
        // seems to be an implementation detail of the xerces code.
        // When processing xml that doesn't use namespaces, simply omit the
        // namespace prefix as shown in the third CDataElement below.
        of.setCDataElements(
            new String[] { "ns1^foo",   // <ns1:foo>
                   "ns2^bar",   // <ns2:bar>
                   "^baz" });   // <baz>

        // set any other options you'd like
        of.setPreserveSpace(true);
        of.setIndenting(true);

        // create the serializer
        XMLSerializer serializer = new XMLSerializer(of);
        serializer.setOutputByteStream(System.out);

        return serializer;
    }
}

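Answer 4 (score: 9)

For the same reasons as Michael Ernst, I wasn't happy with most of the answers here. I couldn't use his solution because my requirement was to put CDATA tags around a defined set of fields, as in raiglstorfer's OutputFormat solution.

My solution is to marshal to a DOM document and then perform a null XSL transform to produce the output. A Transformer lets you set which elements are wrapped in CDATA tags.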

Document document = ...
jaxbMarshaller.marshal(jaxbObject, document);

Transformer nullTransformer = TransformerFactory.newInstance().newTransformer();
nullTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
nullTransformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "myElement {myNamespace}myOtherElement");
nullTransformer.transform(new DOMSource(document), new StreamResult(writer/stream));

More details here: http://javacoalface.blogspot.co.uk/2012/09/outputting-cdata-sections-with-jaxb.html
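The transform step can be tried without JAXB. The sketch below builds a small DOM document by hand (standing in for the output of jaxbMarshaller.marshal(jaxbObject, document)) and serializes it with an identity Transformer. The class name CDataTransformerSketch and the element names command, id and source are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: identity transform with CDATA_SECTION_ELEMENTS.
class CDataTransformerSketch {

    static String serialize(String sourceText) {
        try {
            // Hand-built DOM; with JAXB this document would be filled by the marshaller.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element command = doc.createElement("command");
            doc.appendChild(command);
            Element id = doc.createElement("id");
            id.setTextContent("1234");
            command.appendChild(id);
            Element source = doc.createElement("source");
            source.setTextContent(sourceText);
            command.appendChild(source);

            // Identity ("null") transform; only <source> is written as a CDATA section.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "source");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("<html>EMAIL</html>"));
    }
}
```

Because the property selects elements by name, every element called source is wrapped, whether or not its text contains markup. In the question's XML all values share the element name arg, so this approach alone could not wrap only the "source" argument.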

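Answer 5 (score: 5)

The following simple approach adds CDATA support to JAX-B, which does not natively support CDATA:

  1. Declare a custom simple type CDataString that extends string, to identify the fields that should be handled via CDATA
  2. Create a custom CDataAdapter that parses and prints the content of a CDataString
  3. Use JAXB bindings to link CDataString and CDataAdapter. The CDataAdapter will add/remove CDATA to/from CDataStrings at marshal/unmarshal time
  4. Declare a custom character escape handler that does not escape characters when printing CDATA strings, and set it as the Marshaller's CharacterEscapeEncoder
  5. Et voilà, any CDataString element will be encapsulated at marshal time. At unmarshal time, the CDATA wrapper is removed automatically.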
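The answer gives no code for the adapter, so the following is only a sketch of the string handling that step 3 describes, written as plain static methods that a CDataAdapter's marshal and unmarshal methods could delegate to. The class CDataStrings and its method names are hypothetical. Splitting an embedded "]]>" across two sections is an addition not mentioned in the answer, included so the wrapped text stays well-formed.

```java
// Hypothetical helper: the add/remove logic a CDataAdapter might use.
class CDataStrings {
    private static final String OPEN = "<![CDATA[";
    private static final String CLOSE = "]]>";
    // An embedded "]]>" is split so that "]]" ends one section and ">" starts the next.
    private static final String SPLIT = "]]" + CLOSE + OPEN + ">";

    // Marshal direction: wrap the raw text in a CDATA section.
    static String wrap(String text) {
        return OPEN + text.replace(CLOSE, SPLIT) + CLOSE;
    }

    // Unmarshal direction: strip the wrapper if present, otherwise return the text unchanged.
    static String unwrap(String text) {
        if (text.startsWith(OPEN) && text.endsWith(CLOSE)) {
            String inner = text.substring(OPEN.length(), text.length() - CLOSE.length());
            return inner.replace(SPLIT, CLOSE);
        }
        return text;
    }

    public static void main(String[] args) {
        String wrapped = wrap("<html>EMAIL</html>");
        System.out.println(wrapped);
        System.out.println(unwrap(wrapped));
    }
}
```

The wrapped string only survives marshalling intact if the escape handler from step 4 leaves it alone; otherwise the markers themselves would be escaped.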

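Answer 6 (score: 4)

To supplement @a2ndrade's answer.

I found a class in JDK 8 that can be extended. Note, however, that the class lives in a com.sun package. You may want to keep a copy of its code in case the class is removed from a future JDK.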

public class CDataContentHandler extends com.sun.xml.internal.txw2.output.XMLWriter {
  public CDataContentHandler(Writer writer, String encoding) throws IOException {
    super(writer, encoding);
  }

  // see http://www.w3.org/TR/xml/#syntax
  private static final Pattern XML_CHARS = Pattern.compile("[<>&]");

  public void characters(char[] ch, int start, int length) throws SAXException {
    boolean useCData = XML_CHARS.matcher(new String(ch, start, length)).find();
    if (useCData) {
      super.startCDATA();
    }
    super.characters(ch, start, length);
    if (useCData) {
      super.endCDATA();
    }
  }
}

Usage:

  JAXBContext jaxbContext = JAXBContext.newInstance(...class);
  Marshaller marshaller = jaxbContext.createMarshaller();
  StringWriter sw = new StringWriter();
  CDataContentHandler cdataHandler = new CDataContentHandler(sw,"utf-8");
  marshaller.marshal(gu, cdataHandler);
  System.out.println(sw.toString());

Sample result:

<?xml version="1.0" encoding="utf-8"?>
<genericUser>
  <password><![CDATA[dskfj>><<]]></password>
  <username>UNKNOWN::UNKNOWN</username>
  <properties>
    <prop2>v2</prop2>
    <prop1><![CDATA[v1><]]></prop1>
  </properties>
  <timestamp/>
  <uuid>cb8cbc487ee542ec83e934e7702b9d26</uuid>
</genericUser>

Answer 7 (score: 2)

Answer 8 (score: 0)

The following code prevents the CDATA elements from being escaped:

Marshaller marshaller = context.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

StringWriter stringWriter = new StringWriter();
PrintWriter printWriter = new PrintWriter(stringWriter);
DataWriter dataWriter = new DataWriter(printWriter, "UTF-8", new CharacterEscapeHandler() {
    @Override
    public void escape(char[] buf, int start, int len, boolean b, Writer out) throws IOException {
        out.write(buf, start, len);
    }
});

marshaller.marshal(data, dataWriter);

System.out.println(stringWriter.toString());

It also keeps UTF-8 as your encoding.

Answer 9 (score: 0)

Just a word of warning: according to the documentation of javax.xml.transform.Transformer.setOutputProperty(...), you should use the qualified-name syntax when indicating an element from another namespace. According to the JavaDoc (Java 1.6 rt.jar):

"(...) For example, if a URI and local name were obtained from an element defined with <xyz:foo xmlns:xyz="http://xyz.foo.com/yada/baz.html"/>, then the qualified name would be "{http://xyz.foo.com/yada/baz.html}foo". Note that no prefix is used."

Well, this doesn't work. The implementing class in the Java 1.6 rt.jar, namely com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl, only interprets elements from a different namespace correctly when they are declared as "http://xyz.foo.com/yada/baz.html:foo", because the implementation parses the value by looking for the last colon. So instead of calling:

transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "{http://xyz.foo.com/yada/baz.html}foo")

which should work according to the JavaDoc but ends up being parsed as "http" and "//xyz.foo.com/yada/baz.html", you have to call

transformer.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "http://xyz.foo.com/yada/baz.html:foo")

At least in Java 1.6.