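Escaping XML characters without escaping CDATA markers

Time: 2016-08-11 14:42:47

Tags: java jaxb

I am trying to use JAXB to marshal a class that has some CDATA fields as well as some fields containing special characters that need to be escaped (including < and >). The problem is that I cannot get the escape handling to work correctly for both cases at once.

With a custom CDATA adapter, if I set the following property on the marshaller,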

jaxbMarshaller.setProperty(CharacterEscapeHandler.class.getName(),
        (CharacterEscapeHandler) (ch, start, length, isAttVal, out) -> out.write(ch, start, length));

I get:

<key1><![CDATA[Test]]></key1> # What I want
<key2>some_>_value</key2>     # Invalid XML

If I remove the property and let JAXB handle the escaping itself, I get:

<key1>&lt;![CDATA[Test]]&gt;  # Not what I want
<key2>some_&lt;_value</key2>  # What I want

But what I need is:

<key1><![CDATA[Test]]></key1>
<key2>some_&lt;_value</key2>

Is there a way to get this behavior with my escape handler?

1 answer:

Answer 0: (score: 0)

You can use the com.sun.xml.bind.marshaller.CharacterEscapeHandler interface to solve your problem.

Example Java class:

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class AA {

    // Fields are private; JAXB binds the public getter/setter pairs below.
    private String B;
    private String C;
    private String D;

    public String getB() {
        return B;
    }

    @XmlElement
    public void setB(String b) {
        B = b;
    }

    public String getC() {
        return C;
    }

    @XmlElement
    public void setC(String c) {
        C = c;
    }

    public String getD() {
        return D;
    }

    @XmlElement
    public void setD(String d) {
        D = d;
    }

    @Override
    public String toString() {
        return "AA [B=" + B + ", C=" + C + ", D=" + D + "]";
    }
}

Example XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<aa>
  <b>Title of Feed Item</b>
  <c>some_>_value</c>
  <d>
    <![CDATA[Test]]>
  </d>
</aa>

Marshalling call:

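// Imports needed: java.io.File, java.io.IOException, java.io.Writer,
// java.util.HashMap, java.util.Map, java.util.regex.Matcher, java.util.regex.Pattern,
// javax.xml.bind.JAXBContext, javax.xml.bind.Marshaller,
// com.sun.xml.bind.marshaller.CharacterEscapeHandler

AA ref = new AA();
ref.setB("Title of Feed Item");
ref.setC("some_>_value");
ref.setD("<![CDATA[Test]]>");
File file = new File("Test.xml");
JAXBContext jaxbContext = JAXBContext.newInstance(AA.class);
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);

// Characters to escape and their entity replacements.
// '&' is included because an unescaped ampersand also produces invalid XML.
final Map<String, String> replacement = new HashMap<String, String>();
replacement.put("&", "&amp;");
replacement.put(">", "&gt;");
replacement.put("<", "&lt;");
// Inside a character class '|' is a literal character, so it is left out.
final Pattern pattern = Pattern.compile("[&<>]");

jaxbMarshaller.setProperty(CharacterEscapeHandler.class.getName(),
        new CharacterEscapeHandler() {

            @Override
            public void escape(char[] ch, int start, int length,
                    boolean isAttVal, Writer out) throws IOException {
                // Look only at the slice being written, not the whole buffer.
                String text = new String(ch, start, length);
                if (text.startsWith("<![CDATA[") && text.endsWith("]]>")) {
                    // The value is a CDATA section: write it unescaped.
                    out.write(ch, start, length);
                } else {
                    StringBuffer buffer = new StringBuffer();
                    Matcher matcher = pattern.matcher(text);
                    while (matcher.find()) {
                        matcher.appendReplacement(buffer,
                                replacement.get(matcher.group()));
                    }
                    matcher.appendTail(buffer);
                    // The escaped text is a new string, so write all of it.
                    out.write(buffer.toString());
                }
            }
        });

jaxbMarshaller.marshal(ref, file);
jaxbMarshaller.marshal(ref, System.out);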
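
The escaping rule can also be tried on its own, without a JAXB runtime on the classpath. The following is a minimal sketch and not the exact code from this answer: the class name CdataAwareEscaper and the helper escapeToString are hypothetical, the escape method only mirrors the parameter order of CharacterEscapeHandler.escape, and it assumes that each text value reaches the handler in a single call.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper: passes CDATA sections through and escapes everything else.
final class CdataAwareEscaper {

    private static final String CDATA_START = "<![CDATA[";
    private static final String CDATA_END = "]]>";

    // Same parameter order as CharacterEscapeHandler.escape(...).
    static void escape(char[] ch, int start, int length,
                       boolean isAttVal, Writer out) throws IOException {
        String text = new String(ch, start, length);
        // Pass through only element text that is exactly one CDATA section:
        // it starts with the opening marker and its first "]]>" is at the very end.
        boolean singleCdata = !isAttVal
                && text.startsWith(CDATA_START)
                && text.indexOf(CDATA_END) == text.length() - CDATA_END.length();
        if (singleCdata) {
            out.write(ch, start, length);
            return;
        }
        // Otherwise escape the markup-significant characters one by one.
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            switch (c) {
                case '&': out.write("&amp;"); break;
                case '<': out.write("&lt;"); break;
                case '>': out.write("&gt;"); break;
                case '"':
                    if (isAttVal) {
                        out.write("&quot;");
                    } else {
                        out.write(c);
                    }
                    break;
                default: out.write(c);
            }
        }
    }

    // Convenience wrapper for trying the rule on plain strings.
    static String escapeToString(String s, boolean isAttVal) {
        StringWriter out = new StringWriter();
        try {
            char[] ch = s.toCharArray();
            escape(ch, 0, ch.length, isAttVal, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

If the JAXB reference implementation is available, a method with this signature could presumably be installed with a cast such as (CharacterEscapeHandler) CdataAwareEscaper::escape, in the same way the question installs its lambda. Note that any value that already looks like a CDATA section is written verbatim, so this only makes sense when such values come from a trusted adapter and not from user input.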

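Edit: I modified the code to use a regular expression with a bracketed character class.

For the regular expression sample, I referred to this link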
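
One detail about bracketed expressions is worth checking: inside a Java regex character class, | is an ordinary character and does not mean alternation, so [<|>] matches a pipe as well as the angle brackets, while [<>] matches only the brackets. A small sketch to confirm this (the class name CharClassDemo and the method countMatches are hypothetical, chosen for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: counts how many times a regex matches in an input string.
final class CharClassDemo {

    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }
}
```

For the input "a|b<c>", the class [<|>] finds three matches (the pipe and both brackets), while [<>] finds two. This matters when every match is looked up in a replacement map that has entries only for < and >.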