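I am using the Castor API to convert an object to XML.
I get the following exception:
Caused by: org.xml.sax.SAXException: The character '' is an invalid XML character.
I know the right way is to fix the source, but there are a lot of these invalid characters.
In another forum, someone suggested encoding the Java object content before marshalling and then decoding the output (Base64). That approach looks very cumbersome and does not fit well into the solution.
I need a way to skip over these characters during marshalling, and the generated XML should still contain the characters.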
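For reference, the Base64 workaround mentioned above might look roughly like the sketch below. `Base64FieldCodec` is a hypothetical helper, not part of Castor, and it assumes the encoding is applied per string field before marshalling and undone by whoever reads the XML.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not a Castor class): Base64 output only uses
// characters that are always legal in XML.
final class Base64FieldCodec {

    // Encode a field value before it is set on the object to be marshalled.
    static String encode(String raw) {
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a field value after it has been read back from the XML.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "bad\u0001char";                 // contains U+0001, illegal in XML 1.0
        String safe = encode(raw);
        System.out.println(safe);                     // Base64 text, safe to marshal
        System.out.println(decode(safe).equals(raw)); // round trip restores the value
    }
}
```

Every producer and consumer of the XML has to agree on which fields are encoded, which is the overhead the question describes as cumbersome.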
Answer 0 (score: 0)
/**
* This method ensures that the output String has only
* valid XML unicode characters as specified by the
* XML 1.0 standard. For reference, please see
* <a href="http://www.w3.org/TR/2000/REC-xml-20001006#NT-Char">the
* standard</a>. This method will return an empty
* String if the input is null or empty.
*
* @param in The String whose non-valid characters we want to remove.
* @return The in String, stripped of non-valid characters.
*/
public String stripNonValidXMLCharacters(String in) {
    if (in == null || in.isEmpty()) return ""; // vacancy test.
    StringBuilder out = new StringBuilder(in.length()); // Used to hold the output.
    for (int i = 0; i < in.length(); ) {
        // Read a whole code point. A Java char is only 16 bits, so with charAt()
        // the 0x10000-0x10FFFF test could never match and every supplementary
        // character (stored as a surrogate pair) would be stripped.
        int current = in.codePointAt(i);
        i += Character.charCount(current);
        if ((current == 0x9) ||
            (current == 0xA) ||
            (current == 0xD) ||
            ((current >= 0x20) && (current <= 0xD7FF)) ||
            ((current >= 0xE000) && (current <= 0xFFFD)) ||
            ((current >= 0x10000) && (current <= 0x10FFFF)))
            out.appendCodePoint(current);
    }
    return out.toString();
}
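As a compact alternative to the loop above, the same XML 1.0 `Char` filter can be sketched with a regular expression. `XmlCharFilter` and `strip` are names made up for this illustration. The sketch relies on `java.util.regex` matching by code point, so supplementary characters are kept and unpaired surrogates are removed.

```java
import java.util.regex.Pattern;

// Illustrative helper: removes every code point outside the XML 1.0 Char production.
final class XmlCharFilter {

    // Negated character class listing the ranges XML 1.0 allows.
    private static final Pattern INVALID_XML10 = Pattern.compile(
            "[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD\\x{10000}-\\x{10FFFF}]");

    static String strip(String in) {
        if (in == null || in.isEmpty()) return "";
        return INVALID_XML10.matcher(in).replaceAll("");
    }

    public static void main(String[] args) {
        // U+0001 and U+FFFE are dropped; the emoji U+1F600 (a surrogate pair) is kept.
        String dirty = "a\u0001b\uD83D\uDE00c\uFFFE";
        System.out.println(strip(dirty).equals("ab\uD83D\uDE00c")); // prints "true"
    }
}
```

Note that this changes the data: the stripped characters are gone from the marshalled XML, so it only fits cases where losing them is acceptable.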
Answer 1 (score: 0)
If you want the generated XML to contain such characters, then the XML 1.1 specification may help.
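To make the difference concrete: XML 1.1 admits the control characters U+0001 to U+001F that XML 1.0 mostly rejects, but its `RestrictedChar` set may only appear as character references, and U+0000 stays illegal. A rough sketch of those rules follows. `Xml11Chars` and its methods are hypothetical names for illustration, and the escaping is deliberately minimal.

```java
// Illustrative helper based on the Char and RestrictedChar productions of XML 1.1.
final class Xml11Chars {

    // XML 1.1 Char: everything except U+0000, surrogates, U+FFFE and U+FFFF.
    static boolean isChar(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 RestrictedChar: legal only when written as a character reference.
    static boolean isRestricted(int cp) {
        return (cp >= 0x1 && cp <= 0x8)
            || (cp >= 0xB && cp <= 0xC)
            || (cp >= 0xE && cp <= 0x1F)
            || (cp >= 0x7F && cp <= 0x84)
            || (cp >= 0x86 && cp <= 0x9F);
    }

    // Escapes element text for an XML 1.1 document.
    static String escapeText(String in) {
        StringBuilder out = new StringBuilder();
        in.codePoints().forEach(cp -> {
            if (!isChar(cp)) return; // e.g. U+0000: cannot be represented at all
            if (isRestricted(cp)) {
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("a\u0001b\u0000c")); // prints "a&#x1;bc"
    }
}
```

So switching to XML 1.1 preserves most control characters, but a U+0000 in the source data would still have to be removed or encoded.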
Castor can be configured to marshal into XML 1.1 by using custom org.exolab.castor.xml.XMLSerializerFactory and org.exolab.castor.xml.Serializer implementations:
package com.foo.castor;
......
// java.io types used by the Serializer methods below
import java.io.IOException;
import java.io.OutputStream;
import java.io.Writer;
import org.exolab.castor.xml.BaseXercesOutputFormat;
import org.exolab.castor.xml.Serializer;
import org.exolab.castor.xml.XMLSerializerFactory;
import org.xml.sax.DocumentHandler;
// JDK-internal copies of the Xerces serializer classes
// (org.apache.xml.serialize in standalone Xerces)
import com.sun.org.apache.xml.internal.serialize.OutputFormat;
import com.sun.org.apache.xml.internal.serialize.XML11Serializer;

@SuppressWarnings("deprecation")
public class CastorXml11SerializerFactory implements XMLSerializerFactory {

    // Output format backed by a plain Xerces OutputFormat.
    private static class CastorXml11OutputFormat extends BaseXercesOutputFormat {
        public CastorXml11OutputFormat() {
            super._outputFormat = new OutputFormat();
        }
    }

    // Adapter that delegates every Castor Serializer call to the XML 1.1 serializer.
    private static class CastorXml11Serializer implements Serializer {
        private XML11Serializer serializer = new XML11Serializer();

        @Override
        public void setOutputCharStream(Writer out) {
            serializer.setOutputCharStream(out);
        }

        @Override
        public DocumentHandler asDocumentHandler() throws IOException {
            return serializer.asDocumentHandler();
        }

        @Override
        public void setOutputFormat(org.exolab.castor.xml.OutputFormat format) {
            serializer.setOutputFormat((OutputFormat) format.getFormat());
        }

        @Override
        public void setOutputByteStream(OutputStream output) {
            serializer.setOutputByteStream(output);
        }
    }

    @Override
    public Serializer getSerializer() {
        return new CastorXml11Serializer();
    }

    @Override
    public org.exolab.castor.xml.OutputFormat getOutputFormat() {
        return new CastorXml11OutputFormat();
    }
}
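Then set the following, either globally in the castor.properties file:
org.exolab.castor.xml.serializer.factory=com.foo.castor.CastorXml11SerializerFactory
org.exolab.castor.xml.version=1.1
or by passing these two properties to the setCastorProperties method of your specific marshaller.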
Note, however, that XML 1.1 is not accepted by browsers, and not all XML parsers can parse XML 1.1 out of the box.
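One way to find out whether a given consumer copes with XML 1.1 is to feed its parser a document that is only well-formed under XML 1.1. The sketch below does this for whatever JAXP parser is on the classpath. `Xml11ParserCheck` is a hypothetical name, and the outcome for the 1.1 document depends on the parser in use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative probe: reports whether the default JAXP parser accepts a document.
final class Xml11ParserCheck {

    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false; // not well-formed for this parser, or version unsupported
        }
    }

    public static void main(String[] args) {
        // A reference to U+0001 is never well-formed in XML 1.0 ...
        System.out.println(parses("<?xml version=\"1.0\"?><a>&#x1;</a>")); // prints "false"
        // ... but is allowed in XML 1.1, if the parser supports that version.
        System.out.println(parses("<?xml version=\"1.1\"?><a>&#x1;</a>"));
    }
}
```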