Edit: I hard-coded the character and wrote it with the response writer, and it still comes out as Knigsberger:
response.setCharacterEncoding("UTF-8");
response.setContentType(contentType);
//if(contentType!=null)response.setHeader("Content-Type",contentType);
Writer writer = response.getWriter(); //new OutputStreamWriter(response.getOutputStream(),"UTF-8");
System.err.println("character encoding is "+response.getCharacterEncoding());
writer.write("Königsberger ");
writer.flush();
Edit: I tried calling setContentType and setCharacterEncoding before calling getWriter(), but the output is still no different:
if(res.length()>0){
    //pw.write(res);
    response.setCharacterEncoding("UTF-8");
    response.setContentType(contentType);
    //if(contentType!=null)response.setHeader("Content-Type",contentType);
    Writer writer = response.getWriter(); //new OutputStreamWriter(response.getOutputStream(),"UTF-8");
    System.err.println("character encoding is "+response.getCharacterEncoding());
    writer.write(res);
    writer.flush();
}
I am reading some German characters and then outputting XML from a Java servlet. This is how I read them as UTF-8:
int len=0;
byte[] buffer=new byte[1024];
OutputStream os = sock.getOutputStream();
InputStream is = sock.getInputStream();
query += "\r\n";
os.write(query.getBytes("UTF8")); //iso8859_1"));
do{
    len = is.read(buffer);
    if (len>0) {
        if(outstring==null) outstring=new StringBuffer();
        outstring.append(new String(buffer,0,len, "UTF8"));
    }
}while(len>0);
System.out.println(outstring);
System.out prints the string correctly: Königsberger
However, when I resend this string from my servlet response with charset=UTF-8, it comes out garbled: K nigsberger
private void outputResponse(String res, HttpServletRequest request, HttpServletResponse response) throws IOException {
    String outputFormat = getOutputFormat(request);
    String contentType=null;
    PrintWriter pw = response.getWriter();
    //response.setCharacterEncoding("UTF-8");
    System.err.println("output "+res);
    contentType= "text/xml; charset=UTF-8";
    res="<?xml version=\"1.0\" encoding=\"utf-8\"?>" + res;
    if(contentType!=null)response.setHeader("Content-Type",contentType);
    if(res.length()>0){
        pw.write(res);
    }
    pw.flush();
}
Answer 0 (score: 3)
do{
    len = is.read(buffer);
    if (len>0) {
        if(outstring==null) outstring=new StringBuffer();
        outstring.append(new String(buffer,0,len, "UTF8"));
    }
}while(len>0);
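This is not a good way to decode UTF-8, because a character can be split across a buffer boundary (details here). UTF-8 is a variable-width encoding, so a character takes between 1 and 4 bytes to store. If this has worked so far, you have been lucky. It is better to use the Reader/Writer classes (details here) for encoding and decoding.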
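To illustrate the buffer-boundary problem, here is a minimal sketch that decodes the same bytes both ways. The class name Utf8ReadDemo and the tiny chunk size of 2 are assumptions chosen only to force the two bytes of ö (c3 b6) into separate reads; a real socket can split the data at any point.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8ReadDemo {

    // Decodes each chunk independently, as the loop in the question does.
    // A multi-byte character split across two chunks is decoded as garbage.
    static String decodePerChunk(InputStream is, int chunkSize) throws IOException {
        StringBuilder out = new StringBuilder();
        byte[] buffer = new byte[chunkSize];
        int len;
        while ((len = is.read(buffer)) > 0) {
            out.append(new String(buffer, 0, len, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    // Lets an InputStreamReader do the decoding. The reader keeps incomplete
    // byte sequences until the remaining bytes arrive, so chunking is harmless.
    static String decodeWithReader(InputStream is, int chunkSize) throws IOException {
        StringBuilder out = new StringBuilder();
        Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8);
        char[] buffer = new char[chunkSize];
        int len;
        while ((len = reader.read(buffer)) != -1) {
            out.append(buffer, 0, len);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String expected = "K\u00F6nigsberger";
        byte[] data = expected.getBytes(StandardCharsets.UTF_8);

        // With a chunk size of 2 the reads are 4b c3 | b6 6e | ...,
        // so the two bytes of U+00F6 land in different chunks.
        String perChunk = decodePerChunk(new ByteArrayInputStream(data), 2);
        String viaReader = decodeWithReader(new ByteArrayInputStream(data), 2);

        System.out.println("per-chunk decoding intact: " + expected.equals(perChunk));
        System.out.println("reader decoding intact:    " + expected.equals(viaReader));
    }
}
```

With a 1024-byte buffer the split only happens when a multi-byte character straddles a read boundary, which is why the original loop can appear to work most of the time.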
I think you need to call setContentType or setCharacterEncoding before you call getWriter. I don't think calling setHeader directly is enough.
This servlet code will correctly encode the sample string and transmit it as UTF-8 data:
@Override
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    response.setContentType("text/xml; charset=UTF-8");
    PrintWriter pw = response.getWriter();
    pw.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
    pw.write("<data>K\u00F6nigsberger</data>");
    pw.flush();
    pw.close();
}
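Note that I use the escape sequence \u00F6 to emit the character U+00F6 (ö), to make sure the character is not corrupted by the text editor or during compilation (see here for more details).
Could the data be getting misinterpreted on the client? Check the output with a hex editor.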
Encoded as UTF-8, "K\u00F6nigsberger" should become the byte sequence:
4b c3 b6 6e 69 67 73 62 65 72 67 65 72
...where the character U+00F6 (ö) becomes c3 b6. You can use code like this to check your values:
import java.io.IOException;
import java.io.PrintStream;

public static void main(String[] args) throws IOException {
    String konigsberger = "K\u00F6nigsberger";
    dumpHex(System.out, konigsberger.getBytes("UTF-8"));
}

// Prints each byte as two lowercase hex digits followed by a space.
private static void dumpHex(PrintStream out, byte[] data) {
    for (byte b : data) {
        out.format("%02x ", b);
    }
    out.println();
}
Answer 1 (score: 1)
You should follow this example so that the servlet response knows which encoding to use:
response.setContentType("text/html; charset=UTF-8");
response.setCharacterEncoding("UTF-8");
ServletOutputStream out =response.getOutputStream();
out.write(output.getBytes("UTF-8"));
Answer 2 (score: 0)
You can always use entities like this:
<test>
&#228;
&#252;
&#229;
</test>
to get:
<test>
ä
ü
å
</test>
It may not be exactly what you want, but it is a decent workaround. You can use a site such as utf8-chartable.de to look up the values you need.
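If you go the entity route, the lookup can also be done in code. The following is a rough sketch, assuming a hypothetical helper named escapeNonAscii (not part of any servlet or XML API) that replaces every non-ASCII code point with a decimal numeric character reference, so the output is pure ASCII and survives any charset mix-up. It deliberately leaves &, < and > untouched, so it is meant for markup that is already well-formed.

```java
public class XmlEntityEscaper {

    // Hypothetical helper: replaces each non-ASCII code point with &#NNN;
    // and copies ASCII characters through unchanged.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint < 128) {
                out.append((char) codePoint);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            // Advance by 2 for characters outside the BMP (surrogate pairs).
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <data>K&#246;nigsberger</data>
        System.out.println(escapeNonAscii("<data>K\u00F6nigsberger</data>"));
    }
}
```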
Answer 3 (score: 0)
I ran into the same problem. I just did the following and it works fine:
// xml is the String with Unicode content. getBytes encodes the String into a
// sequence of bytes using the given charset and returns them as a byte array.
// For UTF-16 output you can call xml.getBytes("UTF-16") instead.
byte[] k = xml.getBytes(UTF8_CHARSET);
response.setContentType("text/xml");
response.setContentLength(k.length);
response.getOutputStream().write(k);
response.getOutputStream().flush();
response.getOutputStream().close();