I have a default Tomcat 7 setup, and all Java-related configuration uses UTF-8.
This does not work (UTF-8 characters come out mangled):
<%@ page language="java" pageEncoding="utf-8" contentType="text/html; charset=utf-8"%>
<%@ page import="java.net.*" %>
<%@ page import="java.io.*" %>
<%
URL target = new URL("http://en.wikipedia.org/wiki/Main_Page");
Reader input = new BufferedReader(new InputStreamReader(target.openStream()));
StringWriter buffer = new StringWriter();
char[] chrs = new char[1024 * 4];
int n = 0;
while (-1 != (n = input.read(chrs)))
{
buffer.write(chrs, 0, n);
}
StringReader reader = new StringReader(buffer.toString());
n = 0;
while (-1 != (n = reader.read(chrs)))
{
out.write(chrs, 0, n);
}
%>
This works, but logs IllegalStateExceptions:
<%@ page language="java" pageEncoding="utf-8" contentType="text/html; charset=utf-8"%>
<%@ page import="java.net.*" %>
<%@ page import="java.io.*" %>
<%
URL target = new URL("http://en.wikipedia.org/wiki/Main_Page");
Reader input = new BufferedReader(new InputStreamReader(target.openStream()));
StringWriter buffer = new StringWriter();
char[] chrs = new char[1024 * 4];
int n = 0;
while (-1 != (n = input.read(chrs)))
{
buffer.write(chrs, 0, n);
}
StringReader reader = new StringReader(buffer.toString());
OutputStreamWriter output = new OutputStreamWriter(response.getOutputStream());
n = 0;
while (-1 != (n = reader.read(chrs)))
{
output.write(chrs, 0, n);
}
%>
I have been searching but have not found an answer. Is this a bug in Tomcat, or am I missing something?
Answer 0 (score: 3)
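When you construct an InputStreamReader without specifying a charset as the second argument, the platform default encoding is used, which is often ISO-8859-1. The bytes must be decoded with the same charset that the target URL declares in its response header, which here is UTF-8.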
input = new BufferedReader(new InputStreamReader(target.openStream(), "UTF-8"));
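To see why the charset argument matters, here is a minimal sketch that decodes the same UTF-8 bytes twice, once correctly and once as ISO-8859-1. The class name CharsetDemo and the helper decode are made up for illustration, and a ByteArrayInputStream stands in for target.openStream() so the example needs no network access.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Hypothetical helper: decodes bytes through an InputStreamReader
    // with an explicit charset, the same way the JSP should read the URL.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] chunk = new char[1024];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // U+00E9 (e with acute accent) is the two bytes 0xC3 0xA9 in UTF-8.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.length()); // 1: one correctly decoded character
        System.out.println(wrong.length()); // 2: each byte became its own character
    }
}
```

Decoded as ISO-8859-1, the two bytes turn into the two characters U+00C3 and U+00A9, which is the typical mangled look of UTF-8 text read with the wrong charset.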
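The IllegalStateException occurs because you are doing this in a JSP instead of a Servlet. A JSP internally uses response.getWriter(), but your JSP scriptlet calls response.getOutputStream(). As their javadoc explains, the two cannot both be called on the same response. The double loop is also needlessly inefficient: just write straight to out (which wraps response.getWriter()) in the first loop instead of filling an intermediate buffer first.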
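The single-loop approach can be sketched as follows. The class name StreamCopy and the method copy are hypothetical; in the JSP the Writer would be out and the Reader would be the InputStreamReader created with "UTF-8", while here a StringReader and StringWriter stand in for them so the example runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class StreamCopy {

    // Hypothetical helper: copies every character from in to out in a
    // single pass, without an intermediate buffer holding the whole body.
    // Returns the number of characters copied.
    static long copy(Reader in, Writer out) throws IOException {
        char[] chunk = new char[4 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter(); // stands in for the JSP's out
        long copied = copy(new StringReader("h\u00e9llo"), sink);
        System.out.println(copied + " chars copied: " + sink);
    }
}
```

Because only one chunk is held in memory at a time, this also avoids loading the entire remote page into a String before sending anything to the client.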
Either way, this is a horrible way of proxying. Use a Servlet instead, or use the JSTL <c:import> tag.
<c:import url="http://en.wikipedia.org/wiki/Main_Page" />