Question

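We can set the default character encoding used for reading the request body with ServletContext#setRequestCharacterEncoding (available since Servlet 4.0).

I thought that the character encoding set by ServletContext#setRequestCharacterEncoding would also be used by HttpServletRequest#getReader (*).

But the reader returned by HttpServletRequest#getReader does not seem to decode characters with the encoding set by ServletContext#setRequestCharacterEncoding.

My questions are:

Why does ServletContext#setRequestCharacterEncoding have no effect on HttpServletRequest#getReader (while it does affect HttpServletRequest#getParameter)?
Is there any specification that describes this behavior of ServletContext#setRequestCharacterEncoding and HttpServletRequest#getReader?

(I read the Servlet Specification version 4.0, but I could not find anything about this behavior.)

I created a simple WAR application and tested ServletContext#setRequestCharacterEncoding.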

[Env]

Tomcat 9.0.19 (I have not changed any default configuration)
JDK 11
Windows 8.1

[index.html]

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
</head>
<body>
    <form action="/SimpleWarApp/app/simple" method="post">
        <!-- The value is Japanese character '\u3042' -->
        <input type="text" name="hello" value="あ"/>
        <input type="submit" value="submit!"/>
    </form>
    <button type="button" id="the_button">post</button>
    <script>
        document.getElementById('the_button').addEventListener('click', function() {
            var xhttp = new XMLHttpRequest();
            xhttp.open('POST', '/SimpleWarApp/app/simple');
            xhttp.setRequestHeader('Content-Type', 'text/plain');
            // The body content is Japanese character '\u3042'
            xhttp.send('あ');
        });
    </script>
</body>
</html>

[InitServletContextListener.java]

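import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener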
public class InitServletContextListener implements ServletContextListener {
    @Override
    public void contextInitialized(ServletContextEvent sce) {
        sce.getServletContext().setRequestCharacterEncoding("UTF-8");
    }
}

[SimpleServlet.java]

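import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/app/simple")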
@SuppressWarnings("serial")
public class SimpleServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        // req.setCharacterEncoding("UTF-8");
        System.out.println("requestCharacterEncoding : " + req.getServletContext().getRequestCharacterEncoding());
        System.out.println("req.getCharacterEncoding() : " + req.getCharacterEncoding());

        String hello = req.getParameter("hello");
        if (hello != null) {
            System.out.println("hello : " + hello);
        } else {
            System.out.println("body : " + req.getReader().readLine());
        }
    }
}

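I do not have any servlet filters. The three files above are all the components of this WAR application. (GitHub)

Case 1: When I submit the form with the parameter "hello", the value of "hello" is decoded successfully, as follows.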

requestCharacterEncoding : UTF-8
req.getCharacterEncoding() : UTF-8
hello : あ

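Case 2: When I click the "post" button and send text content, the request body is not decoded correctly, as follows. (I confirmed that the request body is encoded in UTF-8, i.e. the bytes are E3 81 82.)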

requestCharacterEncoding : UTF-8
req.getCharacterEncoding() : UTF-8
body : ???
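
One plausible reading of the `???` above, offered as a sketch rather than a confirmed diagnosis: the Servlet specification makes ISO-8859-1 the default request encoding when none is specified, so if the reader fell back to that default, the three UTF-8 bytes E3 81 82 would become three separate characters instead of one. The class and method names below are hypothetical and use only the JDK.

```java
import java.nio.charset.StandardCharsets;

public class BodyDecodingDemo {
    // UTF-8 bytes of U+3042, the bytes observed in the request body in Case 2.
    static final byte[] BODY = {(byte) 0xE3, (byte) 0x81, (byte) 0x82};

    // Decode the body the way a reader honoring UTF-8 would.
    static String decodeAsUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    // Decode the body the way a reader using the ISO-8859-1 default would.
    static String decodeAsLatin1(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String utf8 = decodeAsUtf8(BODY);
        String latin1 = decodeAsLatin1(BODY);
        System.out.println("UTF-8 length      : " + utf8.length());
        System.out.println("ISO-8859-1 length : " + latin1.length());
        System.out.println("UTF-8 is U+3042   : " + utf8.equals("\u3042"));
    }
}
```

With ISO-8859-1 the result has three characters, which a console that cannot render them may print as three question marks. That would be consistent with the output in this case, though the console charset is an assumption here.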

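Case 3: When I also set the encoding with HttpServletRequest#setCharacterEncoding on the first line of the servlet's doPost method (i.e. the commented-out line is enabled), the request body is decoded successfully.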

requestCharacterEncoding : UTF-8
req.getCharacterEncoding() : UTF-8
body : あ

Case 4: When I use xhttp.setRequestHeader('Content-Type', 'text/plain; charset=UTF-8'); in the JavaScript, the request body is decoded successfully.

requestCharacterEncoding : UTF-8
req.getCharacterEncoding() : UTF-8
body : あ
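
Case 4 works because the encoding now travels with the request itself, in the charset parameter of the Content-Type header. As a rough illustration of what "per request" information looks like, here is a simplified, hypothetical parser for that parameter. It is not Tomcat's code and it ignores edge cases such as semicolons inside quoted values.

```java
import java.util.Locale;

public class ContentTypeCharset {
    private static final String KEY = "charset=";

    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        // parts[0] is the media type; the remaining parts are parameters.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith(KEY)) {
                String value = param.substring(KEY.length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/plain"));                // Case 2 header
        System.out.println(charsetOf("text/plain; charset=UTF-8")); // Case 4 header
    }
}
```

For the Case 2 header the sketch finds no charset, so the container has to fall back to some other source for the encoding; for the Case 4 header it finds UTF-8.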

Case 5: When I do NOT call req.getParameter("hello"), the request body is still not decoded correctly.

requestCharacterEncoding : UTF-8
req.getCharacterEncoding() : UTF-8
body : ???

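Case 6: When I do NOT call ServletContext#setRequestCharacterEncoding in InitServletContextListener.java, no character encoding is set.

requestCharacterEncoding : null
req.getCharacterEncoding() : null
body : ???

[NOTE]

(*) I thought so because:
- (1) The Javadoc of HttpServletRequest#getReader says
  
  "The reader translates the character data according to the character encoding used on the body."
- (2) The Javadoc of HttpServletRequest#getCharacterEncoding says
  
  "Returns the name of the character encoding used in the body of this request."
- (3) The Javadoc of HttpServletRequest#getCharacterEncoding also says
  
  "The following methods for specifying the request character encoding are consulted, in decreasing order of priority: per request, per web app (using ServletContext.setRequestCharacterEncoding, deployment descriptor)".

ServletContext#setResponseCharacterEncoding works fine. When I use ServletContext#setResponseCharacterEncoding, the writer returned by HttpServletResponse#getWriter encodes the response body with the character encoding set by it.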
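
The precedence quoted in (3) can be written down as a small lookup. This is only a sketch of the documented order (per request, then per web app, then per container), with the ISO-8859-1 default from the Servlet specification as the last resort. The class, method, and parameter names are hypothetical and this is not how any particular container implements it.

```java
public class RequestEncodingResolver {
    // Default the Servlet specification prescribes when nothing is specified.
    static final String SPEC_DEFAULT = "ISO-8859-1";

    /**
     * Picks the encoding a body reader would be expected to use.
     * Each argument may be null, meaning "not specified at this level".
     */
    static String resolve(String perRequest, String perWebApp, String perContainer) {
        if (perRequest != null) {
            return perRequest;   // Content-Type charset or setCharacterEncoding
        }
        if (perWebApp != null) {
            return perWebApp;    // ServletContext#setRequestCharacterEncoding or descriptor
        }
        if (perContainer != null) {
            return perContainer; // vendor-specific container configuration
        }
        return SPEC_DEFAULT;
    }

    public static void main(String[] args) {
        // Case 2: nothing per request, UTF-8 set on the ServletContext.
        System.out.println(resolve(null, "UTF-8", null));
        // Case 6: nothing specified anywhere.
        System.out.println(resolve(null, null, null));
    }
}
```

Under this reading, Case 2 should resolve to UTF-8, which is why the `???` output there looked like a defect rather than intended behavior.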

Answer 1

This is an Apache Tomcat bug (specific to getReader()) that will be fixed from 9.0.21 onwards, thanks to your report on the Tomcat users mailing list.

For the curious, here is the fix.

Why does "ServletContext#setRequestCharacterEncoding" have no effect on "HttpServletRequest#getReader"?
