Question

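Hey, I'm working on a web application and I'm having trouble reading UTF-8 characters from a txt file. I got UTF-8 working in general by following: UTF-8 web encoding (everything works fine except on import). I have tried a lot of ideas (especially from: read UTF-8 string literal java), but nothing worked and I don't understand why.

The important code snippets: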

import.jsp

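import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;

import javax.servlet.ServletException;
import javax.servlet.annotation.MultipartConfig;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.Part;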

@WebServlet("/ImportData")
@MultipartConfig
public class ImportData extends HttpServlet {

    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
         Part filePart = request.getPart("file"); // Retrieves <input type="file" name="file">
         BufferedReader buf = new BufferedReader(new InputStreamReader(filePart.getInputStream(), StandardCharsets.UTF_8.name()));
         String lineJustFetched = null;
         String[] wordsArray = null;
         ArrayList<String> texts = new ArrayList<String>();
         while(true){
             lineJustFetched = buf.readLine();
             if(lineJustFetched == null){  
                 break; 
             }else{
                 wordsArray = lineJustFetched.split("\t");
                 for(String each : wordsArray){
                     texts.add(each);
                 }
             }
         }
         buf.close();

        System.out.println(texts);

        //create Import Data in Backend and write it into db

        response.sendRedirect("import.jsp");
    }
}

ImportData Servlet:

New-WebApplication -Name "PreviewApp" -Site "MySite" -ApplicationPool "MySite" -PhysicalPath "c:/inetpub/wwwroot/previewapp"

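System details: Tomcat 7 server with Java 1.7

When the text is printed out, a UTF-8 character shows up as a square, and in the HTML input (and text) it shows up as a � instead of the UTF-8 character.

So my question is: where and why am I losing the UTF-8 encoding?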

Answer 1

OK, I was looking in the wrong place... the file was not UTF-8 encoded (it was ANSI encoded). With a UTF-8 encoded file, this code works just fine.
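To illustrate why the � appeared, here is a minimal sketch, assuming the file contained a character such as "ä" saved by a Windows editor as Windows-1252 ("ANSI"). The class name `MisdecodeDemo` and the helper `decode` are hypothetical and only serve the illustration. In Windows-1252, "ä" is the single byte 0xE4, which is not a complete UTF-8 sequence, so a UTF-8 decoder substitutes the replacement character U+FFFD.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Hypothetical helper: decodes bytes with the given charset.
    // new String(byte[], Charset) replaces malformed input with U+FFFD.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "ä" as stored in a Windows-1252 ("ANSI") file: the single byte 0xE4.
        byte[] ansiBytes = { (byte) 0xE4 };

        // Wrong charset: 0xE4 alone is a truncated UTF-8 sequence.
        String wrong = decode(ansiBytes, StandardCharsets.UTF_8);
        // Matching charset: the byte maps back to "ä" (U+00E4).
        String right = decode(ansiBytes, Charset.forName("windows-1252"));

        System.out.println("decoded as UTF-8 contains U+FFFD: " + wrong.contains("\uFFFD"));
        System.out.println("decoded as windows-1252 is U+00E4: " + right.equals("\u00e4"));
    }
}
```

The bytes never change; only the charset handed to the decoder decides whether they turn into the intended character or into the replacement character.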

To make it work with other encodings, you only need to change the charset passed to the InputStreamReader so that the file is read correctly.

e.g.

 BufferedReader buf = new BufferedReader(new 
       InputStreamReader(filePart.getInputStream(), "Cp1252"));

(for Windows ANSI)
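If uploads can arrive in either encoding, one possible approach (not part of the original answer) is to try a strict UTF-8 decode first and fall back to Cp1252 only when the bytes are not valid UTF-8. The sketch below assumes the whole upload has already been read into a byte array; the class `FallbackDecoder` and the method `decodeUtf8OrFallback` are hypothetical names. It is a heuristic: a Cp1252 file whose bytes happen to form valid UTF-8 would still be decoded as UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FallbackDecoder {
    // Hypothetical helper: strict UTF-8 decode, falling back to the
    // given charset if the input is not well-formed UTF-8.
    static String decodeUtf8OrFallback(byte[] bytes, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of emitting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the fallback (e.g. Windows ANSI).
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        Charset ansi = Charset.forName("windows-1252");
        String line = "Gr\u00fc\u00dfe\tWelt"; // tab-separated sample line

        // The same text saved once as UTF-8 and once as Windows ANSI.
        byte[] utf8File = line.getBytes(StandardCharsets.UTF_8);
        byte[] ansiFile = line.getBytes(ansi);

        System.out.println(line.equals(decodeUtf8OrFallback(utf8File, ansi)));
        System.out.println(line.equals(decodeUtf8OrFallback(ansiFile, ansi)));
    }
}
```

In the servlet, the decoded string could then be split into lines and on "\t" as before; saving the file as UTF-8 in the first place remains the simpler fix.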
