Question

I'm trying to write a vocabulary app in Android Studio. I have a txt file that contains the vocabulary list in UTF-8 format, with lines like this:

akarui _ あかるい _ bright

The code that reads the file and adds the entries to the dictionaries looks like this:

public Map<String, String> adjectives_ej = new HashMap<String, String>();
try {
        InputStream in = am.open("adjectives_utf8.txt");
        //BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null){
            // printout first line
            if (line != ""){

                String[] parts = line.split("_");
                byte[] bytes = parts[1].getBytes("UTF-8");
                String japaneseString = new String(bytes, "UTF-8");
                Log.d("voc", japaneseString);
                adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
                adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));
            }

        }
TextView textView = new TextView(this);
textView.setText(adjectives_ej.get("bright"));
ViewGroup layout = (ViewGroup) findViewById(R.id.activity_adjectives);
layout.addView(textView);

If I try to view the output of Log.d("test", adjectives_ej.get("bright"));, I get this error message:

java.lang.RuntimeException: Unable to start activity ComponentInfo{ericwolf.genkiii/ericwolf.genkiii.Adjectives}: java.lang.NullPointerException: println needs a message

But Log.d("voc", japaneseString); gives me the correct output: 07-31 19:42:41.600 25439-25439/ericwolf.genkiii D/voc: くらい

Calling textView.setText(parts[1]); inside the while loop also works, so I don't understand the difference. Is there a problem with storing the string in the dictionary?

Answer 1

Thanks for sharing the txt file. It looks fine. It does contain a BOM, but I don't think that causes any problems.
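As a rough illustration of why the BOM is probably harmless here, the sketch below decodes a one-line sample the same way the question's code does. Java's UTF-8 decoder leaves the BOM in the text as the character U+FEFF, so it ends up at the start of the first field (the romaji), while the Japanese and English fields are unaffected. The class name BomDemo and its helper methods are hypothetical, and the sample bytes are built in memory rather than read from the real asset file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original app.
class BomDemo {
    // Builds a one-line "file": UTF-8 BOM (EF BB BF) followed by the sample entry.
    public static byte[] sampleWithBom() {
        // "\u3042\u304B\u308B\u3044" is あかるい, written with escapes so the
        // source compiles regardless of the compiler's file encoding.
        byte[] body = "akarui _ \u3042\u304B\u308B\u3044 _ bright\n"
                .getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[body.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(body, 0, out, 3, body.length);
        return out;
    }

    // Reads the first line the same way the question's code does.
    public static String firstLine(byte[] utf8Bytes) {
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = firstLine(sampleWithBom()).split("_");
        // The BOM survives decoding, but only as the first char of parts[0].
        System.out.println(parts[0].charAt(0) == '\uFEFF');
        System.out.println(parts[1].trim());
        System.out.println(parts[2].trim());
    }
}
```

If the romaji column were ever used as a map key, the leading U+FEFF on the first line would need to be stripped first; for the two maps in the question it does not matter.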

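It's one of these issues:

A font issue. Maybe the font you are using for display doesn't support the Asian character set.

More likely, it's the UTF-8 handling and the redundant decoding/encoding that follows. Instead of: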

        String[] parts = line.split("_");
        byte[] bytes = parts[1].getBytes("UTF-8");
        String japaneseString = new String(bytes, "UTF-8");
        Log.d("voc", japaneseString);
        adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
        adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));

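Recognize that BufferedReader has already decoded line from UTF-8, so there is no reason to encode it back to UTF-8 just to decode it again. We can also clean up the replaceAll calls with a simple trim.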
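To make the point about the redundant round trip concrete, here is a minimal sketch. For an ordinary, well-formed string, encoding to UTF-8 and decoding straight back produces an equal string, so the extra getBytes/new String step adds nothing. The class name RoundTripDemo and the method roundTrip are hypothetical names used only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original app.
class RoundTripDemo {
    // Encodes a String to UTF-8 bytes and decodes it straight back,
    // which is what the question's code does with parts[1].
    public static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u304F\u3089\u3044" is くらい, the value seen in the question's log.
        String japanese = "\u304F\u3089\u3044";
        // The round trip returns an equal string, so it can be dropped.
        System.out.println(roundTrip(japanese).equals(japanese));
    }
}
```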

So change the above to:

            String[] parts = line.split("_");
            String japaneseString = parts[1].trim();
            String englishString = parts[2].trim();

            Log.d("voc", japaneseString + " : " + englishString);

            adjectives_ej.put(englishString, japaneseString);
            adjectives_je.put(japaneseString, englishString );
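For reference, the same parsing logic can be tried outside Android. The sketch below is one possible arrangement, not the app's actual code: VocabParser, parse, and the two map field names are hypothetical, the input is an in-memory sample instead of the asset file, and the length check that skips blank or malformed lines is an added assumption about how such lines should be handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; names are illustrative only.
class VocabParser {
    public final Map<String, String> englishToJapanese = new HashMap<>();
    public final Map<String, String> japaneseToEnglish = new HashMap<>();

    // Parses lines of the form "romaji _ japanese _ english".
    public static VocabParser parse(Reader source) {
        VocabParser result = new VocabParser();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split("_");
                // Assumption: lines without three fields (e.g. blank lines) are skipped.
                if (parts.length < 3) {
                    continue;
                }
                // The reader already produced decoded text; trim() removes the padding.
                String japanese = parts[1].trim();
                String english = parts[2].trim();
                result.englishToJapanese.put(english, japanese);
                result.japaneseToEnglish.put(japanese, english);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // In-memory sample data: あかるい (bright) and くらい (dark), with a blank line between.
        String sample = "akarui _ \u3042\u304B\u308B\u3044 _ bright\n"
                + "\n"
                + "kurai _ \u304F\u3089\u3044 _ dark\n";
        VocabParser vocab = parse(new StringReader(sample));
        System.out.println(vocab.englishToJapanese.get("bright"));
        System.out.println(vocab.japaneseToEnglish.get("\u304F\u3089\u3044"));
    }
}
```

In the app, the Reader would presumably be the question's new InputStreamReader(am.open("adjectives_utf8.txt"), StandardCharsets.UTF_8). Note that Map.get returns null for a missing key, and passing null to Log.d is what raises "println needs a message", so a null check before logging is worthwhile.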

UTF-8 encoded Japanese characters do not show up on the Android display
