UTF-8 encoded Japanese characters do not show up on the Android display

Time: 2016-07-31 17:43:51

Tags: java android utf-8

I am trying to write a vocabulary app in Android Studio. I have a txt file that contains the vocabulary list in UTF-8 format. Each line looks like this:

akarui _ あかるい _ bright

The code that reads the file and adds the entries to the dictionary looks like this:

public Map<String, String> adjectives_ej = new HashMap<String, String>();
try {
        InputStream in = am.open("adjectives_utf8.txt");
        //BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = br.readLine()) != null){
            // printout first line
            if (line != ""){

                String[] parts = line.split("_");
                byte[] bytes = parts[1].getBytes("UTF-8");
                String japaneseString = new String(bytes, "UTF-8");
                Log.d("voc", japaneseString);
                adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
                adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));
            }

        }
TextView textView = new TextView(this);
textView.setText(adjectives_ej.get("bright"));
ViewGroup layout = (ViewGroup) findViewById(R.id.activity_adjectives);
layout.addView(textView);

If I try to look at the output of Log.d("test", adjectives_ej.get("bright")); I get this error message:

java.lang.RuntimeException: Unable to start activity ComponentInfo{ericwolf.genkiii/ericwolf.genkiii.Adjectives}: java.lang.NullPointerException: println needs a message

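Log.d("voc", japaneseString); gives me the correct output: 07-31 19:42:41.600 25439-25439/ericwolf.genkiii D/voc: くらい

Calling textView.setText(parts[1]); inside the while loop also works, so I don't understand the difference here. Is there a problem with storing the string in the dictionary?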

1 answer:

Answer 0: (score: 1)

Thanks for sharing the txt file. It looks fine. It does contain a BOM, but I don't think that causes any problems.
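To illustrate the BOM remark: Java's UTF-8 decoder does not strip a leading byte order mark, so it arrives as the character U+FEFF at the start of the first line. With the "romaji _ japanese _ english" layout, that would only touch the first field of the first line, not the Japanese or English fields used as keys and values. Here is a minimal sketch in plain Java, where `BomDemo`, `firstLine` and `stripBom` are hypothetical names used only for illustration, and the in-memory bytes stand in for the real asset file:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BomDemo {
    // Hypothetical helper: drops a leading U+FEFF if one is present.
    static String stripBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }

    // Reads the first line of UTF-8 bytes the same way the question's reader does.
    static String firstLine(byte[] utf8) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+FEFF encodes to the three BOM bytes EF BB BF in UTF-8.
        byte[] data = "\uFEFFakarui _ \u3042\u304B\u308B\u3044 _ bright\n"
                .getBytes(StandardCharsets.UTF_8);
        String[] parts = firstLine(data).split("_");
        // The BOM sticks to the first field only.
        System.out.println("BOM on first field: " + parts[0].startsWith("\uFEFF"));
        System.out.println("first field cleaned: " + stripBom(parts[0]).trim());
        System.out.println("third field: " + parts[2].trim());
    }
}
```

If the first field were ever used as a map key, stripping the BOM from the first line before splitting would be the safer choice.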

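It is one of these problems:

  1. A font problem. Maybe the font you are using for display doesn't support Asian character sets.

  2. More likely, the UTF-8 handling and the repeated decoding/encoding that follows. Instead of: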

            String[] parts = line.split("_");
            byte[] bytes = parts[1].getBytes("UTF-8");
            String japaneseString = new String(bytes, "UTF-8");
            Log.d("voc", japaneseString);
            adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
            adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));
    
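    Recognize that BufferedReader has already decoded line from UTF-8, so there is no reason to encode it back to UTF-8 just to decode it again. We can also replace the replaceAll calls with a simple trim.

    So change the above to: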

                String[] parts = line.split("_");
                String japaneseString = parts[1].trim();
                String englishString = parts[2].trim();
    
                Log.d("voc", japaneseString + " : " + englishString);
    
                adjectives_ej.put(englishString, japaneseString);
                adjectives_je.put(japaneseString, englishString);
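
Put together, the whole loop could look roughly like the sketch below. It is plain Java so that it runs outside Android. The class name `VocabParser`, the `parse` method and the two map fields are hypothetical, the in-memory stream stands in for `am.open("adjectives_utf8.txt")`, and the second sample line is made up. Two extra points are included as assumptions about what may be going wrong. The `line != ""` test in the question compares references rather than contents, so the sketch uses `isEmpty()` and a length check instead. And `Map.get` returns `null` for a missing key, which `Log.d` rejects with the "println needs a message" error, so the lookup is guarded.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class VocabParser {
    final Map<String, String> englishToJapanese = new HashMap<>();
    final Map<String, String> japaneseToEnglish = new HashMap<>();

    // Parses lines of the form "romaji _ japanese _ english" from a UTF-8 stream.
    static VocabParser parse(InputStream in) {
        VocabParser result = new VocabParser();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines (content check, not reference check)
                }
                String[] parts = line.split("_");
                if (parts.length < 3) {
                    continue; // assumption: silently skip malformed lines
                }
                // The reader already decoded the text; trim() removes the padding spaces.
                String japanese = parts[1].trim();
                String english = parts[2].trim();
                result.englishToJapanese.put(english, japanese);
                result.japaneseToEnglish.put(japanese, english);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "akarui _ \u3042\u304B\u308B\u3044 _ bright\n"
                + "\n"
                + "kurai _ \u304F\u3089\u3044 _ dark\n";
        VocabParser vocab = parse(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
        String hit = vocab.englishToJapanese.get("bright");
        // Guard against null before passing the value to Log.d or setText.
        System.out.println(hit != null ? hit : "(no entry for \"bright\")");
    }
}
```

Unlike `replaceAll(" ", "")`, `trim()` only removes leading and trailing whitespace, so a multi-word English entry would keep its inner spaces as part of the key.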