I'm trying to write a vocabulary app in Android Studio. I have a txt file that contains the vocabulary list in UTF-8 format. Each line looks like this:
akarui _ あかるい _ bright
The code that reads the file and adds the entries to the dictionary looks like this:
public Map<String, String> adjectives_ej = new HashMap<String, String>();
try {
InputStream in = am.open("adjectives_utf8.txt");
//BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
StringBuilder sb = new StringBuilder();
String line;
while ((line = br.readLine()) != null){
// printout first line
if (line != ""){
String[] parts = line.split("_");
byte[] bytes = parts[1].getBytes("UTF-8");
String japaneseString = new String(bytes, "UTF-8");
Log.d("voc", japaneseString);
adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));
}
}
TextView textView = new TextView(this);
textView.setText(adjectives_ej.get("bright"));
ViewGroup layout = (ViewGroup)
findViewById(R.id.activity_adjectives);
layout.addView(textView);
If I try to view the output of Log.d("test", adjectives_ej.get("bright")); I get this error message:
java.lang.RuntimeException: Unable to start activity ComponentInfo{ericwolf.genkiii/ericwolf.genkiii.Adjectives}: java.lang.NullPointerException: println needs a message
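But Log.d("voc", japaneseString); gives me the correct output: 07-31 19:42:41.600 25439-25439/ericwolf.genkiii D/voc: くらい
Calling textView.setText(parts[1]); inside the while loop also works. So I don't understand the difference here. Is there a problem with storing it in the dictionary?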
Answer 0 (score: 1)
Thanks for sharing the txt file. It looks fine. It does contain a BOM, but I don't think that is causing any problems.
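For context, here is a minimal sketch of where a BOM ends up when a file is read this way. It assumes the standard Java UTF-8 decoder, which leaves the BOM in place as the character U+FEFF at the start of the first line; the decoder on a particular Android version may behave differently. With the question's line format that character lands in parts[0], which the code never uses. The names BomDemo, firstLine and stripBom are hypothetical and exist only for this illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BomDemo {
    // Hypothetical helper: drops a leading U+FEFF (the decoded BOM) if present.
    static String stripBom(String s) {
        return (!s.isEmpty() && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }

    // Reads the first line of UTF-8 bytes through the same reader chain
    // as the question's code (InputStreamReader wrapped in BufferedReader).
    static String firstLine(byte[] utf8) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8))) {
            return br.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample data: the question's example line, prefixed with a BOM.
        byte[] data = "\uFEFFakarui _ あかるい _ bright\n".getBytes(StandardCharsets.UTF_8);
        String line = firstLine(data);

        // The BOM is still the first character of the decoded line.
        System.out.println(line.charAt(0) == '\uFEFF');

        // After stripping, the first field is clean; fields 1 and 2 were never affected.
        String[] parts = stripBom(line).split("_");
        System.out.println(parts[0].trim());
    }
}
```

If the first field were ever used as a map key, stripping the BOM from the first line would matter; for the second and third fields it makes no difference.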
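It is one of these issues:
A font problem. Maybe the font you are using for display doesn't support the Asian character set.
More likely, it's the UTF-8 handling and the repeated decoding/encoding that follows. Instead of: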
String[] parts = line.split("_");
byte[] bytes = parts[1].getBytes("UTF-8");
String japaneseString = new String(bytes, "UTF-8");
Log.d("voc", japaneseString);
adjectives_ej.put(parts[2].replaceAll(" ",""), new String(bytes, "UTF-8"));
adjectives_je.put(new String(bytes, "UTF-8"), parts[2].replaceAll(" ",""));
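Recognize that the BufferedReader has already decoded line from UTF-8, so there is no reason to encode it back to UTF-8 only to decode it again. We can also clean up the replaceAll calls with a simple trim.
So change the above to: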
String[] parts = line.split("_");
String japaneseString = parts[1].trim();
String englishString = parts[2].trim();
Log.d("voc", japaneseString + " : " + englishString);
adjectives_ej.put(englishString, japaneseString);
adjectives_je.put(japaneseString, englishString );
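To try the suggestion outside of Android, here is a minimal, self-contained sketch of the same parsing loop in plain Java. It is an illustration under assumptions, not the asker's actual code: the class name VocabDemo, the methods load and roundTrip, and the second sample line ("kurai _ くらい _ dark") are made up, and a StringReader stands in for the asset stream. It also shows that re-encoding and decoding a string with the same charset gives back an equal string, which is why that step can be dropped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class VocabDemo {
    final Map<String, String> adjectivesEj = new HashMap<>(); // English -> Japanese
    final Map<String, String> adjectivesJe = new HashMap<>(); // Japanese -> English

    // Encodes to UTF-8 bytes and decodes again; for well-formed text
    // this returns a string equal to the input.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Reads "romaji _ japanese _ english" lines and fills both maps.
    void load(BufferedReader br) {
        try {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split("_");
                if (parts.length < 3) {
                    continue; // skip blank or malformed lines
                }
                String japaneseString = parts[1].trim();
                String englishString = parts[2].trim();
                adjectivesEj.put(englishString, japaneseString);
                adjectivesJe.put(japaneseString, englishString);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input, including an empty line and trailing spaces.
        String sample = "akarui _ あかるい _ bright\n\nkurai _ くらい _ dark \n";

        VocabDemo demo = new VocabDemo();
        demo.load(new BufferedReader(new StringReader(sample)));

        // The round trip changes nothing, so it can be removed.
        System.out.println(roundTrip("あかるい").equals("あかるい"));

        // Lookups in both directions.
        System.out.println(demo.adjectivesEj.get("bright"));
        System.out.println(demo.adjectivesJe.get("くらい"));
    }
}
```

The parts.length check stands in for the question's line != "" test, which in Java compares object references rather than string contents, so it does not reliably filter out empty lines. Note also that Map.get returns null for a missing key, and passing null to Log.d is what produces the "println needs a message" error, so it is worth checking the lookup result before logging it.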