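Does Apache POI or java.io support non-English characters?

Time: 2015-07-27 07:58:21

Tags: java apache-poi

I am using Apache POI to read an Excel file that lists the paths of docx, doc, xls and xlsx files. I then decrypt each file's content and build a new path to read the data from.

The problem is that opening the file fails when the path contains French characters, like this: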

/Valérie/CASES.doxcs
is = new FileInputStream(path);

This line throws the following exception:

(No such file or directory)
at java.io.FileInputStream.open(Native Method)

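It works for other paths. Does this mean Apache POI does not support non-English characters, or is something else wrong? Is there any way to solve this?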

2 answers:

Answer 0 (score: 1)

Since this is an operating system issue, you can convert the path:

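static String toFileName(String name) {
    // NFKD splits each accented letter into a base letter plus combining marks;
    // removing every non-ASCII character then leaves only the base letters.
    return java.text.Normalizer.normalize(name, java.text.Normalizer.Form.NFKD)
            .replaceAll("\\P{ASCII}", ""); //.replaceAll("[\"/\\\\]", "_");
}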

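The above converts é to e, and so on: it splits each accented letter into a base letter and a combining accent mark, and the accent mark is then dropped as non-ASCII. A better transliteration may exist. Also consider Cyrillic and other scripts, whose letters this approach would strip entirely.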
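To make the decomposition visible, here is a minimal sketch that prints the code points of a string before and after normalization. The class name `AccentForms` and the helper `codePoints` are made up for illustration; the accented letter is written as a `\u00e9` escape so the result does not depend on the encoding of the source file.

```java
import java.text.Normalizer;

class AccentForms {
    // Renders every code point as U+XXXX so that combining marks become visible.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "Val\u00e9rie"; // é as the single code point U+00E9
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        // prints U+0056 U+0061 U+006C U+00E9 U+0072 U+0069 U+0065
        System.out.println(codePoints(composed));
        // prints U+0056 U+0061 U+006C U+0065 U+0301 U+0072 U+0069 U+0065
        System.out.println(codePoints(decomposed));
        // prints false: the two strings look identical but are not equal
        System.out.println(composed.equals(decomposed));
    }
}
```

Two names that render identically can therefore differ at the code point level, which is one way a path can fail to match the file it appears to name.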

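A better solution is to migrate to a Linux system that uses UTF-8. You may still want to normalize accents to one unique form, for example the shortest char sequence: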

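static String toFileName(String name) {
    // NFKC recomposes base letter + accent into a single character where one exists.
    return java.text.Normalizer.normalize(name, java.text.Normalizer.Form.NFKC);
}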
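If the mismatch really is one of normalization form (an assumption, for instance a name stored decomposed on disk but composed in the Excel cell), one possible workaround is to try the path as given and then its NFC and NFD variants. The following is only a sketch; `PathResolver` and `resolve` are hypothetical names, not part of Apache POI or the JDK.

```java
import java.io.File;
import java.text.Normalizer;

class PathResolver {
    // Returns the first variant of the path that exists on disk, or null if none does.
    static File resolve(String path) {
        String[] candidates = {
            path,                                             // exactly as read from the spreadsheet
            Normalizer.normalize(path, Normalizer.Form.NFC),  // composed: é as one code point
            Normalizer.normalize(path, Normalizer.Form.NFD)   // decomposed: e + combining accent
        };
        for (String candidate : candidates) {
            File file = new File(candidate);
            if (file.exists()) {
                return file;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File file = resolve(args.length > 0 ? args[0] : ".");
        System.out.println(file == null ? "no variant of the path exists" : "found: " + file);
    }
}
```

The returned `File` can then be passed to `new FileInputStream(file)`. This does not help when the JVM cannot encode the accented character for the operating system at all, which is the case the second answer addresses.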

Answer 1 (score: 0)

I tried everything in the linked question "How can I open files containing accents in Java?". In most situations, setting the text file encoding to UTF-8 in Eclipse (Window -> Preferences -> General -> Workspace) and adding the VM argument -Dfile.encoding=UTF-8 to the project's run configuration should already solve the problem.

But if your JDK is not Sun's and you are on a Linux system, you had better run echo $LANG to make sure it is UTF-8, and then compile and run the Java source code from the Linux command line. Problem solved. A guide to running Java code on Linux: http://www.sergiy.ca/how-to-compile-and-launch-java-code-from-command-line/
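To check these settings from inside the JVM, a small diagnostic along the following lines may help. It is a sketch, not a definitive test: `EncodingCheck` and `canEncode` are made-up names, and `sun.jnu.encoding` is an implementation-specific property that may be absent on some JVMs, so the code treats a missing value as "cannot encode".

```java
import java.nio.charset.Charset;

class EncodingCheck {
    // True when every character of text can be represented in the named charset.
    static boolean canEncode(String charsetName, String text) {
        if (charsetName == null || !Charset.isSupported(charsetName)) {
            return false;
        }
        Charset charset = Charset.forName(charsetName);
        return charset.canEncode() && charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String name = "Val\u00e9rie"; // escape keeps the source file ASCII-only
        String fileEncoding = System.getProperty("file.encoding");
        String jnuEncoding = System.getProperty("sun.jnu.encoding"); // may be null

        System.out.println("LANG             = " + System.getenv("LANG"));
        System.out.println("file.encoding    = " + fileEncoding);
        System.out.println("sun.jnu.encoding = " + jnuEncoding);
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("file.encoding can encode the name:    " + canEncode(fileEncoding, name));
        System.out.println("sun.jnu.encoding can encode the name: " + canEncode(jnuEncoding, name));
    }
}
```

If either check prints false, the accented character may be replaced before the path reaches the operating system, which would be consistent with the "No such file or directory" error in the question.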