RuleBasedCollator not sorting as expected with quoted special characters (dashes)

Asked: 2015-07-20 20:44:30

Tags: java collation

I want dashes to sort after spaces, but it doesn't work. The dashes are ignored, even though they are enclosed in quotes. Why?

  

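Text-Argument: A text-argument is any sequence of characters, excluding special characters (that is, common whitespace characters [0009-000D, 0020] and rule syntax characters [0021-002F, 003A-0040, 005B-0060, 007B-007E]). If those characters are desired, you can put them in single quotes (e.g. ampersand => '&'). Note that unquoted white space characters are ignored; e.g. b c is treated as bc.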
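To make the quoted passage concrete, here is a minimal sketch of where that quoting applies: inside a rule string passed to the RuleBasedCollator constructor, rather than inside the strings being compared. The class name QuotedRuleDemo, the helper tinyCollator, and the four-character rule string are made up for illustration only.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;

final class QuotedRuleDemo {

    // Hypothetical helper: builds a collator from a tiny rule string in which
    // the dash is a rule syntax character and therefore written in single quotes.
    static RuleBasedCollator tinyCollator() {
        try {
            // Primary order defined by this rule string: a, then b, then '-', then c.
            return new RuleBasedCollator("< a < b < '-' < c");
        } catch (ParseException e) {
            throw new IllegalStateException("Rule string could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        RuleBasedCollator collator = tinyCollator();
        // With the rule above, "b" should sort before "-", and "-" before "c".
        System.out.println(collator.compare("b", "-") < 0);
        System.out.println(collator.compare("-", "c") < 0);
    }
}
```

The strings handed to compare contain a plain dash with no quotes around it; only the rule string uses the quoted form.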

public static void sortFile() throws IOException, ParseException {
    BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream("C:\\def.txt"),
                    "Cp1252"));     

    RuleBasedCollator myCollator = (RuleBasedCollator) Collator.getInstance(Locale.US);

    List<String> lines = new ArrayList<String>();
    String line = null;
    while ((line = br.readLine()) != null) {            
        line = line.replace(" ", "' '");
        line = line.replace("\t", "'\t'");
        line = line.replace("-", "'-'");
        lines.add(line);
    }
    br.close();

    Collections.sort(lines, myCollator);

    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
            new FileOutputStream("C:\\defsorted.txt"), "Cp1252"
        ));     

    for (String line1 : lines) {                
        line1 = line1.replace("' '", " ");
        line1 = line1.replace("'\t'", "\t");
        line1 = line1.replace("'-'", "-");
        out.write(line1 + "\r\n");
    }
    out.close();
}

Input file:

bottom
bottom antiquark
bottom dollar
bottom-dweller
bottom-dwelling
bottome
bottom feeder
bottomfeeder
bottom-feeder
bottom feeders
bottomfeeders
bottom feeding

Sorted file:

bottom
bottom antiquark
bottom dollar
bottom-dweller
bottom-dwelling
bottom feeder
bottom-feeder
bottom feeders
bottom feeding
bottome
bottomfeeder
bottomfeeders

Expected result:

bottom
bottom antiquark
bottom dollar
bottom feeder
bottom feeders
bottom feeding
bottom-dweller
bottom-dwelling
bottom-feeder
bottome
bottomfeeder
bottomfeeders
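
For reference, one possible way to get an ordering like the expected result is to leave the input lines alone and extend the collator's rule string instead, so that space and dash each get their own primary weight. This is a rough sketch rather than a confirmed fix. It assumes the default US rules returned by getRules() already contain '_' and rank it ahead of digits and letters, which should be verified on the JDK in use. The class name SpaceBeforeDashSort and the helpers spaceThenDashCollator and sortLines are hypothetical.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class SpaceBeforeDashSort {

    // Hypothetical helper: appends a reset rule to the default US rules so that
    // space and dash follow '_' as separate primary entries.
    // Assumption: '_' exists in the default rules and sorts before digits and letters.
    static RuleBasedCollator spaceThenDashCollator() {
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        try {
            // Space and dash are rule syntax characters, so they are quoted here,
            // in the rule string, and nowhere else.
            return new RuleBasedCollator(base.getRules() + "& '_' < ' ' < '-'");
        } catch (ParseException e) {
            throw new IllegalStateException("Extended rules could not be parsed", e);
        }
    }

    // Returns a sorted copy; the input strings themselves are not modified.
    static List<String> sortLines(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(spaceThenDashCollator());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "bottom", "bottom antiquark", "bottom dollar", "bottom-dweller",
                "bottom-dwelling", "bottome", "bottom feeder", "bottomfeeder",
                "bottom-feeder", "bottom feeders", "bottomfeeders", "bottom feeding");
        for (String line : sortLines(input)) {
            System.out.println(line);
        }
    }
}
```

Unlike the code in the question, nothing is inserted into the lines before sorting, so no quote characters end up taking part in the comparison.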

0 answers:

No answers