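I am trying to use code that Sander Pham suggested on another question. I need to sort a Java ArrayList of string names the way Windows Explorer does. His code works for everything except one case. I would have liked to comment on that question, but I need more reputation points to comment. Anyway, he suggested writing a class that implements a custom Comparator and using it to compare the string names. Here is the code for that class: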
class IntuitiveStringComparator implements Comparator<String>
{
private String str1, str2;
private int pos1, pos2, len1, len2;
public int compare(String s1, String s2)
{
str1 = s1;
str2 = s2;
len1 = str1.length();
len2 = str2.length();
pos1 = pos2 = 0;
int result = 0;
while (result == 0 && pos1 < len1 && pos2 < len2)
{
char ch1 = str1.charAt(pos1);
char ch2 = str2.charAt(pos2);
if (Character.isDigit(ch1))
{
result = Character.isDigit(ch2) ? compareNumbers() : -1;
}
else if (Character.isLetter(ch1))
{
result = Character.isLetter(ch2) ? compareOther(true) : 1;
}
else
{
result = Character.isDigit(ch2) ? 1
: Character.isLetter(ch2) ? -1
: compareOther(false);
}
pos1++;
pos2++;
}
return result == 0 ? len1 - len2 : result;
}
private int compareNumbers()
{
// Find out where the digit sequence ends, save its length for
// later use, then skip past any leading zeroes.
int end1 = pos1 + 1;
while (end1 < len1 && Character.isDigit(str1.charAt(end1)))
{
end1++;
}
int fullLen1 = end1 - pos1;
while (pos1 < end1 && str1.charAt(pos1) == '0')
{
pos1++;
}
// Do the same for the second digit sequence.
int end2 = pos2 + 1;
while (end2 < len2 && Character.isDigit(str2.charAt(end2)))
{
end2++;
}
int fullLen2 = end2 - pos2;
while (pos2 < end2 && str2.charAt(pos2) == '0')
{
pos2++;
}
// If the remaining subsequences have different lengths,
// they can't be numerically equal.
int delta = (end1 - pos1) - (end2 - pos2);
if (delta != 0)
{
return delta;
}
// We're looking at two equal-length digit runs; a sequential
// character comparison will yield correct results.
while (pos1 < end1 && pos2 < end2)
{
delta = str1.charAt(pos1++) - str2.charAt(pos2++);
if (delta != 0)
{
return delta;
}
}
pos1--;
pos2--;
// They're numerically equal, but they may have different
// numbers of leading zeroes. A final length check will tell.
return fullLen2 - fullLen1;
}
private int compareOther(boolean isLetters)
{
char ch1 = str1.charAt(pos1);
char ch2 = str2.charAt(pos2);
if (ch1 == ch2)
{
return 0;
}
if (isLetters)
{
ch1 = Character.toUpperCase(ch1);
ch2 = Character.toUpperCase(ch2);
if (ch1 != ch2)
{
ch1 = Character.toLowerCase(ch1);
ch2 = Character.toLowerCase(ch2);
}
}
return ch1 - ch2;
}
}
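It works well except when a string name has no number after it. A name without a number is placed at the end of the list, which is wrong; it should come at the beginning.
That is, the expected order is: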
filename.jpg
filename2.jpg
filename03.jpg
filename3.jpg
Currently it sorts them as:
filename2.jpg
filename03.jpg
filename3.jpg
filename.jpg
What do I need to change in the code to correct this behavior?
Thanks.
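Answer 0 (score: 6)
This is my second attempt at answering this question. I used http://www.interact-sw.co.uk/iangblog/2007/12/13/natural-sorting as a starting point. Unfortunately, I think I found problems there as well, but I believe they are handled correctly in my code.
Info: Windows Explorer uses the API function StrCmpLogicalW() for its sorting. This is known as natural sort order.
So here is my interpretation of the Windows Explorer sort algorithm:
This list is based partly on trial and error. I increased the number of test file names to cover more of the pitfalls mentioned in the comments, and checked the results against Windows Explorer.
So here is the output: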
filename
filename 00
filename 0
filename 01
filename.jpg
filename.txt
filename00.jpg
filename00a.jpg
filename00a.txt
filename0
filename0.jpg
filename0a.txt
filename0b.jpg
filename0b1.jpg
filename0b02.jpg
filename0c.jpg
filename01.0hjh45-test.txt
filename01.0hjh46
filename01.1hjh45.txt
filename01.hjh45.txt
Filename01.jpg
Filename1.jpg
filename2.hjh45.txt
filename2.jpg
filename03.jpg
filename3.jpg
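The new comparator, WindowsExplorerComparator, splits the file names into the parts already mentioned and compares two file names part by part. To be precise, the new comparator takes strings as input, so an adapter Comparator has to be created for files, like this: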
new Comparator<File>() {
    private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();

    @Override
    public int compare(File o1, File o2) {
        // Compare by file name only, ignoring the parent path.
        return NATURAL_SORT.compare(o1.getName(), o2.getName());
    }
}
So here is the source code of the new comparator together with its test:
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class WindowsSorter {
public static void main(String args[]) {
//huge test data set ;)
List<File> filenames = Arrays.asList(new File[]{new File("Filename01.jpg"),
new File("filename"), new File("filename0"), new File("filename 0"),
new File("Filename1.jpg"), new File("filename.jpg"), new File("filename2.jpg"),
new File("filename03.jpg"), new File("filename3.jpg"), new File("filename00.jpg"),
new File("filename0.jpg"), new File("filename0b.jpg"), new File("filename0b1.jpg"),
new File("filename0b02.jpg"), new File("filename0c.jpg"), new File("filename00a.jpg"),
new File("filename.txt"), new File("filename00a.txt"), new File("filename0a.txt"),
new File("filename01.0hjh45-test.txt"), new File("filename01.0hjh46"),
new File("filename2.hjh45.txt"), new File("filename01.1hjh45.txt"),
new File("filename01.hjh45.txt"), new File("filename 01"),
new File("filename 00")});
//adaptor for comparing files
Collections.sort(filenames, new Comparator<File>() {
private final Comparator<String> NATURAL_SORT = new WindowsExplorerComparator();
@Override
public int compare(File o1, File o2) {
return NATURAL_SORT.compare(o1.getName(), o2.getName());
}
});
for (File f : filenames) {
System.out.println(f);
}
}
public static class WindowsExplorerComparator implements Comparator<String> {
private static final Pattern splitPattern = Pattern.compile("\\d+|\\.|\\s");
@Override
public int compare(String str1, String str2) {
Iterator<String> i1 = splitStringPreserveDelimiter(str1).iterator();
Iterator<String> i2 = splitStringPreserveDelimiter(str2).iterator();
while (true) {
//Up to here all parts are equal.
if (!i1.hasNext() && !i2.hasNext()) {
return 0;
}
//first has no more parts -> comes first
if (!i1.hasNext() && i2.hasNext()) {
return -1;
}
//first has more parts than i2 -> comes after
if (i1.hasNext() && !i2.hasNext()) {
return 1;
}
String data1 = i1.next();
String data2 = i2.next();
int result;
try {
//If both parts are numbers, compare them numerically
result = Long.compare(Long.parseLong(data1), Long.parseLong(data2));
//If the numbers are equal, the longer one (more leading zeros) comes first
if (result == 0) {
result = -Integer.compare(data1.length(), data2.length());
}
} catch (NumberFormatException ex) {
//compare text case insensitive
result = data1.compareToIgnoreCase(data2);
}
if (result != 0) {
return result;
}
}
}
private List<String> splitStringPreserveDelimiter(String str) {
Matcher matcher = splitPattern.matcher(str);
List<String> list = new ArrayList<String>();
int pos = 0;
while (matcher.find()) {
list.add(str.substring(pos, matcher.start()));
list.add(matcher.group());
pos = matcher.end();
}
list.add(str.substring(pos));
return list;
}
}
}
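To see why this comparator puts filename.jpg ahead of filename03.jpg, it can help to look at the token lists it compares. The following is a minimal sketch that reuses the same pattern and splitting loop as the listing above; the class name TokenSplitDemo is hypothetical and only exists for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; TokenSplitDemo is a hypothetical name.
final class TokenSplitDemo {
    // Same pattern as the answer: digit runs, dots and whitespace act as delimiters.
    private static final Pattern SPLIT = Pattern.compile("\\d+|\\.|\\s");

    // Splits a name into alternating text and delimiter tokens, keeping the delimiters.
    static List<String> split(String str) {
        Matcher matcher = SPLIT.matcher(str);
        List<String> parts = new ArrayList<>();
        int pos = 0;
        while (matcher.find()) {
            parts.add(str.substring(pos, matcher.start())); // text before the delimiter (may be empty)
            parts.add(matcher.group());                      // the delimiter itself
            pos = matcher.end();
        }
        parts.add(str.substring(pos)); // trailing text (may be empty)
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("filename.jpg"));   // [filename, ., jpg]
        System.out.println(split("filename03.jpg")); // [filename, 03, , ., jpg]
    }
}
```

The first tokens are equal, so the decision falls to the second pair, "." against "03". Because "." is not a number, the comparator falls back to the case-insensitive text comparison, and '.' sorts before '0', which places filename.jpg first.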
Answer 1 (score: 0)
Switch the signs of the -1 and the 1 in the compare method:
if (Character.isDigit(ch1))
{
    result = Character.isDigit(ch2) ? compareNumbers() : 1;
}
else if (Character.isLetter(ch1))
{
    result = Character.isLetter(ch2) ? compareOther(true) : 1;
}
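These values determine the ordering when the first string has a digit at the current position and the second does not, or when the first string does not and the second does.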
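Whichever constants are changed, the three branches of compare have to agree with one another, otherwise compare(a, b) and compare(b, a) can contradict each other. One way to keep them consistent, offered here as a sketch and not as this answer's code, is to give each character class a rank and compare ranks. The ordering used below (other characters, then digits, then letters) is an assumption derived from the expected output in the question; CharClassRank, rank and compareClasses are hypothetical names.

```java
// Illustrative sketch; CharClassRank and its methods are hypothetical names.
final class CharClassRank {
    // Lower rank sorts first: punctuation and other characters, then digits, then letters.
    static int rank(char ch) {
        if (Character.isDigit(ch)) {
            return 1;
        }
        if (Character.isLetter(ch)) {
            return 2;
        }
        return 0;
    }

    // Negative when ch1's class sorts first, positive when ch2's does, 0 for the same class.
    static int compareClasses(char ch1, char ch2) {
        return Integer.compare(rank(ch1), rank(ch2));
    }

    public static void main(String[] args) {
        // '.' versus '2' is where "filename.jpg" and "filename2.jpg" first differ.
        System.out.println(compareClasses('.', '2')); // negative: "filename.jpg" sorts first
        System.out.println(compareClasses('2', '.')); // positive: the mirrored call agrees
    }
}
```

When both characters fall in the same class the result is 0, and the existing compareNumbers or compareOther logic would still be needed to break the tie.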
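Answer 2 (score: 0)
If what you are sorting is, or can be represented as, a collection of files, you may want to look at the NameFileComparator class in the Apache Commons IO library. It provides several pre-built comparators that you can use to accomplish what you are looking for. For example, NAME_INSENSITIVE_COMPARATOR should do what you want.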
List<File> filenames = Arrays.asList(new File[] {
        new File("Filename01.jpg"),
        new File("Filename1.jpg"),
        new File("filename.jpg"),
        new File("filename2.jpg"),
        new File("filename03.jpg"),
        new File("filename3.jpg")});
Collections.sort(filenames, NameFileComparator.NAME_INSENSITIVE_COMPARATOR);
for (File f : filenames) {
    System.out.println(f);
}
Output:
filename.jpg
Filename01.jpg
filename03.jpg
Filename1.jpg
filename2.jpg
filename3.jpg
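For comparison, the ordering shown above can be reproduced with the JDK alone. This is a sketch under the assumption that a plain case-insensitive comparison of the file names is all that is needed; CaseInsensitiveNameSort and sortedNames are hypothetical names, and no claim is made that this matches Commons IO in every edge case.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch; CaseInsensitiveNameSort is a hypothetical name.
final class CaseInsensitiveNameSort {
    // Orders files by name, ignoring case; digits are compared as ordinary characters.
    static final Comparator<File> BY_NAME_IGNORE_CASE =
            Comparator.comparing(File::getName, String.CASE_INSENSITIVE_ORDER);

    // Wraps the names in File objects, sorts them, and returns the sorted names.
    static List<String> sortedNames(String... names) {
        List<File> files = new ArrayList<>();
        for (String name : names) {
            files.add(new File(name));
        }
        files.sort(BY_NAME_IGNORE_CASE);
        List<String> result = new ArrayList<>();
        for (File f : files) {
            result.add(f.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames("Filename01.jpg", "Filename1.jpg", "filename.jpg",
                "filename2.jpg", "filename03.jpg", "filename3.jpg"));
    }
}
```

Note that this is a character-by-character comparison, so digits are not treated as numbers: filename03.jpg lands before Filename1.jpg, and a name such as filename10.jpg would sort before filename2.jpg.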
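Answer 3 (score: 0)
Just completing my suggestion from the comments. Here is a comparator that is, IMHO, more readable and that (hopefully) sorts the way you need. The main logic is just as I suggested: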
// Compare the name part case-insensitively.
int result = data1.name.compareToIgnoreCase(data2.name);
// If the names are equal, compare by number.
if (result == 0) {
    result = data1.number.compareTo(data2.number);
}
// If the numbers are equal, compare by the length of the number text.
// This is valid because the texts differ only by leading zeros.
// The longer one comes first.
if (result == 0) {
    result = -Integer.compare(data1.numberText.length(), data2.numberText.length());
}
// If everything above is equal, compare by extension.
if (result == 0) {
    result = data1.ext.compareTo(data2.ext);
}
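As you can see, this is a dynamic version that handles the name and the extension without making any assumptions. In this small test program I included your original test data and the data you added in the comments.
So here is the sorted output for the test data: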
filename.jpg
filename00.jpg
filename0.jpg
Filename01.jpg
Filename1.jpg
filename2.jpg
filename03.jpg
filename3.jpg
filename0b.jpg
filename0b1.jpg
filename0b02.jpg
filename0c.jpg
Last but not least, the complete code:
import java.io.File;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class WindowsSorter {
public static void main(String args[]) {
List<File> filenames = Arrays.asList(new File[]{new File("Filename01.jpg"),
new File("Filename1.jpg"), new File("filename.jpg"), new File("filename2.jpg"),
new File("filename03.jpg"), new File("filename3.jpg"), new File("filename00.jpg"),
new File("filename0.jpg"), new File("filename0b.jpg"), new File("filename0b1.jpg"),
new File("filename0b02.jpg"), new File("filename0c.jpg")});
Collections.sort(filenames, new WindowsLikeComparator());
for (File f : filenames) {
System.out.println(f);
}
}
private static class WindowsLikeComparator implements Comparator<File> {
//Regexp to make the 3 part split of the filename.
private static final Pattern splitPattern = Pattern.compile("^(.*?)(\\d*)(?:\\.([^.]*))?$");
@Override
public int compare(File o1, File o2) {
SplitteFileName data1 = getSplittedFileName(o1);
SplitteFileName data2 = getSplittedFileName(o2);
//Compare the name part case-insensitively.
int result = data1.name.compareToIgnoreCase(data2.name);
//If the names are equal, compare by number.
if (result == 0) {
result = data1.number.compareTo(data2.number);
}
//If the numbers are equal, compare by the length of the number text.
//This is valid because the texts differ only by leading zeros.
//The longer one comes first.
if (result == 0) {
result = -Integer.compare(data1.numberText.length(), data2.numberText.length());
}
//If everything above is equal, compare by extension.
if (result == 0) {
result = data1.ext.compareTo(data2.ext);
}
return result;
}
private SplitteFileName getSplittedFileName(File f) {
Matcher matcher = splitPattern.matcher(f.getName());
if (matcher.matches()) {
return new SplitteFileName(matcher.group(1), matcher.group(2), matcher.group(3));
} else {
return new SplitteFileName(f.getName(), null, null);
}
}
static class SplitteFileName {
String name;
Long number;
String numberText;
String ext;
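public SplitteFileName(String name, String numberText, String ext) {
this.name = name;
//A missing number counts as -1 so that "filename" sorts before "filename0".
if (numberText == null || numberText.isEmpty()) {
this.number = -1L;
this.numberText = "";
} else {
this.number = Long.valueOf(numberText);
this.numberText = numberText;
}
//The extension group is null when the name has no dot; use "" to avoid a NullPointerException.
this.ext = (ext == null) ? "" : ext;
}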
}
}
}
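The three-part split that this comparator depends on can be checked in isolation. The sketch below uses the same regular expression as the listing above; the class name SplitPatternDemo and the parts helper are hypothetical, and mapping a missing extension to an empty string is an assumption made for the illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; SplitPatternDemo is a hypothetical name.
final class SplitPatternDemo {
    // Same pattern as the answer: lazy name, trailing digits, optional extension.
    private static final Pattern SPLIT = Pattern.compile("^(.*?)(\\d*)(?:\\.([^.]*))?$");

    // Returns {name, numberText, ext}; a missing extension becomes "" (assumption).
    static String[] parts(String fileName) {
        Matcher m = SPLIT.matcher(fileName);
        if (!m.matches()) {
            return new String[] {fileName, "", ""};
        }
        String ext = (m.group(3) == null) ? "" : m.group(3);
        return new String[] {m.group(1), m.group(2), ext};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("filename03.jpg")));  // [filename, 03, jpg]
        System.out.println(Arrays.toString(parts("filename0b1.jpg"))); // [filename0b, 1, jpg]
        System.out.println(Arrays.toString(parts("filename")));        // [filename, , ]
    }
}
```

Only the digit run directly in front of the extension is treated as the number, which is why filename0b1.jpg is grouped under the name part "filename0b" and ends up after all the plain "filename" entries in the output above.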
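Edit 1: The algorithm was changed to address the filename00 / filename0 ordering problem.
Edit 2: After digging deeper into Windows Explorer's sorting algorithm, it became clear that this answer does solve the original post and its test data (which is why I will not delete it), but it is not a complete solution for mimicking Windows Explorer's behavior. I will therefore provide another, hopefully more complete, solution.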
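Answer 4 (score: 0)
A Windows-only solution that uses a native OS call: https://stackoverflow.com/a/60099813/4494577
Sorting by name in Windows is tricky and much more complicated than your implementation. It is also configurable and version-dependent.
Note: I created a demo for this post. Check it out on GitHub.
Sorting file names with the StrCmpLogicalW function
According to some sources (e.g. here), Windows uses StrCmpLogicalW to sort files by name.
You can try implementing a comparator that calls this system function through JNA (do not forget to include the JNA library in your project).
Comparator: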
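import com.sun.jna.WString;
import java.util.Comparator;

public class StrCmpLogicalWComparator implements Comparator<String> {
    @Override
    public int compare(String o1, String o2) {
        // Delegates to the native Windows function through the Shlwapi JNA mapping.
        return Shlwapi.INSTANCE.StrCmpLogicalW(new WString(o1), new WString(o2));
    }
}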
JNA part:
import com.sun.jna.Native;
import com.sun.jna.WString;
import com.sun.jna.win32.StdCallLibrary;

public interface Shlwapi extends StdCallLibrary {
    Shlwapi INSTANCE = Native.load("Shlwapi", Shlwapi.class);

    int StrCmpLogicalW(WString psz1, WString psz2);
}
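Handling file names that contain digits
I mentioned earlier that the way Windows Explorer sorts files is configurable. You can change how digits in file names are handled and toggle so-called "numerical sorting". You can read how to configure this here. Numerical sorting, as described in the documentation:
Treat digits as numbers during sorting, for example, sort "2" before "10".
With numerical sorting enabled, the result is:
With numerical sorting disabled, it looks like this:
This makes me think that Windows Explorer actually uses the CompareStringEx function for sorting, which can be parameterized to enable this feature.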
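The difference between the two modes can be illustrated without calling the Windows API at all. The following pure-Java sketch only mimics the effect on strings made up entirely of digits; it is an assumption-laden illustration, not a reimplementation of CompareStringEx, and DigitsAsNumbersDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch; DigitsAsNumbersDemo is a hypothetical name.
final class DigitsAsNumbersDemo {
    // Numerical sorting disabled: digits are compared character by character.
    static List<String> plainOrder(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Numerical sorting enabled: each string is compared by its numeric value.
    // Assumes every input consists of digits only.
    static List<String> numericOrder(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(Comparator.comparingLong((String s) -> Long.parseLong(s)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("10", "2", "1");
        System.out.println(plainOrder(names));   // [1, 10, 2]
        System.out.println(numericOrder(names)); // [1, 2, 10]
    }
}
```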
Sorting file names with the CompareStringEx function
JNA part:
import com.sun.jna.Native;
import com.sun.jna.Pointer;
import com.sun.jna.WString;
import com.sun.jna.win32.StdCallLibrary;

public interface Kernel32 extends StdCallLibrary {
    Kernel32 INSTANCE = Native.load("Kernel32", Kernel32.class);
    WString INVARIANT_LOCALE = new WString("");

    int CompareStringEx(WString lpLocaleName,
                        int dwCmpFlags,
                        WString lpString1,
                        int cchCount1,
                        WString lpString2,
                        int cchCount2,
                        Pointer lpVersionInformation,
                        Pointer lpReserved,
                        int lParam);

    // Convenience overload: compares two Java strings using the invariant locale.
    default int CompareStringEx(int dwCmpFlags, String str1, String str2) {
        return Kernel32.INSTANCE.CompareStringEx(
                INVARIANT_LOCALE, dwCmpFlags,
                new WString(str1), str1.length(),
                new WString(str2), str2.length(),
                Pointer.NULL, Pointer.NULL, 0);
    }
}
Numerical sorting comparator:
public class CompareStringExNumericComparator implements Comparator<String> {
    private static final int SORT_DIGITSASNUMBERS = 0x00000008;

    @Override
    public int compare(String o1, String o2) {
        int compareStringExComparisonResult =
                Kernel32.INSTANCE.CompareStringEx(SORT_DIGITSASNUMBERS, o1, o2);
        // CompareStringEx returns 1, 2, 3 respectively instead of -1, 0, 1
        return compareStringExComparisonResult - 2;
    }
}
Non-numerical sorting comparator:
public class CompareStringExNonNumericComparator implements Comparator<String> {
    private static final String INVARIANT_LOCALE = "";
    private static final int NO_OPTIONS = 0;

    @Override
    public int compare(String o1, String o2) {
        int compareStringExComparisonResult =
                Kernel32.INSTANCE.CompareStringEx(NO_OPTIONS, o1, o2);
        // CompareStringEx returns 1, 2, 3 respectively instead of -1, 0, 1
        return compareStringExComparisonResult - 2;
    }
}
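Both comparators subtract 2 from the native result. CompareStringEx also returns 0 when the call fails, in which case the subtraction silently yields -2. A small helper could make that case explicit; the sketch below is one possible way to do it, and CompareStringResult and toComparatorResult are hypothetical names that are not part of JNA or of the code above.

```java
// Illustrative sketch; CompareStringResult is a hypothetical helper, not part of JNA.
final class CompareStringResult {
    // CompareStringEx result values: 0 = failure, 1 = less than, 2 = equal, 3 = greater than.
    static int toComparatorResult(int nativeResult) {
        if (nativeResult < 1 || nativeResult > 3) {
            throw new IllegalStateException("CompareStringEx failed, returned " + nativeResult);
        }
        // Shift 1, 2, 3 to the -1, 0, 1 values a Comparator is expected to return.
        return nativeResult - 2;
    }

    public static void main(String[] args) {
        System.out.println(toComparatorResult(1)); // -1
        System.out.println(toComparatorResult(2)); // 0
        System.out.println(toComparatorResult(3)); // 1
    }
}
```

Whether throwing is the right reaction to a failed call depends on the application; returning a fallback comparison would be an equally reasonable design.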