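I'd like some kind of string comparison function that preserves natural sort order [1]. Is there anything like this built into Java? I can't find anything in the String class, and the Comparator class only knows of two implementations.
I can roll my own (it's not a very hard problem), but I'd rather not reinvent the wheel if I don't have to.
In my specific case, I have software version strings that I want to sort. So I want "1.2.10.5" to be considered greater than "1.2.9.1".
[1] By "natural" sort order, I mean that it compares strings the way a human would compare them, as opposed to the "ascii-betical" sort order that only makes sense to programmers. In other words, "image9.jpg" is less than "image10.jpg", "album1set2page9photo1.jpg" is less than "album1set2page10photo5.jpg", and "1.2.9.1" is less than "1.2.10.5".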
Answer 0: (score: 51)
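In Java, the "natural" order of strings means lexicographical order, so the core library has no implementation like the one you are looking for.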
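To make the point concrete, here is a minimal sketch of what the built-in ordering does with the strings from the question. The class and method names are made up for illustration; only `String.compareTo` and `Collections.sort` come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the name is illustrative only.
class LexicographicOrderDemo {

    // Returns a copy of the input sorted with String's built-in (lexicographic) ordering.
    static List<String> sortedLexicographically(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy); // delegates to String.compareTo
        return copy;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("image9.jpg", "image10.jpg", "1.2.9.1", "1.2.10.5");
        // Prints [1.2.10.5, 1.2.9.1, image10.jpg, image9.jpg]
        System.out.println(sortedLexicographically(names));
    }
}
```

Because the comparison goes character by character, '1' sorts before '9' and so "image10.jpg" lands before "image9.jpg", which is exactly the behaviour the question wants to avoid.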
There are open source implementations.
Here is one:
Be sure to read:
I hope this helps!
Answer 1: (score: 9)
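I tested three Java implementations mentioned here by others and found that they work slightly differently, but none of them behaves the way I would expect.
Neither AlphaNumericStringComparator nor AlphanumComparator ignores whitespace, so pic2 is placed before pic 1.
NaturalOrderComparator, on the other hand, ignores not only whitespace but also all leading zeros, so sig[1] is placed before sig[0].
Regarding performance, AlphaNumericStringComparator is roughly 10x slower than the other two.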
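The differences described above come down to policy choices: whether whitespace counts, and whether leading zeros count. Below is a rough sketch that makes both choices explicit. It is not any of the three implementations tested here; `ExplicitNaturalComparator` and its `ignoreWhitespace` flag are hypothetical names, and the tie-break on leading zeros is just one possible choice.

```java
import java.math.BigInteger;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not one of the three implementations tested above.
class ExplicitNaturalComparator implements Comparator<String> {

    // A token is either a run of ASCII digits or a run of anything else.
    private static final Pattern TOKEN = Pattern.compile("[0-9]+|[^0-9]+");

    private final boolean ignoreWhitespace;

    ExplicitNaturalComparator(boolean ignoreWhitespace) {
        this.ignoreWhitespace = ignoreWhitespace;
    }

    private static boolean startsWithDigit(String token) {
        char c = token.charAt(0);
        return c >= '0' && c <= '9';
    }

    @Override
    public int compare(String a, String b) {
        if (ignoreWhitespace) {
            a = a.replaceAll("\\s+", "");
            b = b.replaceAll("\\s+", "");
        }
        Matcher ma = TOKEN.matcher(a);
        Matcher mb = TOKEN.matcher(b);
        while (true) {
            boolean hasA = ma.find();
            boolean hasB = mb.find();
            if (!hasA || !hasB) {
                // The string that runs out of tokens first sorts first.
                return Boolean.compare(hasA, hasB);
            }
            String ta = ma.group();
            String tb = mb.group();
            int result;
            if (startsWithDigit(ta) && startsWithDigit(tb)) {
                // BigInteger avoids overflow; "0" stays 0, so sig[0] sorts before sig[1].
                result = new BigInteger(ta).compareTo(new BigInteger(tb));
                // Policy choice: for equal values, fewer leading zeros sorts first.
                if (result == 0) {
                    result = Integer.compare(ta.length(), tb.length());
                }
            } else {
                result = ta.compareTo(tb);
            }
            if (result != 0) {
                return result;
            }
        }
    }
}
```

With `new ExplicitNaturalComparator(false)` the string pic2 still sorts before pic 1, as with the first two comparators above; passing `true` puts pic 1 first. The regex and BigInteger make this slower than a hand-rolled character loop, so treat it as an illustration of the policies rather than a tuned implementation.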
Answer 2: (score: 8)
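String implements Comparable, and that is what natural ordering means in Java (comparison through the Comparable interface). You can put the strings in a TreeSet, or sort them using the Collections or Arrays classes.
However, in your case you don't want "natural ordering"; you really want a custom comparator, which you can then pass to the Collections.sort method or to the Arrays.sort overload that takes a comparator.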
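As a small illustration of where a custom comparator plugs in, the sketch below passes the same comparator to `Collections.sort`, `Arrays.sort`, and a `TreeSet`. The comparator itself (`LENGTH_THEN_TEXT`) is a deliberately simple stand-in invented for this example, not the comparator the question needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class showing the three places a Comparator<String> can be used.
class CustomComparatorUsage {

    // Stand-in comparator: shorter strings first, ties broken lexicographically.
    // For digit-only strings without leading zeros this happens to match numeric order.
    static final Comparator<String> LENGTH_THEN_TEXT = (a, b) -> {
        int byLength = Integer.compare(a.length(), b.length());
        return byLength != 0 ? byLength : a.compareTo(b);
    };

    static List<String> sortList(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, LENGTH_THEN_TEXT);
        return copy;
    }

    static String[] sortArray(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy, LENGTH_THEN_TEXT);
        return copy;
    }

    static TreeSet<String> toSortedSet(List<String> input) {
        TreeSet<String> set = new TreeSet<>(LENGTH_THEN_TEXT);
        set.addAll(input);
        return set;
    }

    public static void main(String[] args) {
        List<String> numbers = Arrays.asList("10", "9", "100", "2");
        System.out.println(sortList(numbers));   // [2, 9, 10, 100]
        System.out.println(toSortedSet(numbers)); // [2, 9, 10, 100]
    }
}
```

Any comparator with the right ordering logic can be swapped in for `LENGTH_THEN_TEXT` without changing the calling code.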
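As for the specific logic you want to implement inside the comparator (numbers separated by dots), I'm not aware of any existing standard implementation, but as you said, it is not a hard problem.
EDIT: The link in your comment gets you here, which does a decent job if you don't mind that it is case sensitive. Here is that code, modified so that you can pass in String.CASE_INSENSITIVE_ORDER: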
/*
* The Alphanum Algorithm is an improved sorting algorithm for strings
* containing numbers. Instead of sorting numbers in ASCII order like
* a standard sort, this algorithm sorts numbers in numeric order.
*
* The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
*
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*
*/
import java.util.Comparator;
/**
* This is an updated version with enhancements made by Daniel Migowski,
* Andre Bogus, and David Koelle
*
* To convert to use Templates (Java 1.5+):
* - Change "implements Comparator" to "implements Comparator<String>"
* - Change "compare(Object o1, Object o2)" to "compare(String s1, String s2)"
* - Remove the type checking and casting in compare().
*
* To use this class:
* Use the static "sort" method from the java.util.Collections class:
* Collections.sort(your list, new AlphanumComparator());
*/
public class AlphanumComparator implements Comparator<String>
{
private Comparator<String> comparator = new NaturalComparator();
public AlphanumComparator(Comparator<String> comparator) {
this.comparator = comparator;
}
public AlphanumComparator() {
}
private final boolean isDigit(char ch)
{
return ch >= 48 && ch <= 57;
}
/** Length of string is passed in for improved efficiency (only need to calculate it once) **/
private final String getChunk(String s, int slength, int marker)
{
StringBuilder chunk = new StringBuilder();
char c = s.charAt(marker);
chunk.append(c);
marker++;
if (isDigit(c))
{
while (marker < slength)
{
c = s.charAt(marker);
if (!isDigit(c))
break;
chunk.append(c);
marker++;
}
} else
{
while (marker < slength)
{
c = s.charAt(marker);
if (isDigit(c))
break;
chunk.append(c);
marker++;
}
}
return chunk.toString();
}
public int compare(String s1, String s2)
{
int thisMarker = 0;
int thatMarker = 0;
int s1Length = s1.length();
int s2Length = s2.length();
while (thisMarker < s1Length && thatMarker < s2Length)
{
String thisChunk = getChunk(s1, s1Length, thisMarker);
thisMarker += thisChunk.length();
String thatChunk = getChunk(s2, s2Length, thatMarker);
thatMarker += thatChunk.length();
// If both chunks contain numeric characters, sort them numerically
int result = 0;
if (isDigit(thisChunk.charAt(0)) && isDigit(thatChunk.charAt(0)))
{
// Simple chunk comparison by length.
int thisChunkLength = thisChunk.length();
result = thisChunkLength - thatChunk.length();
// If equal, the first different number counts
if (result == 0)
{
for (int i = 0; i < thisChunkLength; i++)
{
result = thisChunk.charAt(i) - thatChunk.charAt(i);
if (result != 0)
{
return result;
}
}
}
} else
{
result = comparator.compare(thisChunk, thatChunk);
}
if (result != 0)
return result;
}
return s1Length - s2Length;
}
private static class NaturalComparator implements Comparator<String> {
public int compare(String o1, String o2) {
return o1.compareTo(o2);
}
}
}
Answer 3: (score: 6)
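Have a look at this implementation. It should be about as fast as possible: no regular expressions, no array manipulation, no helper method calls, just a couple of flags and a lot of cases.
It should sort any combination of numbers inside strings, and it handles numbers that are equal correctly by moving on to the rest of the string.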
public static int naturalCompare(String a, String b, boolean ignoreCase) {
if (ignoreCase) {
a = a.toLowerCase();
b = b.toLowerCase();
}
int aLength = a.length();
int bLength = b.length();
int minSize = Math.min(aLength, bLength);
char aChar, bChar;
boolean aNumber, bNumber;
boolean asNumeric = false;
int lastNumericCompare = 0;
for (int i = 0; i < minSize; i++) {
aChar = a.charAt(i);
bChar = b.charAt(i);
aNumber = aChar >= '0' && aChar <= '9';
bNumber = bChar >= '0' && bChar <= '9';
if (asNumeric)
if (aNumber && bNumber) {
if (lastNumericCompare == 0)
lastNumericCompare = aChar - bChar;
} else if (aNumber)
return 1;
else if (bNumber)
return -1;
else if (lastNumericCompare == 0) {
if (aChar != bChar)
return aChar - bChar;
asNumeric = false;
} else
return lastNumericCompare;
else if (aNumber && bNumber) {
asNumeric = true;
if (lastNumericCompare == 0)
lastNumericCompare = aChar - bChar;
} else if (aChar != bChar)
return aChar - bChar;
}
if (asNumeric)
if (aLength > bLength && a.charAt(bLength) >= '0' && a.charAt(bLength) <= '9') // as number
return 1; // a has bigger size, thus b is smaller
else if (bLength > aLength && b.charAt(aLength) >= '0' && b.charAt(aLength) <= '9') // as number
return -1; // b has bigger size, thus a is smaller
else if (lastNumericCompare == 0)
return aLength - bLength;
else
return lastNumericCompare;
else
return aLength - bLength;
}
Answer 4: (score: 2)
How about using the split() method of String, parsing each numeric substring, and then comparing them one by one?
@Test
public void test() {
    System.out.print(compare("1.12.4".split("\\."), "1.13.4".split("\\."), 0));
}

public static int compare(String[] arr1, String[] arr2, int index) {
    // If the comparison has reached the end of either array, the longer array
    // is considered the bigger one (so 2.2.0 is bigger than 2.2).
    if (arr1.length <= index || arr2.length <= index) return arr1.length - arr2.length;
    // Integer.compare avoids the overflow that plain subtraction can cause.
    int result = Integer.compare(Integer.parseInt(arr1[index]), Integer.parseInt(arr2[index]));
    return result == 0 ? compare(arr1, arr2, index + 1) : result;
}
I did not check the corner cases, but this should work, and it is quite compact.
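For use with Collections.sort, the same idea can be wrapped in a Comparator without recursion. The following is only a sketch under the same assumptions as the answer: every dot-separated part is a plain non-negative integer that fits in an int. `VersionComparator` is a name invented for this example.

```java
import java.util.Comparator;

// Hypothetical iterative variant of the split-based comparison above.
class VersionComparator implements Comparator<String> {

    @Override
    public int compare(String v1, String v2) {
        String[] parts1 = v1.split("\\.");
        String[] parts2 = v2.split("\\.");
        int common = Math.min(parts1.length, parts2.length);
        for (int i = 0; i < common; i++) {
            // Integer.parseInt throws NumberFormatException for parts such as "5-beta" or "".
            int result = Integer.compare(Integer.parseInt(parts1[i]), Integer.parseInt(parts2[i]));
            if (result != 0) {
                return result;
            }
        }
        // Same rule as the answer: with an equal prefix, the longer version is bigger (2.2.0 > 2.2).
        return Integer.compare(parts1.length, parts2.length);
    }
}
```

Some corner cases remain open in this sketch: non-numeric parts such as "1.0-beta" throw NumberFormatException, and because String.split drops trailing empty strings, "1.2." is split into the same parts as "1.2".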
Answer 5: (score: 1)
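It collects each run of digits into a number and then compares the numbers. Where that does not apply, it carries on comparing character by character.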
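public int compare(String o1, String o2) {
    if (o1 == null || o2 == null)
        return 0;
    for (int i = 0; i < o1.length() && i < o2.length(); i++) {
        // Compare numerically only when both strings have a digit at this position;
        // otherwise one of the digit runs would be empty and could not be parsed.
        if (Character.isDigit(o1.charAt(i)) && Character.isDigit(o2.charAt(i))) {
            String dig1 = "", dig2 = "";
            // Collect the digit run starting at i (the loop must test charAt(x), not charAt(i)).
            for (int x = i; x < o1.length() && Character.isDigit(o1.charAt(x)); x++) {
                dig1 += o1.charAt(x);
            }
            for (int x = i; x < o2.length() && Character.isDigit(o2.charAt(x)); x++) {
                dig2 += o2.charAt(x);
            }
            // Integer.valueOf throws NumberFormatException if a digit run does not fit in an int.
            if (Integer.valueOf(dig1) < Integer.valueOf(dig2))
                return -1;
            if (Integer.valueOf(dig1) > Integer.valueOf(dig2))
                return 1;
        }
        if (o1.charAt(i) < o2.charAt(i))
            return -1;
        if (o1.charAt(i) > o2.charAt(i))
            return 1;
    }
    return 0;
}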
Answer 6: (score: 0)
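This may be a late reply, but my answer can help others who need a comparator like this.
I also checked a couple of other comparators, and mine seems a bit more efficient than the ones I compared against. I also tried the one Yishai posted. On a data set of 100 alphanumeric entries, mine takes only about half the time of that one.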
import java.util.Comparator;
/**
* Sorter that compares the given Alpha-numeric strings. This iterates through each characters to
* decide the sort order. There are 3 possible cases while iterating,
*
* <li>If both have same non-digit characters then the consecutive characters will be considered for
* comparison.</li>
*
* <li>If both have numbers at the same position (with/without non-digit characters) the consecutive
* digit characters will be considered to form the valid integer representation of the characters
* will be taken and compared.</li>
*
* <li>At any point if the comparison gives the order(either > or <) then the consecutive characters
* will not be considered.</li>
*
* For ex., this will be the ordered O/P of the given list of Strings.(The bold characters decides
* its order) <i><b>2</b>b,<b>100</b>b,a<b>1</b>,A<b>2</b>y,a<b>100</b>,</i>
*
* @author kannan_r
*
*/
class AlphaNumericSorter implements Comparator<String>
{
/**
* Does the Alphanumeric sort of the given two string
*/
public int compare(String theStr1, String theStr2)
{
char[] theCharArr1 = theStr1.toCharArray();
char[] theCharArr2 = theStr2.toCharArray();
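int aPosition = 0;
// Guard against empty input before looking at the first character.
if (theCharArr1.length > 0 && theCharArr2.length > 0
&& Character.isDigit(theCharArr1[aPosition]) && Character.isDigit(theCharArr2[aPosition]))
{
return sortAsNumber(theCharArr1, theCharArr2, aPosition);
}
return sortAsString(theCharArr1, theCharArr2, 0);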
}
/**
* Sort the given Arrays as string starting from the given position. This will be a simple case
* insensitive sort of each characters. But at any given position if there are digits in both
* arrays then the method sortAsNumber will be invoked for the given position.
*
* @param theArray1 The first character array.
* @param theArray2 The second character array.
* @param thePosition The position starting from which the calculation will be done.
* @return positive number when the Array1 is greater than Array2<br/>
* negative number when the Array2 is greater than Array1<br/>
* zero when the Array1 is equal to Array2
*/
private int sortAsString(char[] theArray1, char[] theArray2, int thePosition)
{
int aResult = 0;
if (thePosition < theArray1.length && thePosition < theArray2.length)
{
aResult = (int)theArray1[thePosition] - (int)theArray2[thePosition];
if (aResult == 0)
{
++thePosition;
if (thePosition < theArray1.length && thePosition < theArray2.length)
{
if (Character.isDigit(theArray1[thePosition]) && Character.isDigit(theArray2[thePosition]))
{
aResult = sortAsNumber(theArray1, theArray2, thePosition);
}
else
{
aResult = sortAsString(theArray1, theArray2, thePosition);
}
}
}
}
else
{
aResult = theArray1.length - theArray2.length;
}
return aResult;
}
/**
* Sorts the characters in the given array as number starting from the given position. When
* sorted as numbers the consecutive characters starting from the given position upto the first
* non-digit character will be considered.
*
* @param theArray1 The first character array.
* @param theArray2 The second character array.
* @param thePosition The position starting from which the calculation will be done.
* @return positive number when the Array1 is greater than Array2<br/>
* negative number when the Array2 is greater than Array1<br/>
* zero when the Array1 is equal to Array2
*/
private int sortAsNumber(char[] theArray1, char[] theArray2, int thePosition)
{
int aResult = 0;
int aNumberInStr1;
int aNumberInStr2;
if (thePosition < theArray1.length && thePosition < theArray2.length)
{
if (Character.isDigit(theArray1[thePosition]) && Character.isDigit(theArray2[thePosition]))
{
aNumberInStr1 = getNumberInStr(theArray1, thePosition);
aNumberInStr2 = getNumberInStr(theArray2, thePosition);
aResult = aNumberInStr1 - aNumberInStr2;
if (aResult == 0)
{
thePosition = getNonDigitPosition(theArray1, thePosition);
if (thePosition != -1)
{
aResult = sortAsString(theArray1, theArray2, thePosition);
}
}
}
else
{
aResult = sortAsString(theArray1, theArray2, ++thePosition);
}
}
else
{
aResult = theArray1.length - theArray2.length;
}
return aResult;
}
/**
* Gets the position of the non digit character in the given array starting from the given
* position.
*
* @param theCharArr /the character array.
* @param thePosition The position after which the array need to be checked for non-digit
* character.
* @return The position of the first non-digit character in the array.
*/
private int getNonDigitPosition(char[] theCharArr, int thePosition)
{
for (int i = thePosition; i < theCharArr.length; i++ )
{
if ( !Character.isDigit(theCharArr[i]))
{
return i;
}
}
return -1;
}
/**
* Gets the integer value of the number starting from the given position of the given array.
*
* @param theCharArray The character array.
* @param thePosition The position form which the number need to be calculated.
* @return The integer value of the number.
*/
private int getNumberInStr(char[] theCharArray, int thePosition)
{
int aNumber = 0;
for (int i = thePosition; i < theCharArray.length; i++ )
{
if(!Character.isDigit(theCharArray[i]))
{
return aNumber;
}
aNumber = aNumber * 10 + (theCharArray[i] - '0');
}
return aNumber;
}
}
Answer 7: (score: 0)
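Using RuleBasedCollator might be an option as well. You have to add all the sort order rules in advance, though, so it is not a good solution if you also want to handle larger numbers.
Adding specific customizations such as 2 < 10 is quite easy, however, and might be useful for sorting special version identifiers like Trusty < Precise < Xenial < Yakkety.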
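import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.List;
import java.util.stream.IntStream;
import static java.util.Arrays.asList;
import static java.util.Collections.shuffle;
import static java.util.stream.Collectors.joining;
RuleBasedCollator localRules = (RuleBasedCollator) Collator.getInstance();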
String extraRules = IntStream.range(0, 100).mapToObj(String::valueOf).collect(joining(" < "));
RuleBasedCollator c = new RuleBasedCollator(localRules.getRules() + " & " + extraRules);
List<String> a = asList("1-2", "1-02", "1-20", "10-20", "fred", "jane", "pic01", "pic02", "pic02a", "pic 5", "pic05", "pic 7", "pic100", "pic100a", "pic120", "pic121");
shuffle(a);
a.sort(c);
System.out.println(a);