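Sorting numbers correctly as strings in .NET

Asked: 2012-02-10 16:50:52

Tags: .net algorithm math

I have an application that displays a dataset and lets me call custom .NET code, and I'm stuck on a sorting problem. One column in my dataset contains both string and numeric data, and I want the strings sorted alphabetically and the numbers sorted numerically. All I can do is take the current value the sorter is working on and return something.

If my list is {"-6", "10", "5"}, I want to generate strings from those numbers that sort into numeric order when compared alphabetically. What I've come up with is making them all positive and then padding with zeros, like this: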

public object Evaluate(object currentValue)
{
    // Prefix non-numbers with 'a' and numbers with 'b' so that numbers come second.
    string text = currentValue.ToString();
    string sortOrder;
    double number;
    if (!double.TryParse(text, out number))
        sortOrder = "a" + text;
    else
    {
        sortOrder = "b";

        // Add half of Double.MaxValue so that we 'hopefully' get rid of negative numbers, but don't go past Double.MaxValue.
        number += (Double.MaxValue / 2);

        // Pad with zeros so that 5 comes before 10 alphabetically:
        // "0000000005"
        // "0000000010"
        // (padWithZeros is a helper that is not shown here.)
        string paddedNumberString = padWithZeros(number.ToString());

        // "b0000000005"
        // "b0000000010"
        sortOrder += paddedNumberString;
    }
    return sortOrder;
}

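Problem:
If I just return the numbers, they get sorted alphabetically, so 10 comes before 5, and I don't even know what happens with negative numbers.

Solution?:
One hack I thought of is converting the doubles (8 bytes) to unsigned longs (8 bytes). That would get rid of the negative numbers, since everything would start at 0. The problem of 10 coming before 5 would still remain, though. For that, maybe pad with zeros or something...

This seems like it should be possible, but I'm being dumb today and can't figure it out.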

Sample data:
'cat'
'4'
'5.4'
'dog'
'-400'
'aardvark'
'12.23.34.54'
'i am a sentence'
'0'

Should be sorted to:
'12.23.34.54'
'aardvark'
'cat'
'dog'
'i am a sentence'
'-400'
'0'
'4'
'5.4'
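
For reference, the "convert the double to an unsigned long" idea from the question can be made to work without zero padding. The following is only a sketch under some assumptions: the class name `SortKey` and method `For` are made up for illustration, the sorter is assumed to compare the returned strings ordinally, and numbers are parsed with the invariant culture. It relies on the IEEE 754 bit layout: flipping the sign bit of non-negative doubles and all bits of negative doubles gives unsigned integers that order the same way as the original values, and a fixed-width hex string keeps that order. (Adding a huge offset such as Double.MaxValue / 2 instead would swallow small values in rounding.)

```csharp
using System;
using System.Globalization;

public static class SortKey
{
    // Hypothetical helper: returns a string whose ordinal order is
    // "text first (alphabetical), then numbers (numeric)".
    public static string For(string value)
    {
        double number;
        if (!double.TryParse(value, NumberStyles.Float, CultureInfo.InvariantCulture, out number))
        {
            return "a" + value; // not a number: sorts in the first group
        }

        long bits = BitConverter.DoubleToInt64Bits(number);

        // Negative doubles: flip every bit so larger magnitudes sort lower.
        // Non-negative doubles: set the sign bit so they sort above all negatives.
        ulong key = bits < 0
            ? ~unchecked((ulong)bits)
            : unchecked((ulong)bits) ^ 0x8000000000000000UL;

        // 16 hex digits, fixed width, so string order equals numeric order.
        return "b" + key.ToString("X16");
    }
}
```

With this, `Evaluate` could simply return `SortKey.For(currentValue.ToString())`. NaN values get a key too, but where they land in the order is not meaningful.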

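4 Answers:

Answer 0 (score: 4)

It isn't very efficient, but a simple comparison algorithm that first separates numbers from non-numbers and then sorts within each group will work; see the code below. The inefficiency comes from the fact that the strings are converted to doubles again on every comparison, so you could preprocess the numbers (i.e., store their double values in a List<double?>) and then use those instead of parsing every time.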

public class StackOverflow_9231493
{
    public static void Test()
    {
        List<string> list = new List<string>
        {
            "cat",
             "4",
             "5.4",
             "dog",
             "-400",
             "aardvark",
             "12.23.34.54",
             "i am a sentence",
             "0" ,
        };

        list.Sort(new Comparison<string>(delegate(string s1, string s2)
        {
            double d1, d2;
            bool isNumber1, isNumber2;
            isNumber1 = double.TryParse(s1, out d1);
            isNumber2 = double.TryParse(s2, out d2);
            if (isNumber1 != isNumber2)
            {
                return isNumber2 ? -1 : 1;
            }
            else if (!isNumber1)
            {
                return s1.CompareTo(s2);
            }
            else
            {
                // CompareTo avoids the exception Math.Sign throws when d1 - d2 is NaN.
                return d1.CompareTo(d2);
            }
        }));

        Console.WriteLine(string.Join("\n", list));
    }
}
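
The preprocessing suggested at the start of this answer might look roughly like the sketch below. It is an illustration rather than the answer's own code: the class name `PreparsedSort` is made up, parsing uses the invariant culture, and text is compared ordinally, all of which are assumptions. LINQ's `OrderBy` evaluates each key once per element, so every string is parsed only once.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class PreparsedSort
{
    // Hypothetical helper: text first (alphabetical), then numbers (numeric).
    public static List<string> Sort(IEnumerable<string> values)
    {
        return values
            .Select(s =>
            {
                double d;
                bool ok = double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out d);
                // Number is null for anything that is not a valid double.
                return new { Text = s, Number = ok ? d : (double?)null };
            })
            .OrderBy(x => x.Number.HasValue)              // false (text) before true (numbers)
            .ThenBy(x => x.Number)                        // numeric order; text entries all tie on null
            .ThenBy(x => x.Text, StringComparer.Ordinal)  // alphabetical order within the text group
            .Select(x => x.Text)
            .ToList();
    }
}
```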

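Update based on comments

If you just want to return something directly without using a comparer, you can use the same logic but wrap the value in a type that knows how to compare itself, as shown below.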

public class StackOverflow_9231493
{
    public class Wrapper : IComparable<Wrapper>
    {
        internal string value;
        private double? dbl;

        public Wrapper(string value)
        {
            if (value == null) throw new ArgumentNullException("value");
            this.value = value;
            double temp;
            if (double.TryParse(value, out temp))
            {
                dbl = temp;
            }
        }

        public int CompareTo(Wrapper other)
        {
            // By IComparable convention, any instance sorts after null.
            if (other == null) return 1;
            if (this.dbl.HasValue != other.dbl.HasValue)
            {
                return other.dbl.HasValue ? -1 : 1;
            }
            else if (!this.dbl.HasValue)
            {
                return this.value.CompareTo(other.value);
            }
            else
            {
                // CompareTo avoids the exception Math.Sign throws when the difference is NaN.
                return this.dbl.Value.CompareTo(other.dbl.Value);
            }
        }
    }
    public static void Test()
    {
        List<string> list = new List<string>
        {
            "cat",
             "4",
             "5.4",
             "dog",
             "-400",
             "aardvark",
             "12.23.34.54",
             "i am a sentence",
             "0" ,
        };

        List<Wrapper> list2 = list.Select(x => new Wrapper(x)).ToList();
        list2.Sort();
        Console.WriteLine(string.Join("\n", list2.Select(w => w.value)));
    }
}

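Answer 1 (score: 2)

I have a solution. It requires an arbitrary, fixed maximum string size, but it needs no other information about the set.

First, define a custom character set as follows: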

public class CustomChar
{
    public static readonly int Base;
    public static readonly int BitsPerChar;

    public char Original { get; private set; }
    public int Target { get; private set; }

    private static readonly Dictionary<char, CustomChar> Translation;

    private static void DefineOrderedCharSet(string charset)
    {
        foreach (var t in charset)
        {
            new CustomChar(t);
        }
    }

    static CustomChar()
    {
        Translation = new Dictionary<char, CustomChar>();
        DefineOrderedCharSet(",-.0123456789 aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ");
        BitsPerChar = (int)Math.Ceiling(Math.Log(Translation.Count, 2));
        Base = (int) Math.Pow(2, BitsPerChar);
    }

    private CustomChar(char original)
    {
        Original = original;

        if(Translation.Count > 0)
        {
            Target = Translation.Max(x => x.Value.Target) + 1;
        }
        else
        {
            Target = 0;
        }

        Translation[original] = this;
    }

    public static CustomChar Parse(char original)
    {
        return Translation[original];
    }
}

Then define a construct that handles the conversion from a string to a System.Numerics.BigInteger, like this:

public class CustomString
{
    public string String { get; private set; }
    public BigInteger Result { get; private set; }
    public const int MaxChars = 600000;

    public CustomString(string source)
    {
        String = source;
        Result = 0;

        for (var i = 0; i < String.Length; i++)
        {
            var character = CustomChar.Parse(String[i]);
            Result |= (BigInteger)character.Target << (CustomChar.BitsPerChar * (MaxChars - i - 1));
        }

        double doubleValue;

        if (!double.TryParse(source, out doubleValue))
        {
            return;
        }

        Result = new BigInteger(0x7F) << (MaxChars * CustomChar.BitsPerChar);
        var shifted = (BigInteger)(doubleValue * Math.Pow(2, 32));
        Result += shifted;
    }

    public static implicit operator CustomString(string source)
    {
        return new CustomString(source);
    }
}

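Note that the CustomString constructor detects doubles and raises their BigInteger representation above that of any text, so that they sort numerically after the strings.

This was thrown together fairly quickly, but in testing it produces the output you describe: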

class Program
{
    public static string[] Sort(params CustomString[] strings)
    {
        return strings.OrderBy(x => x.Result).Select(x => x.String).ToArray();
    }

    static void Main()
    {
        var result = Sort(
            "cat",
            "4",
            "5.4",
            "dog",
            "-400",
            "aardvark",
            "12.23.34.54",
            "i am a sentence",
            "0");

        foreach (var str in result)
        {
            Console.WriteLine(str);
        }

        Console.ReadLine();
    }
}

Answer 2 (score: 1)

I suspect you're after something called "natural sort". Atwood has a post about it: http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

The post includes several example implementations.
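
As a rough idea of what a natural sort does, here is a minimal sketch; it is not taken from the linked post, and the class name `NaturalComparer` is made up. It splits each string into runs of digits and runs of non-digits and compares digit runs by value. Note that it does not treat a leading minus sign or a decimal point as part of a number, so on its own it would not reproduce the ordering asked for in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class NaturalComparer : IComparer<string>
{
    // Splits a string into alternating runs of ASCII digits and non-digits.
    private static readonly Regex Chunk = new Regex("[0-9]+|[^0-9]+");

    public int Compare(string x, string y)
    {
        MatchCollection a = Chunk.Matches(x ?? "");
        MatchCollection b = Chunk.Matches(y ?? "");

        for (int i = 0; i < Math.Min(a.Count, b.Count); i++)
        {
            string ca = a[i].Value, cb = b[i].Value;
            int result;

            if (ca[0] >= '0' && ca[0] <= '9' && cb[0] >= '0' && cb[0] <= '9')
            {
                // Both runs are digits: ignore leading zeros, then the longer
                // run is the larger number; equal lengths compare digit by digit.
                string ta = ca.TrimStart('0'), tb = cb.TrimStart('0');
                result = ta.Length != tb.Length
                    ? ta.Length.CompareTo(tb.Length)
                    : string.CompareOrdinal(ta, tb);
            }
            else
            {
                result = string.Compare(ca, cb, StringComparison.OrdinalIgnoreCase);
            }

            if (result != 0) return result;
        }

        // All shared runs are equal: the string with fewer runs sorts first.
        return a.Count.CompareTo(b.Count);
    }
}
```

It can be passed to `List<string>.Sort`, for example `list.Sort(new NaturalComparer())`, which puts "file5" before "file10".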

Answer 3 (score: 0)

I'm assuming your data is of type string rather than object. The following function can be used as a Comparison<string> delegate.

static int CompareTo(string string1, string string2)
{
    double double1, double2;

    // Add null checks here if necessary...

    if (double.TryParse(string1, out double1))
    {
        if (double.TryParse(string2, out double2))
        {
            // string1 and string2 are both doubles

            return double1.CompareTo(double2);
        }
        else
        {
            // string1 is a double and string2 is text; string2 sorts first

            return 1;
        }
    }
    else if (double.TryParse(string2, out double2))
    {
        // string1 is text and string2 is a double; string1 sorts first

        return -1;
    }
    else
    {
        // string1 and string2 are both text

        return string1.CompareTo(string2);
    }
}

You can test it like this:

static void Main(string[] args)
{
    var list = new List<string>() {
        "cat",
        "4",
        "5.4",
        "dog",
        "-400",
        "aardvark",
        "12.23.34.54",
        "i am a sentence",
        "0"
    };

    list.Sort(CompareTo);
    foreach (var item in list)
        Console.WriteLine(item);
}