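I'm trying to find a way to sort a list as fast as possible. I know of "bucket sort", which I plan to use (I think?). But before that, I suppose I should first find the fastest plain sorting approach and then use it inside the bucket sort?
The strings look like the following; I add 100,000 of them in a loop: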
-0,awb/aje - ddfas/asa - asoo/qwa
-1,awb/aje - ddfas/asa - asoo/qwa
-2,awb/aje - ddfas/asa - asoo/qwa
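So I want to sort in descending order by the first comma-separated field, which is a double such as -0, -1, -2 and so on.
I tried 3 approaches, of which only Approach 1 actually sorts correctly. Approaches 2 and 3 do not sort fully correctly in descending numeric order.
Approach 1, the one that sorts correctly, takes 30 seconds. The problem is that I will have about 3 million elements rather than the 100,000 in this example, which would take at least 900 seconds.
My question is: how can we sort 100,000 elements, or more realistically 3 million, as fast as possible? Running sortingtestBENCHMARKS() will show the results.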
public void sortingtestBENCHMARKS()
{
    List<String> dataLIST = new List<String>();
    List<String> sortedLIST = new List<String>();
    String resultString = "";
    int index = 0;
    DateTime starttime = DateTime.Now;
    DateTime endtime = DateTime.Now;
    TimeSpan span = new TimeSpan();
    for (double i = 0; i < 100000; i+=1)
    {
        dataLIST.Add("-" + i + "," + "awb/aje" + " - " + "ddfas/asa" + " - " + "asoo/qwa");
    }
    dataLIST = shuffle(dataLIST);
    /*--------------------------------------------------------------------------*/
    //APPROACH 1: 30 seconds (Sorts correctly in descending order)
    starttime = DateTime.Now;
    dataLIST = sortLIST(dataLIST);
    endtime = DateTime.Now;
    span = endtime - starttime;
    resultString = "Approach 1: " + span.TotalSeconds;
    dataLIST = shuffle(dataLIST);
    /*--------------------------------------------------------------------------*/
    //APPROACH 2: 55 seconds (Sorts INcorrectly in descending order)
    starttime = DateTime.Now;
    for (int i = 0; i < dataLIST.Count; i++)
    {
        index = sortedLIST.BinarySearch(dataLIST[i]);
        if (index < 0)
        {
            sortedLIST.Insert(~index, dataLIST[i]);
        }
    }
    endtime = DateTime.Now;
    span = endtime - starttime;
    resultString = resultString + "\nApproach 2: " + span.TotalSeconds;
    /*--------------------------------------------------------------------------*/
    //APPROACH 3: 2 seconds (Sorts INcorrectly in descending order)
    starttime = DateTime.Now;
    dataLIST.Sort(); //1.6 seconds
    endtime = DateTime.Now;
    span = endtime - starttime;
    resultString = resultString + "\nApproach 3: " + span.TotalSeconds;
    /*--------------------------------------------------------------------------*/
    MessageBox.Show("Elapsed Times:\n\n" + resultString);
}
List<String> sortLIST(List<String> theLIST)
{
    System.Threading.Thread.CurrentThread.CurrentCulture = new System.Globalization.CultureInfo("en-US");
    theLIST.Sort(new Comparison<String>((a, b) =>
    {
        int result = 0;
        double ad = 0;
        double bd = 0;
        NumberFormatInfo provider = new NumberFormatInfo();
        provider.NumberGroupSeparator = ",";
        provider.NumberDecimalSeparator = ".";
        provider.NumberGroupSizes = new int[] { 5 };
        ad = Convert.ToDouble(a.Replace("a", "").Replace("c", "").Split(',')[0], provider);
        bd = Convert.ToDouble(b.Replace("a", "").Replace("c", "").Split(',')[0], provider);
        if (ad < bd)
        {
            result = 1;
        }
        else if (ad > bd)
        {
            result = -1;
        }
        return result;
    }));
    return theLIST;
}
List<String> shuffle(List<String> list)
{
    var randomizedList = new List<String>();
    var rnd = new Random();
    while (list.Count != 0)
    {
        var index = rnd.Next(0, list.Count);
        randomizedList.Add(list[index]);
        list.RemoveAt(index);
    }
    return randomizedList;
}
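Answer 0: (score: 1)
It seems to me that you can split the string on the , character, strip the - off the first item of the resulting array, and then use OrderBy on the result: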
var sorted = dataLIST.OrderBy(i => double.Parse(i.Split(',')[0].TrimStart('-'))).ToList();
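I made a copy of your code, took the one approach of yours that works, and compared it with running OrderBy on the split-string approach above. The OrderBy / Split approach is more than 30 times faster.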
public static void sortingtestBENCHMARKS()
{
    var dataLIST = new List<string>();
    // Create the list
    for (var i = 0; i < 100000; i ++)
    {
        dataLIST.Add("-" + i + "," + "awb/aje" + " - " + "ddfas/asa" + " - " + "asoo/qwa");
    }
    // Shuffle the list
    dataLIST = shuffle(dataLIST);
    // Make two copies of the same shuffled list
    var copy1 = dataLIST.ToList();
    var copy2 = dataLIST.ToList();
    // Use a stopwatch for measuring time when benchmark testing
    var stopwatch = new Stopwatch();
    /*--------------------------------------------------------------------------*/
    //APPROACH 1: 2.83 seconds (Sorts correctly in descending order)
    stopwatch.Start();
    copy2 = sortLIST(copy2);
    stopwatch.Stop();
    Console.WriteLine($"sortLIST method: {stopwatch.Elapsed.TotalSeconds} seconds");
    /*--------------------------------------------------------------------------*/
    //APPROACH 2: 0.09 seconds (Sorts correctly in descending order)
    stopwatch.Restart();
    copy1 = copy1.OrderBy(i => double.Parse(i.Split(',')[0].TrimStart('-'))).ToList();
    stopwatch.Stop();
    Console.WriteLine($"OrderBy method: {stopwatch.Elapsed.TotalSeconds} seconds");
    // Ensure that the lists are sorted identically
    Console.WriteLine($"Lists are the same: {copy1.SequenceEqual(copy2)}");
}
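Output (as recorded in the code comments above): the sortLIST method took about 2.83 seconds and the OrderBy method about 0.09 seconds.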
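One plausible reason for the gap, offered as an explanation rather than something measured in this answer: OrderBy evaluates its key selector once per element, while the Comparison passed to List.Sort in sortLIST rebuilds a NumberFormatInfo and re-parses both strings on every comparison. The sketch below makes the parse-once idea explicit with Array.Sort(keys, items). The names KeyedSortSketch, LeadingNumber and SortDescending are hypothetical and not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class KeyedSortSketch
{
    // Hypothetical helper: parses the text before the first comma as a double.
    public static double LeadingNumber(string s)
    {
        int comma = s.IndexOf(',');
        string token = comma < 0 ? s : s.Substring(0, comma);
        return double.Parse(token, CultureInfo.InvariantCulture);
    }

    // Parses each key exactly once, then sorts the strings by those keys.
    public static string[] SortDescending(IEnumerable<string> source)
    {
        string[] items = source.ToArray();
        double[] keys = new double[items.Length];
        for (int i = 0; i < items.Length; i++)
        {
            // Negate so that the ascending Array.Sort yields descending numeric order.
            keys[i] = -LeadingNumber(items[i]);
        }
        Array.Sort(keys, items); // reorders items in step with keys
        return items;
    }

    public static void Main()
    {
        var data = new List<string> { "-2,awb/aje", "-0,awb/aje", "-10,awb/aje", "-1,awb/aje" };
        foreach (string line in SortDescending(data))
        {
            Console.WriteLine(line);
        }
    }
}
```

Unlike OrderBy, Array.Sort is not a stable sort, so entries with equal keys may end up in any relative order.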
Answer 1: (score: 0)
var sortedList = theLIST.OrderByDescending(s=>s);
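Note that a key selector of s => s orders by string comparison rather than by the numeric value of the first field, which is the same kind of ordering the question reports as incorrect for the plain dataLIST.Sort() approach. The sketch below contrasts the two orderings on three entries. StringComparer.Ordinal is an assumption made here so that the result does not depend on the current culture, and the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class StringVersusNumericOrder
{
    // Orders by the raw string, character by character.
    public static string[] ByString(string[] items) =>
        items.OrderByDescending(s => s, StringComparer.Ordinal).ToArray();

    // Orders by the number parsed from the first comma-separated field.
    public static string[] ByNumber(string[] items) =>
        items.OrderByDescending(s => double.Parse(s.Split(',')[0], CultureInfo.InvariantCulture)).ToArray();

    public static void Main()
    {
        string[] items = { "-1,x", "-2,x", "-10,x" };
        Console.WriteLine(string.Join(" | ", ByString(items))); // -2,x | -10,x | -1,x
        Console.WriteLine(string.Join(" | ", ByNumber(items))); // -1,x | -2,x | -10,x
    }
}
```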
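Answer 2: (score: 0)
You should optimize the inner loop as much as possible, since most of the processing time is spent there. I suggest implementing the string tokenizer yourself, because you only need the first token and the strings are very uniform. A second optimization you might make is to multiply all the numbers by -1, so that sorting the list in reverse order becomes trivial. Something like this: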
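// Requires: using System.Globalization;
private static double getNumberFromString(String s)
{
    // Find the first comma by hand; only the first token is needed.
    int posFirstComma = 0;
    while (posFirstComma < s.Length && s[posFirstComma] != ',') posFirstComma++;
    // Multiply by -1 so that a plain ascending sort yields descending order.
    return Convert.ToDouble(s.Substring(0, posFirstComma), CultureInfo.InvariantCulture) * (-1);
}
// A Comparison must return an int, so use CompareTo instead of subtracting doubles.
myData.Sort((a, b) => getNumberFromString(a).CompareTo(getNumberFromString(b)));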
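Personally, I would not touch the sorting algorithm in the library itself, since it is already thoroughly optimized. Just optimize everything that runs inside the loop.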
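As a variation on the hand-written tokenizer idea, the sketch below parses the first token without allocating a substring. It assumes a runtime where double.Parse accepts a ReadOnlySpan<char> (.NET Core 2.1 or later), and the names FirstTokenSketch, FirstNumber and SortDescending are hypothetical. The library sort is left untouched; only the work done per comparison changes.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class FirstTokenSketch
{
    // Hypothetical helper: reads only the characters before the first comma.
    public static double FirstNumber(string s)
    {
        int comma = s.IndexOf(',');
        ReadOnlySpan<char> token = comma < 0 ? s.AsSpan() : s.AsSpan(0, comma);
        return double.Parse(token, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    public static void SortDescending(List<string> list)
    {
        // Swapping the operands (b before a) gives descending order.
        list.Sort((a, b) => FirstNumber(b).CompareTo(FirstNumber(a)));
    }

    public static void Main()
    {
        var data = new List<string> { "-2,awb/aje", "-0,awb/aje", "-10,awb/aje" };
        SortDescending(data);
        foreach (string line in data)
        {
            Console.WriteLine(line);
        }
    }
}
```

This still parses both strings on every comparison, so whether it beats a parse-once approach on 3 million elements would need to be measured.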