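Using LINQ, is there a faster alternative to the Where() method with List<T>.Contains() in the predicate that gives exactly the same results?
Here is an example:
List<int> a = ...
List<int> b = ...
var result = a.Where(x => b.Contains(x)); // very slow
One alternative I found is to use the Intersect() method:
var result = a.Intersect(b);
In the result variable, the order of the values from a is preserved. However, it does not give exactly the same results if a contains duplicates, because the Intersect() operator returns only distinct values.
Another way:
var result = a.Join(b, x => x, y => y, (x, y) => x);
Again, the results differ if b contains duplicates.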
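To make the differences concrete, here is a minimal sketch that runs the three queries from the question on a small sample. The sample values are made up for illustration and are not from the original post; the sketch uses C# top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data (an assumption, not from the post): both lists contain duplicates.
var a = new List<int> { 3, 1, 3, 7, 2 };
var b = new List<int> { 3, 3, 2, 9 };

// Baseline: keeps every element of a that occurs in b, in order, duplicates included.
var viaWhere = a.Where(x => b.Contains(x)).ToList();

// Intersect has set semantics, so the second 3 from a is dropped.
var viaIntersect = a.Intersect(b).ToList();

// Join emits one element per matching (a, b) pair, so duplicates in b multiply the matches.
var viaJoin = a.Join(b, x => x, y => y, (x, y) => x).ToList();

Console.WriteLine(string.Join(", ", viaWhere));     // 3, 3, 2
Console.WriteLine(string.Join(", ", viaIntersect)); // 3, 2
Console.WriteLine(string.Join(", ", viaJoin));      // 3, 3, 3, 3, 2
```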
Is there another possibility?
Things I want to avoid:
creating a separate HashSet and calling its Contains() inside Where().
Answer 0 (score: 3)
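Semantically, what you want is a left semi-join: keep each element of a, once per occurrence, if it has at least one match in b. The LINQ Join operator performs an inner join, which is close but not quite the same. Fortunately, you can use GroupJoin, which collects the matches for each element of a into a group, and then keep only the elements whose group is not empty.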
var query = from n in a
            join k in b
            on n equals k into matches
            where matches.Any()
            select n;
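For readers who prefer method syntax, the query above can be sketched as an explicit GroupJoin call. The sample lists and the anonymous-type property names are illustrative assumptions; the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data (an assumption, not from the post).
var a = new List<int> { 3, 1, 3, 7, 2 };
var b = new List<int> { 3, 3, 2, 9 };

// GroupJoin pairs each element of a with the (possibly empty) group of equal elements in b.
var query = a
    .GroupJoin(b, n => n, k => k, (n, matches) => new { Value = n, Matches = matches })
    .Where(pair => pair.Matches.Any())   // keep elements with at least one match
    .Select(pair => pair.Value);

var result = query.ToList();
Console.WriteLine(string.Join(", ", result)); // 3, 3, 2
```

Each element of a appears at most once per occurrence no matter how many times it occurs in b, which is why this matches the Where/Contains result.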
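Another option is to put the items of the second sequence into a HashSet, which can be searched far more efficiently than a List (an expected constant-time lookup instead of a linear scan). (This is similar to what Join/GroupJoin do internally.)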
var set = new HashSet<int>(b);
var query = a.Where(n => set.Contains(n));
Another option is to use Join as you did, but first remove all duplicates from b, because without duplicates in b it does exactly what you want:
var result = a.Join(b.Distinct(), x => x, y => y, (x, y) => x);
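As a quick sanity check, the sketch below compares the three alternatives against the original Where/Contains query on a small sample that has duplicates in both lists. The sample values and variable names are illustrative assumptions; the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data (an assumption, not from the post).
var a = new List<int> { 3, 1, 3, 7, 2 };
var b = new List<int> { 3, 3, 2, 9 };

// Reference result: the slow query from the question.
var baseline = a.Where(x => b.Contains(x)).ToList();

// Alternative 1: GroupJoin, keeping elements whose match group is not empty.
var viaGroupJoin = (from n in a
                    join k in b on n equals k into matches
                    where matches.Any()
                    select n).ToList();

// Alternative 2: HashSet lookup inside Where.
var set = new HashSet<int>(b);
var viaHashSet = a.Where(n => set.Contains(n)).ToList();

// Alternative 3: Join against the de-duplicated second list.
var viaJoinDistinct = a.Join(b.Distinct(), x => x, y => y, (x, y) => x).ToList();

Console.WriteLine(baseline.SequenceEqual(viaGroupJoin));    // True
Console.WriteLine(baseline.SequenceEqual(viaHashSet));      // True
Console.WriteLine(baseline.SequenceEqual(viaJoinDistinct)); // True
```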
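Answer 1 (score: 0)
For speed, and to keep duplicates, I would use a traditional "for" loop.
Edit
I wrote some test code, taking two things into account:
the methods return IEnumerable<int>;
a LINQ result can be converted to a better data structure, such as List<int>.
The results are as follows: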
1 uses per result
Tigrou-Where : count= 93, 3.167,0ms
Tigrou-Intersect : count= 89, 116,0ms
Tigrou-Join : count= 96, 179,0ms
Servy-GroupJoin : count= 93, 262,0ms
Servy-HashSet : count= 93, 71,0ms
Servy-JoinDisctinct : count= 93, 212,0ms
JoseH-TheOldFor : count= 93, 72,0ms
2 uses per result
Tigrou-Where : count= 93, 6.007,0ms
Tigrou-Intersect : count= 89, 182,0ms
Tigrou-Join : count= 96, 293,0ms
Servy-GroupJoin : count= 93, 455,0ms
Servy-HashSet : count= 93, 99,0ms
Servy-JoinDisctinct : count= 93, 407,0ms
JoseH-TheOldFor : count= 93, 73,0ms
4 uses per result
Tigrou-Where : count= 93, 11.866,0ms
Tigrou-Intersect : count= 89, 353,0ms
Tigrou-Join : count= 96, 565,0ms
Servy-GroupJoin : count= 93, 899,0ms
Servy-HashSet : count= 93, 165,0ms
Servy-JoinDisctinct : count= 93, 786,0ms
JoseH-TheOldFor : count= 93, 73,0ms
8 uses per result
Tigrou-Where : count= 93, 23.831,0ms
Tigrou-Intersect : count= 89, 724,0ms
Tigrou-Join : count= 96, 1.151,0ms
Servy-GroupJoin : count= 93, 1.807,0ms
Servy-HashSet : count= 93, 299,0ms
Servy-JoinDisctinct : count= 93, 1.570,0ms
JoseH-TheOldFor : count= 93, 81,0ms
The code is:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Two lists of 1000 random values in [0, 9999), so both may contain duplicates.
        Random random = new Random(Environment.TickCount);
        var cases = 1000;
        List<int> a = new List<int>(cases);
        List<int> b = new List<int>(cases);
        for (int c = 0; c < cases; c++)
        {
            a.Add(random.Next(9999));
            b.Add(random.Next(9999));
        }
        var times = 100;
        var usesCount = 1;
        Console.WriteLine("{0} times", times);
        // Run every method with 1, 2, 4 and 8 enumerations of each result.
        for (int u = 0; u < 4; u++)
        {
            Console.WriteLine();
            Console.WriteLine("{0} uses per result", usesCount);
            TestMethod(a, b, "Tigrou-Where", Where, times, usesCount);
            TestMethod(a, b, "Tigrou-Intersect", Intersect, times, usesCount);
            TestMethod(a, b, "Tigrou-Join", Join, times, usesCount);
            TestMethod(a, b, "Servy-GroupJoin", GroupJoin, times, usesCount);
            TestMethod(a, b, "Servy-HashSet", HashSet, times, usesCount);
            TestMethod(a, b, "Servy-JoinDisctinct", JoinDistinct, times, usesCount);
            TestMethod(a, b, "JoseH-TheOldFor", TheOldFor, times, usesCount);
            usesCount *= 2;
        }
        Console.ReadLine();
    }

    private static void TestMethod(List<int> a, List<int> b, string name, Func<List<int>, List<int>, IEnumerable<int>> method, int times, int usesCount)
    {
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        int count = 0;
        for (int t = 0; t < times; t++)
        {
            // Process
            var result = method(a, b);
            // Count: every method except TheOldFor returns a lazily evaluated
            // query, so each pass of this loop re-runs that query.
            for (int u = 0; u < usesCount; u++)
            {
                count = 0;
                foreach (var item in result)
                {
                    count++;
                }
            }
        }
        stopwatch.Stop();
        Console.WriteLine("{0,-20}: count={1,4}, {2,8:N1}ms",
            name, count, stopwatch.ElapsedMilliseconds);
    }

    private static IEnumerable<int> Where(List<int> a, List<int> b)
    {
        return a.Where(x => b.Contains(x));
    }

    private static IEnumerable<int> Intersect(List<int> a, List<int> b)
    {
        return a.Intersect(b);
    }

    private static IEnumerable<int> Join(List<int> a, List<int> b)
    {
        return a.Join(b, x => x, y => y, (x, y) => x);
    }

    private static IEnumerable<int> GroupJoin(List<int> a, List<int> b)
    {
        return from n in a
               join k in b
               on n equals k into matches
               where matches.Any()
               select n;
    }

    private static IEnumerable<int> HashSet(List<int> a, List<int> b)
    {
        var set = new HashSet<int>(b);
        return a.Where(n => set.Contains(n));
    }

    private static IEnumerable<int> JoinDistinct(List<int> a, List<int> b)
    {
        return a.Join(b.Distinct(), x => x, y => y, (x, y) => x);
    }

    private static IEnumerable<int> TheOldFor(List<int> a, List<int> b)
    {
        // Builds the result eagerly, so repeated enumeration only walks a List<int>.
        var result = new List<int>();
        int countA = a.Count;
        var setB = new HashSet<int>(b);
        for (int loopA = 0; loopA < countA; loopA++)
        {
            var itemA = a[loopA];
            if (setB.Contains(itemA))
                result.Add(itemA);
        }
        return result;
    }
}
Changing one line of the code, so that the result is converted to a List<int> before it is used, gives these results for 8 uses:
8 uses per result
Tigrou-Where : count= 97, 2.974,0ms
Tigrou-Intersect : count= 91, 91,0ms
Tigrou-Join : count= 105, 150,0ms
Servy-GroupJoin : count= 97, 224,0ms
Servy-HashSet : count= 97, 74,0ms
Servy-JoinDisctinct : count= 97, 223,0ms
JoseH-TheOldFor : count= 97, 75,0ms
So I think the winner is the Servy-HashSet method, with one small variation:
var set = new HashSet<int>(b);
var result = a.Where(n => set.Contains(n)).ToList();
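The reason ToList() helps when a result is used more than once is deferred execution: a Where query re-runs its predicate on every enumeration, while a list is filled once. The sketch below counts predicate calls to show this; the sample values and the helper names InSet and CountItems are hypothetical, and the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data (an assumption, not from the post).
var a = new List<int> { 3, 1, 3, 7, 2 };
var set = new HashSet<int>(new[] { 3, 3, 2, 9 });

int predicateCalls = 0;

// Hypothetical helper: the HashSet lookup, instrumented to count its invocations.
bool InSet(int n) { predicateCalls++; return set.Contains(n); }

// Hypothetical helper: enumerate a sequence once, like one "use" in the benchmark.
int CountItems(IEnumerable<int> items)
{
    int count = 0;
    foreach (var item in items) count++;
    return count;
}

// Deferred query: enumerating it twice runs the predicate over a twice.
IEnumerable<int> lazy = a.Where(InSet);
CountItems(lazy);
CountItems(lazy);
int callsLazy = predicateCalls;          // 10 = 5 elements x 2 enumerations

// Materialized query: the predicate runs once, inside ToList().
predicateCalls = 0;
List<int> materialized = a.Where(InSet).ToList();
CountItems(materialized);
CountItems(materialized);
int callsMaterialized = predicateCalls;  // 5

Console.WriteLine("{0} vs {1}", callsLazy, callsMaterialized); // 10 vs 5
```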