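Preface: I am only asking this because I do not have an environment (a large enough dataset plus enough computing power) to test it reliably.
Question: Given a ConcurrentBag<T> loaded with billions of items and accessed by a single thread, does it perform similarly to a List<T>? In other words, is enumerating a ConcurrentBag<T> more or less performant than enumerating a List<T>?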
Answer 0 (score: 4)
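ConcurrentBag<T> will inevitably be less performant than List<T>. Although you will only be accessing it from a single thread, the structure still needs mechanisms in place to protect against race hazards should concurrent access arise.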
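If you will be loading the collection from a single thread before you start enumerating, you can avoid that performance overhead by using the ConcurrentBag<T>(IEnumerable<T>) constructor, rather than adding each item individually through its Add method.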
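A minimal sketch of the bulk-load approach described above. The class name BulkLoadSketch, the Load helper, and the Enumerable.Range data source are hypothetical and stand in for whatever sequence you already have; only the ConcurrentBag<T>(IEnumerable<T>) constructor comes from the answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class BulkLoadSketch
{
    // Builds the bag in one step from an existing sequence
    // instead of calling Add once per item.
    public static ConcurrentBag<long> Load(int count)
    {
        // Hypothetical data source; any IEnumerable<long> would do.
        var source = Enumerable.Range(0, count).Select(i => (long)i);
        return new ConcurrentBag<long>(source);
    }

    public static void Main()
    {
        ConcurrentBag<long> bag = Load(1000000);
        Console.WriteLine(bag.Count); // 1000000
    }
}
```

How much this saves compared with a loop of Add calls depends on the runtime version, so it is worth measuring on your own target.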
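ConcurrentBag<T> provides "moment-in-time snapshot" semantics for enumeration; see the remarks for its GetEnumerator method. When you iterate over a ConcurrentBag<T> in a foreach loop, it first copies its entire contents into a plain List<T> and then enumerates that copy. This incurs a substantial overhead, in both computation and memory, every time you use it in a loop.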
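The snapshot behaviour can be observed directly. In this small sketch (the class and method names are made up for illustration) items are added to the bag while it is being enumerated. The loop still sees only the items that were present when enumeration began, whereas doing the same with a List<T> would throw an InvalidOperationException.

```csharp
using System;
using System.Collections.Concurrent;

public static class SnapshotSketch
{
    // Enumerates the bag while adding to it and returns how many
    // items the loop actually visited.
    public static int CountSeenWhileAdding(ConcurrentBag<int> bag)
    {
        int seen = 0;
        foreach (int item in bag)
        {
            bag.Add(item + 100); // not reflected in the running enumeration
            seen++;
        }
        return seen;
    }

    public static void Main()
    {
        var bag = new ConcurrentBag<int>(new[] { 1, 2, 3 });
        Console.WriteLine(CountSeenWhileAdding(bag)); // 3
        Console.WriteLine(bag.Count);                 // 6
    }
}
```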
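If your scenario is that the collection will be populated by multiple threads but then read by only one thread, you should convert it to a List<T> as soon as the writers are done.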
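One possible shape for that scenario, as a sketch only: Parallel.For plays the role of the writer threads, and the names FillThenReadSketch and FillInParallel are hypothetical. The bag is copied into a List<T> exactly once, after all writers have finished, and every later read goes through the list.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FillThenReadSketch
{
    public static List<int> FillInParallel(int count)
    {
        var bag = new ConcurrentBag<int>();

        // Several threads add concurrently; Parallel.For returns
        // only after every iteration has completed.
        Parallel.For(0, count, i => bag.Add(i));

        // Single copy into a plain list once the writers are done.
        return bag.ToList();
    }

    public static void Main()
    {
        List<int> list = FillInParallel(100000);

        // Single-threaded reads now enumerate the list, not the bag.
        long sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        Console.WriteLine(sum); // 4999950000
    }
}
```

The order of items in the resulting list is not defined, since a bag is unordered.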
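Answer 1 (score: 3)
Billions of items in a List or a ConcurrentBag? That is a no-go.
As for performance, try this test of adding items (feel free to modify it to test other operations):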
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace ConcurrentBagTest
{
    // You must compile this for x64 or you will get OutOfMemory exception
    class Program
    {
        static void Main(string[] args)
        {
            ListTest(10000000);
            ListTest(100000000);
            ListTest(1000000000);
            ConcurrentBagTest(10000000);
            ConcurrentBagTest(100000000);
            Console.ReadKey();
        }

        // Times single-threaded Add calls on a ConcurrentBag<long>.
        static void ConcurrentBagTest(long count)
        {
            try
            {
                var bag = new ConcurrentBag<long>();
                Console.WriteLine($"--- ConcurrentBagTest count = {count}");
                Console.WriteLine($"I will use {(count * sizeof(long)) / Math.Pow(1024, 2)} MiB of RAM");
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Start();
                for (long i = 0; i < count; i++)
                {
                    bag.Add(i);
                }
                stopwatch.Stop();
                // LongCount() runs after the stopwatch stops, so it is not timed.
                Console.WriteLine($"Inserted {bag.LongCount()} items in {stopwatch.Elapsed.TotalSeconds} s");
                Console.WriteLine();
                Console.WriteLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            // Release the memory before the next test runs.
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }

        // Times Add calls on a List<long> for comparison.
        static void ListTest(long count)
        {
            try
            {
                var list = new List<long>();
                Console.WriteLine($"--- ListTest count = {count}");
                Console.WriteLine($"I will use {(count * sizeof(long)) / Math.Pow(1024, 2)} MiB of RAM");
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Start();
                for (long i = 0; i < count; i++)
                {
                    list.Add(i);
                }
                stopwatch.Stop();
                Console.WriteLine($"Inserted {list.LongCount()} items in {stopwatch.Elapsed.TotalSeconds} s");
                Console.WriteLine();
                Console.WriteLine();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }
            // Release the memory before the next test runs.
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }
}
My output:
--- ListTest count = 10000000
I will use 76,2939453125 MiB of RAM
Inserted 10000000 items in 0,0807315 s
--- ListTest count = 100000000
I will use 762,939453125 MiB of RAM
Inserted 100000000 items in 0,7741546 s
--- ListTest count = 1000000000
I will use 7629,39453125 MiB of RAM
System.OutOfMemoryException: Array dimensions exceeded supported range.
--- ConcurrentBagTest count = 10000000
I will use 76,2939453125 MiB of RAM
Inserted 10000000 items in 1,0744069 s
--- ConcurrentBagTest count = 100000000
I will use 762,939453125 MiB of RAM
Inserted 100000000 items in 11,3976436 s
CPU used: Intel Core i7-2600 @ 3.4 GHz
RAM used: 16 GB
See also this answer for the limitations.
Answer 2 (score: 0)
However, if you need to remove items, ConcurrentBag is significantly faster than List:
// LINQPad script: Dump() is LINQPad's output extension method.
// Restart() is used so that each section is timed on its own. Calling Start()
// on a stopwatch that is already running has no effect, so with repeated
// Start() calls the reported times accumulate across sections.
void Main()
{
    ConcurrentBag<int> bag = new ConcurrentBag<int>();
    ConcurrentStack<int> stack = new ConcurrentStack<int>();
    ConcurrentQueue<int> q = new ConcurrentQueue<int>();
    List<int> list = new List<int>();
    Stopwatch sw = new Stopwatch();
    int count = 100000;

    sw.Restart();
    for (int i = 0; i < count; i++)
    {
        bag.Add(i);
    }
    for (int i = 0; i < count; i++)
    {
        bag.TryTake(out _);
    }
    sw.Elapsed.Dump("BAG");

    sw.Restart();
    for (int i = 0; i < count; i++)
    {
        stack.Push(i);
    }
    for (int i = 0; i < count; i++)
    {
        stack.TryPop(out _);
    }
    sw.Elapsed.Dump("Stack");

    sw.Restart();
    for (int i = 0; i < count; i++)
    {
        q.Enqueue(i);
    }
    for (int i = 0; i < count; i++)
    {
        q.TryDequeue(out _);
    }
    sw.Elapsed.Dump("Q");

    sw.Restart();
    for (int i = 0; i < count; i++)
    {
        list.Add(i);
    }
    for (int i = 0; i < count; i++)
    {
        // Removing at index 0 shifts every remaining element: O(n) per call.
        list.RemoveAt(0);
    }
    sw.Elapsed.Dump("list remove at 0");

    sw.Restart();
    for (int i = 0; i < count; i++)
    {
        list.Add(i);
    }
    for (int i = 0; i < count; i++)
    {
        // Removing the last element needs no shifting: O(1) per call.
        list.RemoveAt(list.Count - 1);
    }
    sw.Elapsed.Dump("list remove at end");
}
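Results (measured with a single stopwatch that was started once and never reset, so each time is cumulative and includes all sections before it):
BAG 00:00:00.0144421
Stack 00:00:00.0341379
Q 00:00:00.0400114
list remove at 0 00:00:00.6188329
list remove at end 00:00:00.6202170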