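Update: I now think the root of my problem is not "threading", because I observe the slowdown at every point in the program. I suspect that with 2 processors my program may run slower because the two processors need to "communicate" with each other. I need to run some tests. I will try disabling one of the processors and see what happens.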
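A minimal sketch of how the one-processor test could be approximated from inside the program, without changing BIOS settings, by restricting the process to a subset of logical processors through `Process.ProcessorAffinity`. The class name `AffinityExperiment` and its methods are hypothetical, and the assumption that logical processors 0 to 11 all belong to the first socket must be checked on the actual machine.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for the "use only one processor" experiment.
static class AffinityExperiment
{
    // Builds a bitmask selecting logical processors [first, first + count).
    public static long BuildMask(int first, int count)
    {
        long mask = 0;
        for (int i = first; i < first + count; i++)
            mask |= 1L << i;
        return mask;
    }

    // Restricts the current process to the logical processors in the mask.
    // Assumption: the caller knows which logical processors map to which socket.
    public static void PinCurrentProcess(long mask)
    {
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)mask;
    }
}
```

Calling `AffinityExperiment.PinCurrentProcess(AffinityExperiment.BuildMask(0, 12))` at startup would keep all threads on the first 12 logical processors. If the timings then approach those of the single-socket server, cross-socket traffic becomes a more plausible suspect.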
====================================
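I'm not sure whether this is a C# question; it is probably more about hardware, but I think C# is the best fit.
I was using a cheap DL120 server and decided to upgrade to a more expensive 2-processor DL360p server. Unexpectedly, my C# program runs about 2 times slower on the new server, which should be several times faster.
I process FAST data for about 60 instruments. I create a separate task for each instrument: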
        BlockingCollection<OrderUpdate> updatesQuery;
        if (instrument2OrderUpdates.ContainsKey(instrument))
        {
            updatesQuery = instrument2OrderUpdates[instrument];
        }
        else
        {
            updatesQuery = new BlockingCollection<OrderUpdate>();
            instrument2OrderUpdates[instrument] = updatesQuery;
            ScheduleFastOrdersProcessing(updatesQuery);
        }
        orderUpdate.Checkpoint("updatesQuery.Add");
        updatesQuery.Add(orderUpdate);
    }
    private void ScheduleFastOrdersProcessing(BlockingCollection<OrderUpdate> updatesQuery)
    {
        // One long-running task per instrument queue.
        Task.Factory.StartNew(() =>
        {
            Instrument instrument = null;
            OrderBook orderBook = null;
            int lastRptSeqNum = -1;
            while (!updatesQuery.IsCompleted)
            {
                OrderUpdate orderUpdate;
                try
                {
                    // Blocks until an update is available.
                    orderUpdate = updatesQuery.Take();
                }
                catch (InvalidOperationException e)
                {
                    // Take() throws once the collection is marked complete;
                    // the loop condition then ends the task.
                    Log.Push(LogItemType.Error, e.Message);
                    continue;
                }
                orderUpdate.Checkpoint("received from updatesQuery.Take()");
                ......................
                ...................... // long not interesting processing code
            }
        }, TaskCreationOptions.LongRunning);
    }
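Since I have about 60 tasks that can run in parallel, I expected 2 * E5-2640 (24 virtual threads, 12 real) to run much faster than 1 * E3-1220 (4 real threads). With the DL360p I see 95 threads in Task Manager; with the DL120 I have only 55 threads.
But execution on the DL120 G7 is 2 times faster (!!). The E3-1220 has a somewhat higher clock rate than the E5-2640 (3.1 GHz vs 2.5 GHz), but I still expected my code to run faster on 2 * E5-2640 because it can be parallelized better, and I certainly did not expect it to run 2 times slower!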
DL120 G7: ~50 threads in Task Manager; best = 24, average ~80 microseconds
calling market.UpdateFastOrder = 23 updatesQuery.Add = 25 received from updatesQuery.Take() = 67 in orderbook = 80
calling market.UpdateFastOrder = 30 updatesQuery.Add = 32 received from updatesQuery.Take() = 64 in orderbook = 73
calling market.UpdateFastOrder = 31 updatesQuery.Add = 32 received from updatesQuery.Take() = 195 in orderbook = 204
calling market.UpdateFastOrder = 31 updatesQuery.Add = 32 received from updatesQuery.Take() = 74 in orderbook = 86
calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 78
calling market.UpdateFastOrder = 29 updatesQuery.Add = 32 received from updatesQuery.Take() = 76 in orderbook = 88
calling market.UpdateFastOrder = 30 updatesQuery.Add = 32 received from updatesQuery.Take() = 80 in orderbook = 92
calling market.UpdateFastOrder = 20 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 78
calling market.UpdateFastOrder = 21 updatesQuery.Add = 24 received from updatesQuery.Take() = 68 in orderbook = 81
calling market.UpdateFastOrder = 12 updatesQuery.Add = 13 received from updatesQuery.Take() = 58 in orderbook = 72
calling market.UpdateFastOrder = 22 updatesQuery.Add = 23 received from updatesQuery.Take() = 51 in orderbook = 59
calling market.UpdateFastOrder = 16 updatesQuery.Add = 16 received from updatesQuery.Take() = 20 in orderbook = 24
calling market.UpdateFastOrder = 28 updatesQuery.Add = 31 received from updatesQuery.Take() = 82 in orderbook = 94
calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 65 in orderbook = 77
calling market.UpdateFastOrder = 29 updatesQuery.Add = 29 received from updatesQuery.Take() = 259 in orderbook = 264
calling market.UpdateFastOrder = 49 updatesQuery.Add = 52 received from updatesQuery.Take() = 99 in orderbook = 113
calling market.UpdateFastOrder = 22 updatesQuery.Add = 23 received from updatesQuery.Take() = 50 in orderbook = 60
calling market.UpdateFastOrder = 29 updatesQuery.Add = 32 received from updatesQuery.Take() = 76 in orderbook = 88
calling market.UpdateFastOrder = 16 updatesQuery.Add = 19 received from updatesQuery.Take() = 63 in orderbook = 75
calling market.UpdateFastOrder = 27 updatesQuery.Add = 27 received from updatesQuery.Take() = 226 in orderbook = 231
calling market.UpdateFastOrder = 15 updatesQuery.Add = 16 received from updatesQuery.Take() = 35 in orderbook = 42
calling market.UpdateFastOrder = 18 updatesQuery.Add = 21 received from updatesQuery.Take() = 66 in orderbook = 78
DL360p Gen8: ~95 threads in Task Manager; best = 40, average ~150 microseconds
calling market.UpdateFastOrder = 62 updatesQuery.Add = 64 received from updatesQuery.Take() = 144 in orderbook = 205
calling market.UpdateFastOrder = 27 updatesQuery.Add = 32 received from updatesQuery.Take() = 101 in orderbook = 154
calling market.UpdateFastOrder = 45 updatesQuery.Add = 50 received from updatesQuery.Take() = 124 in orderbook = 187
calling market.UpdateFastOrder = 46 updatesQuery.Add = 51 received from updatesQuery.Take() = 127 in orderbook = 162
calling market.UpdateFastOrder = 63 updatesQuery.Add = 68 received from updatesQuery.Take() = 137 in orderbook = 174
calling market.UpdateFastOrder = 53 updatesQuery.Add = 55 received from updatesQuery.Take() = 133 in orderbook = 171
calling market.UpdateFastOrder = 44 updatesQuery.Add = 46 received from updatesQuery.Take() = 131 in orderbook = 158
calling market.UpdateFastOrder = 37 updatesQuery.Add = 39 received from updatesQuery.Take() = 102 in orderbook = 140
calling market.UpdateFastOrder = 45 updatesQuery.Add = 50 received from updatesQuery.Take() = 115 in orderbook = 154
calling market.UpdateFastOrder = 50 updatesQuery.Add = 55 received from updatesQuery.Take() = 133 in orderbook = 160
calling market.UpdateFastOrder = 26 updatesQuery.Add = 50 received from updatesQuery.Take() = 99 in orderbook = 111
calling market.UpdateFastOrder = 14 updatesQuery.Add = 30 received from updatesQuery.Take() = 36 in orderbook = 40 <-- best one I can find among thousands
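To see where the time goes, the checkpoint values on each line can be turned into per-stage deltas. The sketch below assumes the four numbers are cumulative microsecond timestamps for one update (an interpretation of the logs, not something the logs state explicitly); the class `CheckpointStats` and its method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper that splits one checkpoint log line into stage timings.
static class CheckpointStats
{
    // Extracts the four checkpoint values from a line such as
    // "calling market.UpdateFastOrder = 23 updatesQuery.Add = 25 ...".
    public static int[] Parse(string line)
    {
        return Regex.Matches(line, @"= (\d+)")
                    .Cast<Match>()
                    .Select(m => int.Parse(m.Groups[1].Value))
                    .ToArray();
    }

    // Time between updatesQuery.Add and the consumer returning from Take().
    public static int HandoffMicros(string line)
    {
        int[] v = Parse(line);
        return v[2] - v[1];
    }

    // Time between Take() returning and the "in orderbook" checkpoint.
    public static int ProcessingMicros(string line)
    {
        int[] v = Parse(line);
        return v[3] - v[2];
    }
}
```

For the first line of the first listing (23, 25, 67, 80) this gives a handoff of 42 and a processing time of 13. For the first line of the second listing (62, 64, 144, 205) it gives 80 and 61. In these two samples both the queue handoff and the processing after it are slower in the second listing.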
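Can you see why my program runs 2 times slower on a server that is several times faster? Perhaps I should not create ~60 tasks? Perhaps I should tell .NET not to use 95 threads and limit it to 50 or even 24? Perhaps this is a 2-processor vs 1-processor configuration issue? Perhaps simply disabling one of the processors on my DL360p Gen8 would speed up the program significantly?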
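One way to try a smaller thread count is to replace the task-per-instrument layout with a fixed number of worker queues and route each instrument to a worker by hash, which keeps updates for one instrument in order. This is only a sketch under assumptions: `PartitionedProcessor` is a hypothetical class, and the worker count (for example the number of physical cores) is a value to tune by measurement, not a recommendation.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper: routes items to a fixed number of worker queues by key,
// so all items for one key are handled by the same worker, in arrival order.
sealed class PartitionedProcessor<TKey, TItem> : IDisposable
{
    private readonly BlockingCollection<TItem>[] queues;
    private readonly Task[] workers;

    public PartitionedProcessor(int workerCount, Action<TItem> handler)
    {
        queues = new BlockingCollection<TItem>[workerCount];
        workers = new Task[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            var queue = queues[i] = new BlockingCollection<TItem>();
            workers[i] = Task.Factory.StartNew(() =>
            {
                // Blocks while the queue is empty and ends after CompleteAdding,
                // so no exception handling around Take() is needed.
                foreach (var item in queue.GetConsumingEnumerable())
                    handler(item);
            }, TaskCreationOptions.LongRunning);
        }
    }

    // Picks the worker queue from the key's hash code.
    public void Add(TKey key, TItem item)
    {
        int index = (key.GetHashCode() & int.MaxValue) % queues.Length;
        queues[index].Add(item);
    }

    // Stops accepting items and waits until the workers drain their queues.
    public void Dispose()
    {
        foreach (var q in queues) q.CompleteAdding();
        Task.WaitAll(workers);
        foreach (var q in queues) q.Dispose();
    }
}
```

With this layout the number of threads is chosen explicitly, so runs with 4, 12, or 24 workers can be compared directly on both servers.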
Answer 0 (score: 0)
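Just because you have a system that can handle more threads does not mean that all of those threads can actually be processed fully in parallel.
When I upgraded from a quad-core CPU to an i7 (8 virtual cores), I noticed that setups using more threads than cores caused the threads to block each other for a while, which slowed down the system as a whole.
The problem was simply that my algorithms were already able to use the full processing time of the core their thread was running on, while the waiting threads only got to work about 5 to 10% of the time. As a result, the main threads finished, but the threads that had been starved of CPU time still had to do all of their work (costing the same amount of time again).
The thread pool only continues once all workers have finished, so the total time to completion was extended by the processor time the other threads had not yet received.
Maybe you just need to find the optimal number of threads.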
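A rough way to look for that number is to time the same batch of work at several degrees of parallelism. The sketch below uses `Parallel.For` with `ParallelOptions.MaxDegreeOfParallelism`; the class `ThreadCountProbe` is hypothetical, and a CPU-bound batch like this only approximates a workload built on long-running queue consumers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical probe: times one workload at several parallelism limits.
static class ThreadCountProbe
{
    // Runs `work` for every index in [0, items) once per candidate degree of
    // parallelism and returns the elapsed time for each candidate.
    public static Dictionary<int, TimeSpan> Measure(
        IEnumerable<int> candidates, int items, Action<int> work)
    {
        var results = new Dictionary<int, TimeSpan>();
        foreach (int degree in candidates)
        {
            var options = new ParallelOptions { MaxDegreeOfParallelism = degree };
            var sw = Stopwatch.StartNew();
            Parallel.For(0, items, options, work);
            sw.Stop();
            results[degree] = sw.Elapsed;
        }
        return results;
    }
}
```

Passing candidates such as 4, 12, 24, and 60 with a delegate that replays recorded updates for one instrument would show whether the elapsed time stops improving, or gets worse, beyond the number of physical cores.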