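Besides using IDataStreamer and IBinaryObject, what else can I do to reduce insert time in Apache Ignite.NET? Is a significant performance gain still possible, or have I already reached the best achievable result?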
I am currently using:
IBinaryObject / WithKeepBinary
IDataStreamer
This is how I use IDataStreamer:
using (var ds = m_ignite.GetDataStreamer<string, IBinaryObject>(CacheName)) {
    foreach (var binaryRow in rows.Select(r => BuildRow(r))) {
        var key = binaryRow.GetField<string>(PrimaryKeyName);
        ds.AddData(key, binaryRow);
    }
}
Performance results (all 5 nodes have identical specs):
BenchmarkDotNet=v0.10.8, OS=Windows 8.1 (6.3.9600)
Processor=Intel Xeon CPU E5-2698 v4 2.20GHz Intel Xeon CPU E5-2698 v4 2.20GHz, ProcessorCount=4
Frequency=14318180 Hz, Resolution=69.8413 ns, Timer=HPET
[Host] : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2053.0
Job-UZDKMF : Clr 4.0.30319.42000, 64bit RyuJIT-v4.7.2053.0
RunStrategy=Monitoring TargetCount=1
NumRows      Mean (ms)       Per Row (ms/row)
10           359.50*         35.95*
100          465.50*         4.66*
1,000        797.80*         0.80*
10,000       4,479.80        0.45
100,000      37,611.60       0.38
500,000      184,640.00      0.37
1,000,000    366,801.40      0.37
2,000,000    732,562.40      0.37
4,000,000    1,458,913.60    0.36
*Measurement is larger because it also measures some lightweight work before inserting the rows
Any hints, tips, or pointers to documentation would be greatly appreciated. Thanks!
Answer 0 (score: 2)
Don't call GetField to retrieve the key. Return the key directly from BuildRow instead (i.e. have it return a KeyValuePair<string, IBinaryObject>).
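As a minimal sketch of that first suggestion, BuildRow could return the key together with the binary value, so the key never has to be read back out of the IBinaryObject. The SourceRow class, the binary type name "Row", and the field names "Id" and "Name" are hypothetical placeholders, since the question does not show the real row type or BuildRow implementation.

```csharp
using System.Collections.Generic;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Binary;

// Hypothetical source row; the real row type is not shown in the question.
public sealed class SourceRow
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public sealed class RowBuilder
{
    private readonly IBinary _binary;

    public RowBuilder(IIgnite ignite)
    {
        _binary = ignite.GetBinary();
    }

    // Returns the key alongside the value, so the caller does not need
    // binaryRow.GetField<string>(PrimaryKeyName) to recover the key.
    public KeyValuePair<string, IBinaryObject> BuildRow(SourceRow r)
    {
        IBinaryObject value = _binary.GetBuilder("Row") // assumed type name
            .SetField("Id", r.Id)                       // assumed field names
            .SetField("Name", r.Name)
            .Build();

        return new KeyValuePair<string, IBinaryObject>(r.Id, value);
    }
}
```

The streamer can then take the pair as-is with ds.AddData(pair).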
Insert in parallel (including the BuildRow calls):
Parallel.ForEach(rows, r =>
{
    KeyValuePair<string, IBinaryObject> pair = BuildRow(r);
    ds.AddData(pair);
});
Run more Ignite nodes on more machines.
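If the rows come from an external data source, you can have each Ignite node load only the part relevant to it. To do this, run the DataStreamer on every node via ICompute.Broadcast and, while iterating over the rows, check whether each key belongs to the current node: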
ICacheAffinity aff = m_ignite.GetAffinity(cacheName);
IClusterNode localNode = m_ignite.GetCluster().GetLocalNode();
Parallel.ForEach(rows, r =>
{
    string key = GetKey(r);
    // Only stream rows for which the local node is the primary owner.
    if (aff.IsPrimary(localNode, key))
    {
        KeyValuePair<string, IBinaryObject> pair = BuildRow(r);
        ds.AddData(pair);
    }
});
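To show how the affinity check might sit inside a broadcast job, here is a rough sketch of an IComputeAction that each node executes locally. The class name LoadLocalRowsAction, the helpers ReadRows, GetKey, and BuildRow, and the string[] row shape are all assumptions made for illustration. The answer does not show how rows are read on each node, so ReadRows is left as an empty placeholder.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Binary;
using Apache.Ignite.Core.Cache;
using Apache.Ignite.Core.Cluster;
using Apache.Ignite.Core.Compute;
using Apache.Ignite.Core.Resource;

[Serializable]
public class LoadLocalRowsAction : IComputeAction
{
    // Injected by Ignite on whichever node executes the action.
    [InstanceResource] private IIgnite _ignite;

    private readonly string _cacheName;

    public LoadLocalRowsAction(string cacheName)
    {
        _cacheName = cacheName;
    }

    public void Invoke()
    {
        ICacheAffinity aff = _ignite.GetAffinity(_cacheName);
        IClusterNode localNode = _ignite.GetCluster().GetLocalNode();

        using (var ds = _ignite.GetDataStreamer<string, IBinaryObject>(_cacheName))
        {
            Parallel.ForEach(ReadRows(), r =>
            {
                string key = GetKey(r);
                // Skip rows whose primary copy lives on another node.
                if (aff.IsPrimary(localNode, key))
                {
                    ds.AddData(BuildRow(r));
                }
            });
        }
    }

    // Hypothetical placeholder: each node reads the external source itself.
    private static IEnumerable<string[]> ReadRows()
    {
        yield break;
    }

    // Hypothetical: the first column is assumed to hold the primary key.
    private static string GetKey(string[] r)
    {
        return r[0];
    }

    // Hypothetical: binary type name "Row" and field name "Id" are assumed.
    private KeyValuePair<string, IBinaryObject> BuildRow(string[] r)
    {
        IBinaryObject value = _ignite.GetBinary().GetBuilder("Row")
            .SetField("Id", r[0])
            .Build();
        return new KeyValuePair<string, IBinaryObject>(r[0], value);
    }
}
```

Under these assumptions, the action would be started from any node with m_ignite.GetCompute().Broadcast(new LoadLocalRowsAction(CacheName)). Every node still scans the full source, but each one builds and streams only the rows it owns, so the BuildRow and streaming work is spread across the cluster.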