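I am processing a large Excel file (10k records) and need this process to run on multiple threads to improve performance.
Currently I check whether the row count is <= 2000, in which case Utils.IxGenerateWithData runs over all records. But if there are more than 2000 rows (e.g. 10k), I want to split them across multiple threads, each calling Utils.IxGenerateWithData with 2000 records.
Please help.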
using (Stream contentStream = await requestContent.ReadAsStreamAsync())
{
    Workbook workbook = new Workbook(contentStream);
    Worksheet worksheet = workbook.Worksheets[0];
    int column = 0; // first column
    Cell lastCell = worksheet.Cells.EndCellInColumn((short)column);
    // Run on multiple threads if the file has more than 2000 records
    if (lastCell.Row > 2000)
    {
        // Not sure what to do here
        // Infiniti GenerateWithData Web Service
        Thread thread = new Thread(() => Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL));
        thread.Start();
    }
    else
    {
        for (int row = 0; row <= lastCell.Row; row++)
        {
            Cell cell = worksheet.Cells.GetCell(row, column);
            xmlContent += cell.StringValueWithoutFormat;
        }
        // Infiniti GenerateWithData Web Service
        Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL);
    }
}
Answer 0 (score: 0)
A good start is to determine how many threads you want to launch. If you run 2000 rows on each thread, threadCount is calculated as follows:
var threadCount = (lastCell.Row / 2000) + 1;
The 1 is added to ensure that a thread never gets more than 2000 rows, although it may get fewer.
Then calculate rowsPerThread as follows:
var rowsPerThread = lastCell.Row / threadCount;
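Both formulas use integer division, so some rows can be left over. For example, with lastCell.Row = 9999 they give 5 threads of 1999 rows each, which covers 9995 of the 10000 rows. A minimal sketch of one way to compute explicit row ranges that cover every row is shown below. RowChunker and Split are hypothetical names introduced for illustration, and the sketch assumes lastCell.Row is a zero-based index of the last row.

```csharp
using System;
using System.Collections.Generic;

public static class RowChunker
{
    // Splits rows 0..lastRowIndex (inclusive) into consecutive ranges
    // of at most maxRowsPerChunk rows each. Hypothetical helper.
    public static List<(int Start, int End)> Split(int lastRowIndex, int maxRowsPerChunk)
    {
        if (lastRowIndex < 0) throw new ArgumentOutOfRangeException(nameof(lastRowIndex));
        if (maxRowsPerChunk <= 0) throw new ArgumentOutOfRangeException(nameof(maxRowsPerChunk));

        var ranges = new List<(int Start, int End)>();
        for (int start = 0; start <= lastRowIndex; start += maxRowsPerChunk)
        {
            // The last range is clamped so it never runs past the final row.
            int end = Math.Min(start + maxRowsPerChunk - 1, lastRowIndex);
            ranges.Add((start, end));
        }
        return ranges;
    }
}
```

Calling RowChunker.Split(lastCell.Row, 2000) would then give one (Start, End) pair per thread, with only the final range possibly shorter than 2000 rows.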
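Finally, use a for loop that starts the threads, passing each one the array of rows it should process. Here I would create a class that is instantiated inside the for loop, with the rows to process passed to its constructor. It then has a Start method that launches a thread to process the rows held by the object.
An outline of such a class looks like this: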
public class ExcelRowProcessor
{
    // Rows assigned to this processor; ExcelRow is a placeholder type for one row of data.
    private readonly List<ExcelRow> _rows = new List<ExcelRow>();

    public ExcelRowProcessor(IEnumerable<ExcelRow> rows)
    {
        _rows.AddRange(rows);
    }

    public void Start()
    {
        // Start the thread here.
    }
}
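The outline above leaves Start empty. As a rough sketch of how it could be filled in, the class below holds its rows as strings, concatenates them on a dedicated thread, and hands the result to a callback. RowBatchProcessor, the process callback, and Join are hypothetical names introduced for illustration; in the question the callback would wrap the call to Utils.IxGenerateWithData.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

public class RowBatchProcessor
{
    private readonly List<string> _rows;
    private readonly Action<string> _process;
    private Thread _thread;

    // rows: the cell values for this batch; process: what to do with the joined payload.
    public RowBatchProcessor(IEnumerable<string> rows, Action<string> process)
    {
        _rows = new List<string>(rows);
        _process = process ?? throw new ArgumentNullException(nameof(process));
    }

    // Launches a dedicated thread that builds the payload and processes it.
    public void Start()
    {
        _thread = new Thread(() =>
        {
            var payload = new StringBuilder();
            foreach (var row in _rows)
            {
                payload.Append(row);
            }
            _process(payload.ToString());
        });
        _thread.Start();
    }

    // Blocks until the thread started by Start has finished (no-op if never started).
    public void Join() => _thread?.Join();
}
```

The caller would create one processor per batch, call Start on each, and then call Join on each before disposing of the stream. An exception thrown inside the thread is not caught in this sketch, so real code would likely wrap the callback in a try/catch.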
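I hope this helps.
Answer 1 (score: 0)
Sorry for posting this as a new answer, but I don't yet have the reputation to comment on Jaco's.
In any case, you generally don't want to derive the number of threads from the workload or bucket size. It is better to derive the bucket size from the number of CPU cores. This avoids thread switching, and leaving one core free for the OS/virus scanner also helps.
To get the thread/core/process count, see this post: How to find the Number of CPU Cores via .NET/C#?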
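var threadCount = cpuCoreCount - 1; // TODO: obtain cpuCoreCount using the code from the post above
if (threadCount < 1)
{
    threadCount = 1; // always keep at least one worker
}
// As Jaco posted; clamped to at least 1 so the modulo check further down never divides by zero
var rowsPerThread = Math.Max(1, lastCell.Row / threadCount);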
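One option for the core count is Environment.ProcessorCount, which reports the number of logical processors available to the process (not physical cores). The sketch below derives a worker count and a bucket size from it. WorkerPlanner and its method names are hypothetical, and the ceiling division is a choice made here so that no rows are left over.

```csharp
using System;

public static class WorkerPlanner
{
    // Leave one logical processor free, but always keep at least one worker.
    public static int WorkerCount(int logicalProcessors)
    {
        return Math.Max(1, logicalProcessors - 1);
    }

    // Ceiling division: workerCount buckets of this size cover all rows.
    public static int RowsPerWorker(int totalRows, int workerCount)
    {
        if (workerCount <= 0) throw new ArgumentOutOfRangeException(nameof(workerCount));
        return Math.Max(1, (totalRows + workerCount - 1) / workerCount);
    }
}
```

A call such as WorkerPlanner.WorkerCount(Environment.ProcessorCount) followed by WorkerPlanner.RowsPerWorker(lastCell.Row + 1, workers) would give the two values used in the loop, assuming lastCell.Row is a zero-based index.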
Back to your question about how to do the threading:
using (Stream contentStream = await requestContent.ReadAsStreamAsync())
{
    Workbook workbook = new Workbook(contentStream);
    Worksheet worksheet = workbook.Worksheets[0];
    int column = 0; // first column
    Cell lastCell = worksheet.Cells.EndCellInColumn((short)column);
    List<IAsyncResult> asyncResults = new List<IAsyncResult>();
    string xmlContent = ""; // assuming this is local
    for (int row = 0; row <= lastCell.Row; row++)
    {
        Cell cell = worksheet.Cells.GetCell(row, column);
        xmlContent += cell.StringValueWithoutFormat;
        // Dispatch a batch at each bucket boundary and once more at the last row
        if (((row > 0) && (row % rowsPerThread == 0)) || (row == lastCell.Row))
        {
            // Note: delegate BeginInvoke is only supported on .NET Framework
            var caller = new GenerateDelegate(Generate);
            asyncResults.Add(caller.BeginInvoke(xmlContent, null, null));
            xmlContent = "";
        }
    }
    // Wait for the threads
    asyncResults.ForEach(result => {
        while (result.IsCompleted == false) {
            Thread.Sleep(250);
        }
    });
}
Put this code outside the function:
private delegate void GenerateDelegate(string xmlContent);

/// <summary>
/// Calls the Infiniti GenerateWithData web service.
/// </summary>
private void Generate(string xmlContent)
{
    Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL);
}
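Delegate BeginInvoke throws PlatformNotSupportedException on .NET Core and .NET 5 or later, so on those runtimes the same fan-out and wait could be written with tasks instead. The following is a sketch under that assumption, not the approach used in the answer above. ParallelGenerator and RunAllAsync are hypothetical names, and the generate callback stands in for the Generate method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelGenerator
{
    // Runs generate once per payload on the thread pool and returns a task
    // that completes when every call has finished (or faults if any call throws).
    public static Task RunAllAsync(IEnumerable<string> payloads, Action<string> generate)
    {
        if (payloads == null) throw new ArgumentNullException(nameof(payloads));
        if (generate == null) throw new ArgumentNullException(nameof(generate));

        // ToList forces all tasks to start now instead of lazily on enumeration.
        List<Task> tasks = payloads
            .Select(payload => Task.Run(() => generate(payload)))
            .ToList();

        return Task.WhenAll(tasks);
    }
}
```

Inside the async method from the question, the batches would be collected into a list of strings and awaited with await ParallelGenerator.RunAllAsync(batches, Generate), which replaces both the BeginInvoke calls and the Thread.Sleep polling loop.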