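How to split Excel rows into multiple equal batches and process each batch per iteration

Time: 2016-12-07 05:24:03

Tags: c# excel linq

I am processing a large Excel file (10k records), and the requirement is that this process run on multiple threads to improve performance.

Right now I check whether the row count is <= 2000, and if so a single call to Utils.IxGenerateWithData processes all records. But if there are more than 2000 rows (for example 10k), I want to split them across multiple threads, each calling Utils.IxGenerateWithData with 2000 records.

Please help.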

using (Stream contentStream = await requestContent.ReadAsStreamAsync())
{
    Workbook workbook = new Workbook(contentStream);
    Worksheet worksheet = workbook.Worksheets[0];
    int column = 0; // first column
    Cell lastCell = worksheet.Cells.EndCellInColumn((short)column);

    //Run on multiple threads if the file has more than 2000 records
    if (lastCell.Row > 2000)
    {
        //Not sure what to do here

        // Infiniti GenerateWithData Web Service
        Thread thread = new Thread(() => Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL));
        thread.Start();
    }
    else
    {
        for (int row = 0; row <= lastCell.Row; row++)
        {
            Cell cell = worksheet.Cells.GetCell(row, column);
            xmlContent += cell.StringValueWithoutFormat;
        }

        // Infiniti GenerateWithData Web Service
        Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL);
    }
}

2 answers:

Answer 0 (score: 0)

A good start is to decide how many threads you want to launch. If each thread is to handle 2000 rows, threadCount is calculated as follows:

var threadCount = (lastCell.Row / 2000) + 1;

The 1 is added to ensure that no thread ever gets more than 2000 rows, although it may get fewer.

Then calculate rowsPerThread as follows:

var rowsPerThread = lastCell.Row / threadCount;
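To make the arithmetic concrete, here is a minimal sketch that turns the two formulas into explicit row ranges. BatchPlanner and Plan are hypothetical names, not part of the original code. The sketch also makes two assumptions that go slightly beyond the formulas above: it works from the row count (last zero-based row index + 1), and it spreads the division remainder over the first batches so that no batch exceeds the limit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class BatchPlanner
{
    // Returns inclusive (firstRow, lastRow) index pairs covering rows 0..lastRowIndex.
    public static List<Tuple<int, int>> Plan(int lastRowIndex, int maxRowsPerThread)
    {
        if (maxRowsPerThread < 1)
            throw new ArgumentOutOfRangeException("maxRowsPerThread");

        int rowCount = lastRowIndex + 1;                      // row indexes are zero-based
        int threadCount = (rowCount / maxRowsPerThread) + 1;  // formula from the answer
        int rowsPerThread = rowCount / threadCount;           // integer division
        int remainder = rowCount % threadCount;               // rows left over by the division

        var ranges = new List<Tuple<int, int>>();
        int first = 0;
        for (int i = 0; i < threadCount && first < rowCount; i++)
        {
            // The first `remainder` batches take one extra row each.
            int size = rowsPerThread + (i < remainder ? 1 : 0);
            ranges.Add(Tuple.Create(first, first + size - 1));
            first += size;
        }
        return ranges;
    }
}
```

For a sheet whose last row index is 9999 (10,000 rows) and a limit of 2000, this yields six ranges: four of 1667 rows followed by two of 1666 rows.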

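Finally, use a for loop to start the threads, passing each one the array of rows it should process. Here I would create a class that is instantiated inside the for loop, with the rows to process passed to its constructor. It then has a Start method that starts a thread to process the rows held by the object.

An outline of such a class looks like this: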

public class ExcelRowProcessor
{
    // ExcelRow is a placeholder for whatever type holds one row's data.
    private readonly List<ExcelRow> _rows = new List<ExcelRow>();

    public ExcelRowProcessor(IEnumerable<ExcelRow> rows)
    {
        _rows.AddRange(rows);
    }

    public void Start()
    {
        // Start the thread here.
    }
}
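As a rough sketch of how that outline might be filled in: the version below concatenates the row values and hands them to a callback on a dedicated thread. The ExcelRow type, its Value property, the process callback, and the Join method are all assumptions added for illustration; in the question's code the callback would presumably wrap Utils.IxGenerateWithData.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

// Hypothetical row type; stands in for whatever is read from each worksheet row.
public class ExcelRow
{
    public string Value { get; set; }
}

public class ExcelRowProcessor
{
    private readonly List<ExcelRow> _rows;
    private readonly Action<string> _process; // hypothetical callback that consumes the batch content
    private Thread _thread;

    public ExcelRowProcessor(IEnumerable<ExcelRow> rows, Action<string> process)
    {
        _rows = new List<ExcelRow>(rows);
        _process = process;
    }

    // Starts a dedicated thread that builds the batch content and processes it.
    public void Start()
    {
        _thread = new Thread(() =>
        {
            var content = new StringBuilder();
            foreach (ExcelRow row in _rows)
            {
                content.Append(row.Value);
            }
            // Note: an exception escaping here is unhandled on this thread.
            _process(content.ToString());
        });
        _thread.Start();
    }

    // Blocks until the batch has finished; does nothing if Start was never called.
    public void Join()
    {
        if (_thread != null)
        {
            _thread.Join();
        }
    }
}
```

The caller would create one processor per batch, call Start on each, and then call Join on each before disposing the workbook stream.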

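I hope this helps.

Answer 1 (score: 0)

Sorry for posting this as a new answer, but I don't yet have the reputation to comment on Jaco's.

Anyway, you generally don't want to derive the thread count from the workload or bucket size. It is better to derive the bucket size from the number of CPU cores. That avoids excessive thread switching, and leaving one core free for the OS or a virus scanner helps too.

To get the thread/core/process count, see this post: How to find the Number of CPU Cores via .NET/C#?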

var threadCount = cpuCoreCount - 1; //TODO: use code from above URL
if (threadCount < 1) {
   threadCount = 1;
}
// As Jaco posted; at least 1 so a later modulo by rowsPerThread cannot divide by zero
var rowsPerThread = Math.Max(1, lastCell.Row / threadCount);

Back to your question about how to thread this:

using (Stream contentStream = await requestContent.ReadAsStreamAsync())
{
    Workbook workbook = new Workbook(contentStream);
    Worksheet worksheet = workbook.Worksheets[0];
    int column = 0; // first column
    Cell lastCell = worksheet.Cells.EndCellInColumn((short)column);
    List<IAsyncResult> asyncResults = new List<IAsyncResult>();
    string xmlContent = ""; // assuming this is local

    // Note: Delegate.BeginInvoke/EndInvoke are only supported on .NET Framework.
    var caller = new GenerateDelegate(Generate);

    for (int row = 0; row <= lastCell.Row; row++)
    {
        Cell cell = worksheet.Cells.GetCell(row, column);
        xmlContent += cell.StringValueWithoutFormat;

        // Dispatch a batch every rowsPerThread rows (computed above), and once more at the last row
        if (((row > 0) && (row % rowsPerThread == 0)) || (row == lastCell.Row))
        {
            asyncResults.Add(caller.BeginInvoke(xmlContent, null, null));
            xmlContent = "";
        }
    }

    // Wait for the threads; EndInvoke blocks until the call completes and rethrows any exception
    asyncResults.ForEach(result => caller.EndInvoke(result));
}

Put this code outside the function:

private delegate void GenerateDelegate(string xmlContent);

///<summary>
/// Call Infiniti GenerateWithData Web Service
///</summary>
private void Generate(string xmlContent)
{
    Utils.IxGenerateWithData(payloadSettings.ProjectGUID, payloadSettings.DatasourceGUID, xmlContent, payloadSettings.InfinitiUsername, payloadSettings.InfinitiPassword, payloadSettings.ServiceWSDL);
}
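
Since the calling method is already async, the same batching could also be expressed with tasks instead of delegate BeginInvoke. This is a swapped-in technique, not what the answer above uses. The sketch below assumes the cell values have already been read into a list. BatchRunner, RunAsync, and the process callback are hypothetical names; the callback would presumably be the Generate method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class BatchRunner
{
    // Splits cellValues into consecutive batches of at most batchSize items,
    // concatenates each batch, and runs process on the thread pool for each one.
    public static async Task RunAsync(IReadOnlyList<string> cellValues, int batchSize, Action<string> process)
    {
        if (batchSize < 1)
            throw new ArgumentOutOfRangeException("batchSize");

        var tasks = new List<Task>();
        for (int start = 0; start < cellValues.Count; start += batchSize)
        {
            // Declared inside the loop so each task captures its own batch.
            string xmlContent = string.Concat(cellValues.Skip(start).Take(batchSize));
            tasks.Add(Task.Run(() => process(xmlContent)));
        }

        // Completes when every batch is done; surfaces exceptions from the tasks.
        await Task.WhenAll(tasks);
    }
}
```

Inside the using block this would be called as `await BatchRunner.RunAsync(values, rowsPerThread, Generate);`, which avoids blocking the request thread while the batches run.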