Excel data input from List<string[]> is slow; is there a better algorithm design?

Time: 2016-02-12 00:33:55

Tags: c# algorithm c#-4.0

I have an algorithm that takes a list of arrays and writes them into an Excel file, but it is very slow. Is there a better design for this algorithm?

public void WriteToExcel(List<string[]> parsedData, string path, string fileName)
{
    // Get the Excel application object.
    Excel.Application xlApp = new Excel.Application();

    // Make Excel visible.
    xlApp.Visible = true;

    Excel.Workbook workbook = xlApp.Workbooks.Add(Excel.XlWBATemplate.xlWBATWorksheet);
    Excel.Worksheet sheet = (Excel.Worksheet)xlApp.Worksheets[1];
    sheet.Select(Type.Missing);
    //Loop through arrays in parsedData list.
    for (var lstElement=0;lstElement<parsedData.Count;lstElement++)
    {
        //Loop through array.
        for(var arryElement = 0; arryElement<parsedData[lstElement].Count(); arryElement++)
        {
            sheet.Cells[lstElement + 1, arryElement + 1] = parsedData[lstElement][arryElement];
        }
    }
    // Save the changes and close the workbook.
    workbook.Close(true, fileName, Type.Missing);

    // Close the Excel server.
    xlApp.Quit();

}

1 answer:

Answer 0 (score: 3)

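When using Office interop, the slowest part is the inter-process call that happens every time you access a property or method of an automation class or interface.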

The optimization goal should therefore be to minimize roundtrips (inter-process calls).

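In your particular use case, you are setting values cell by cell, which makes one call per cell. Fortunately, there is a way to set the values of an entire Excel range by passing an array of values. Depending on how many columns your data contains, the following modification should give you a significant speedup.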

The important part:

//Loop through arrays in parsedData list.
int row = 1, column = 1;
object[] values = null; // buffer - see below. Avoids unnecessary allocations.
for (var lstElement = 0; lstElement < parsedData.Count; lstElement++)
{
    var data = parsedData[lstElement];
    if (data == null || data.Length == 0) continue;
    if (data.Length == 1)
    {
        // Single cell
        sheet.Cells[row, column] = data[0];
    }
    else
    {
        // Cell range
        var range = sheet.Range[CellName(row, column), CellName(row, column + data.Length - 1)];
        // We can pass the data array directly, but since it's a string[], Excel will treat the values as text.
        // The trick is to pass them via object[].
        if (values == null || values.Length != data.Length)
            values = new object[data.Length];
        for (int i = 0; i < data.Length; i++)
            values[i] = data[i];
        // Set all values in a single roundtrip
        range.Value2 = values;
    }
    row++;
}

Helpers used:

static string CellName(int row, int column)
{
    return ColumnName(column) + row;
}

static string ColumnName(int column)
{
    // Converts a 1-based column index to Excel letters (1 -> A, 27 -> AA, 703 -> AAA).
    // Unlike a fixed two-letter scheme, this also covers columns past ZZ.
    const int LetterCount = 'Z' - 'A' + 1;
    var name = string.Empty;
    for (int index = column; index > 0; index = (index - 1) / LetterCount)
        name = (char)('A' + (index - 1) % LetterCount) + name;
    return name;
}

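Once you get the idea, you can get even better performance by extending the above to handle multi-row ranges like this (the essential point here is to use a 2D array for the values):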

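const int MaxCells = 1 * 1024 * 1024; // Arbitrary cap on cells sent per call
// The widest row sets the buffer width (Max needs "using System.Linq;"); 0 if there is no data.
var maxColumns = parsedData.Count == 0 ? 0 : parsedData.Max(data => data == null ? 0 : data.Length);
var maxRows = maxColumns == 0 ? 0 : Math.Max(1, Math.Min(parsedData.Count, MaxCells / maxColumns));
object[,] values = null;
int row = 1, column = 1;
for (int lstElement = 0; maxColumns > 0 && lstElement < parsedData.Count; )
{
    int rowCount = Math.Min(maxRows, parsedData.Count - lstElement);
    if (values == null || values.GetLength(0) != rowCount)
        values = new object[rowCount, maxColumns];
    else
        Array.Clear(values, 0, values.Length); // drop leftovers from the previous chunk
    for (int r = 0; r < rowCount; r++)
    {
        var data = parsedData[lstElement++];
        if (data == null) continue; // leave the row empty
        for (int c = 0; c < data.Length; c++)
            values[r, c] = data[c];
    }
    // One roundtrip writes the whole block of rows.
    var range = sheet.Range[CellName(row, column), CellName(row + rowCount - 1, column + maxColumns - 1)];
    range.Value2 = values;
    row += rowCount;
}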
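
The buffer-filling step above does not touch Excel at all, so it can be pulled out and checked without a running Excel instance. The following is a minimal sketch of that idea, not part of the original answer: `GridBuilder` and `ToGrid` are hypothetical names, and the helper allocates a fresh array per chunk instead of reusing a buffer, trading one allocation for simpler code.

```csharp
using System;
using System.Collections.Generic;

public static class GridBuilder
{
    // Hypothetical helper: copies rows [start, start + count) into a
    // rectangular object[,] suitable for assigning to Range.Value2.
    public static object[,] ToGrid(IList<string[]> rows, int start, int count)
    {
        if (rows == null) throw new ArgumentNullException("rows");
        if (start < 0 || count < 0 || start + count > rows.Count)
            throw new ArgumentOutOfRangeException("count");

        // The widest row in the block decides the column count.
        int columns = 0;
        for (int r = 0; r < count; r++)
        {
            var row = rows[start + r];
            if (row != null && row.Length > columns) columns = row.Length;
        }

        // A new array starts out all null, so shorter rows are padded
        // with empty cells and cannot pick up stale values.
        var grid = new object[count, columns];
        for (int r = 0; r < count; r++)
        {
            var row = rows[start + r];
            if (row == null) continue; // null rows stay empty
            for (int c = 0; c < row.Length; c++)
                grid[r, c] = row[c];
        }
        return grid;
    }
}
```

Inside the chunk loop, the inner copy would then reduce to `values = GridBuilder.ToGrid(parsedData, lstElement, rowCount);` followed by advancing `lstElement` by `rowCount`. Because the width is computed per chunk here, the range should be sized from `values.GetLength(1)`, and a chunk whose width is 0 should be skipped rather than written.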