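Importing a large .xlsx file is very slow

Time: 2017-10-06 15:50:04

Tags: c# wpf excel datagrid wpfdatagrid

I'm new to C# and WPF and am trying to import a large .xlsx file into a DataGrid. The file can have 200+ columns and more than 100,000 rows. With my current approach the import took over an hour (I didn't let it finish). An example of my format, expressed as CSV: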

"Time","Dist","V_Front","V_Rear","RPM"
"s","m","km/h","km/h","rpm"
"0.000","0","30.3","30.0","11995"
"0.005","0","30.3","30.0","11965"
"0.010","0","30.3","31.0","11962"

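I'm currently using Interop, but I'm wondering whether there is another approach that would cut the load time dramatically. I hope to plot this data with SciChart (they offer a student license), with checkboxes for selecting channels, but that's another matter.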

.CS

    private void Button_Click(object sender, RoutedEventArgs e)
    {
        OpenFileDialog openfile = new OpenFileDialog();
        openfile.DefaultExt = ".xlsx";
        openfile.Filter = "(.xlsx)|*.xlsx";

        var browsefile = openfile.ShowDialog();

        if (browsefile == true)
        {
            txtFilePath.Text = openfile.FileName;

            Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
            Microsoft.Office.Interop.Excel.Workbook excelBook = excelApp.Workbooks.Open(txtFilePath.Text.ToString(), 0, true, 5, "", "", true, Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
            Microsoft.Office.Interop.Excel.Worksheet excelSheet = (Microsoft.Office.Interop.Excel.Worksheet)excelBook.Worksheets.get_Item(1);
            Microsoft.Office.Interop.Excel.Range excelRange = excelSheet.UsedRange;

            string strCellData = "";
            double douCellData;
            int rowCnt = 0;
            int colCnt = 0;

            DataTable dt = new DataTable();
            for (colCnt = 1; colCnt <= excelRange.Columns.Count; colCnt++)
            {
                string strColumn = "";
                strColumn = (string)(excelRange.Cells[1, colCnt] as Microsoft.Office.Interop.Excel.Range).Value2;
                dt.Columns.Add(strColumn, typeof(string));
            }

            for (rowCnt = 2; rowCnt <= excelRange.Rows.Count; rowCnt++)
            {
                string strData = "";
                for (colCnt = 1; colCnt <= excelRange.Columns.Count; colCnt++)
                {
                    try
                    {
                        strCellData = (string)(excelRange.Cells[rowCnt, colCnt] as Microsoft.Office.Interop.Excel.Range).Value2;
                        strData += strCellData + "|";
                    }
                    catch (Exception ex)
                    {
                        douCellData = (excelRange.Cells[rowCnt, colCnt] as Microsoft.Office.Interop.Excel.Range).Value2;
                        strData += douCellData.ToString() + "|";
                    }
                }
                strData = strData.Remove(strData.Length - 1, 1);
                dt.Rows.Add(strData.Split('|'));
            }

            dtGrid.ItemsSource = dt.DefaultView;

            excelBook.Close(true, null, null);
            excelApp.Quit();


        }
    }

I'd really appreciate any help.

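1 answer:

Answer 0: (score: 1)

The problem is that there are too many individual cell reads, each of which involves reflection and marshalling between Excel and your application. If memory usage is not a concern, you can read the whole Range into memory and work from there instead of reading cells one at a time. The following code runs in 3880 ms on a test file with 5 columns and 103,938 rows: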

OpenFileDialog openfile = new OpenFileDialog();
openfile.DefaultExt = ".xlsx";
openfile.Filter = "(.xlsx)|*.xlsx";

var browsefile = openfile.ShowDialog();

if (browsefile == true)
{
    txtFilePath.Text = openfile.FileName;

    var excelApp = new Microsoft.Office.Interop.Excel.Application();
    var excelBook = excelApp.Workbooks.Open(txtFilePath.Text, 0, true, 5, "", "", true,
        Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
    var excelSheet = (Microsoft.Office.Interop.Excel.Worksheet) excelBook.Worksheets.Item[1];

    Microsoft.Office.Interop.Excel.Range excelRange = excelSheet.UsedRange;

    DataTable dt = new DataTable();

    // A single COM call copies the entire used range into a 1-based object[,].
    object[,] value = excelRange.Value;

    int columnsCount = value.GetLength(1);
    for (var colCnt = 1; colCnt <= columnsCount; colCnt++)
    {
        dt.Columns.Add((string)value[1, colCnt], typeof(string));
    }

    int rowsCount = value.GetLength(0);
    for (var rowCnt = 2; rowCnt <= rowsCount; rowCnt++)
    {
        var dataRow = dt.NewRow();
        for (var colCnt = 1; colCnt <= columnsCount; colCnt++)
        {
            dataRow[colCnt - 1] = value[rowCnt, colCnt];
        }
        dt.Rows.Add(dataRow);
    }

    dtGrid.ItemsSource = dt.DefaultView;

    // The workbook was opened read-only, so close it without saving.
    excelBook.Close(false);
    excelApp.Quit();
}
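
The array that Range.Value returns is indexed from 1 rather than 0, which is why the loops above start at 1. As a minimal sketch, and not part of the original answer, the conversion can be pulled into a helper that reads the array's own bounds instead of assuming them. The names `RangeConverter` and `ToDataTable` are hypothetical, and the generated fallback name for empty header cells is an assumption made for illustration.

```csharp
using System;
using System.Data;

public static class RangeConverter
{
    // Converts a 2D cell array (first row = headers) into a DataTable of string columns.
    // Works for the 1-based arrays Excel returns as well as ordinary 0-based arrays.
    public static DataTable ToDataTable(object[,] cells)
    {
        int rowLo = cells.GetLowerBound(0), rowHi = cells.GetUpperBound(0);
        int colLo = cells.GetLowerBound(1), colHi = cells.GetUpperBound(1);

        var table = new DataTable();
        for (int c = colLo; c <= colHi; c++)
        {
            // Convert.ToString copes with numeric headers and returns "" for null.
            string name = Convert.ToString(cells[rowLo, c]);
            if (string.IsNullOrEmpty(name))
                name = "Column" + (c - colLo + 1); // assumed fallback for empty headers
            table.Columns.Add(name, typeof(string)); // throws on duplicate header names
        }

        for (int r = rowLo + 1; r <= rowHi; r++)
        {
            DataRow row = table.NewRow();
            for (int c = colLo; c <= colHi; c++)
            {
                // Empty Excel cells arrive as null; store them as DBNull.
                row[c - colLo] = cells[r, c] ?? DBNull.Value;
            }
            table.Rows.Add(row);
        }
        return table;
    }
}
```

With the code above it could be called as `DataTable dt = RangeConverter.ToDataTable(value);`. Because the helper does not touch Excel, it can also be exercised with an array created through Array.CreateInstance with lower bounds of 1.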

If you don't want to read the whole Range at once, you should read it in reasonably sized batches.
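
One way to sketch the batching is to compute inclusive row windows first and then read each window with a single Range.Value call. The helper below is an illustration under that assumption; `RowBatcher.Split` is a hypothetical name and is not part of the Interop API.

```csharp
using System;
using System.Collections.Generic;

public static class RowBatcher
{
    // Yields inclusive (First, Last) row pairs that cover firstRow..lastRow
    // in windows of at most batchSize rows.
    public static IEnumerable<(int First, int Last)> Split(int firstRow, int lastRow, int batchSize)
    {
        // Because this is an iterator, the check runs when enumeration starts.
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        for (int start = firstRow; start <= lastRow; start += batchSize)
        {
            int end = Math.Min(start + batchSize - 1, lastRow);
            yield return (start, end);
        }
    }
}
```

For each pair you could then read `excelSheet.Range[excelSheet.Cells[first, 1], excelSheet.Cells[last, columnsCount]].Value` and append its rows to the DataTable, starting at row 2 to skip the header. Any particular batch size, such as 10,000 rows, is only an illustrative starting point; a suitable value depends on the column count and the memory you are willing to spend.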

Another optimization is to run the import on a background thread so that it doesn't block the UI while loading.

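Edit

To run it on a background thread, you can turn the button click handler into an async method and move the parsing logic into a separate method that does the actual parsing on a thread-pool thread via Task.Run: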

private async void Button_Click(object sender, RoutedEventArgs e)
{
    OpenFileDialog openfile = new OpenFileDialog();
    openfile.DefaultExt = ".xlsx";
    openfile.Filter = "(.xlsx)|*.xlsx";

    var browsefile = openfile.ShowDialog();

    if (browsefile == true)
    {
        txtFilePath.Text = openfile.FileName;

        DataTable dataTable = await ParseExcel(txtFilePath.Text).ConfigureAwait(true);

        dtGrid.ItemsSource = dataTable.DefaultView;
    }
}

private Task<DataTable> ParseExcel(string filePath)
{
    return Task.Run(() =>
    {
        var excelApp = new Microsoft.Office.Interop.Excel.Application();
        var excelBook = excelApp.Workbooks.Open(filePath, 0, true, 5, "", "", true,
            Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
        var excelSheet = (Microsoft.Office.Interop.Excel.Worksheet) excelBook.Worksheets.Item[1];

        Microsoft.Office.Interop.Excel.Range excelRange = excelSheet.UsedRange;

        DataTable dt = new DataTable();

        // A single COM call copies the entire used range into a 1-based object[,].
        object[,] value = excelRange.Value;

        int columnsCount = value.GetLength(1);
        for (var colCnt = 1; colCnt <= columnsCount; colCnt++)
        {
            dt.Columns.Add((string) value[1, colCnt], typeof(string));
        }

        int rowsCount = value.GetLength(0);
        for (var rowCnt = 2; rowCnt <= rowsCount; rowCnt++)
        {
            var dataRow = dt.NewRow();
            for (var colCnt = 1; colCnt <= columnsCount; colCnt++)
            {
                dataRow[colCnt - 1] = value[rowCnt, colCnt];
            }
            dt.Rows.Add(dataRow);
        }

        // The workbook was opened read-only, so close it without saving.
        excelBook.Close(false);
        excelApp.Quit();

        return dt;
    });
}

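The handler simply calls the parsing method, which runs on a background thread. When it finishes, the handler resumes and assigns the resulting DataTable to ItemsSource.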