使用C#

时间:2015-09-21 09:19:40

标签: c# excel excel-interop

我正在尝试找到一种更快的方法来读取可以在Excel 2010中打开的XML文件。我无法使用readxml方法立即读取XML文件,因为它包含工作簿,样式,单元格,数据和其他标记。所以我的方法是在Excel中打开它,然后只在表2中获取数据。示例文件包含9,000多行,大约需要2分钟49secs存储在数据表中。实际文件有25,000多行。这就是我的尝试:

private void bulkInsert()
        {
            var s = new Stopwatch();

            s.Start();
            try
            {
                KillExcel();
                GCollector();
                Excel.Application app = null;
                app = new Excel.Application();
                Excel.Worksheet sheet = null;
                Excel.Workbook book = null;

                book = app.Workbooks.Open(@"my directory for the file");

                sheet = (Worksheet)book.Sheets[2];
                sheet.Select(Type.Missing);

                var xlRange = (Excel.Range)sheet.Cells[sheet.Rows.Count, 1];
                int lastRow = (int)xlRange.get_End(Excel.XlDirection.xlUp).Row;
                int newRow = lastRow + 1;
                var cellrow = newRow;
                int columns = sheet.UsedRange.Columns.Count;
                Excel.Range test = sheet.UsedRange;
                System.Data.DataTable dt = new System.Data.DataTable();
                dt.Columns.Add("Node_SegmentName");
                dt.Columns.Add("Type");
                dt.Columns.Add("Sub-Type");
                dt.Columns.Add("Description");
                dt.Columns.Add("Parameter_DataIdentifier");
                dt.Columns.Add("RuntimeValue");
                dt.Columns.Add("Category");
                dt.Columns.Add("Result");
                dt.TableName = "SsmXmlTable";
//slow part
                for (i = 0; i < lastRow; i++)
                {
                    DataRow excelRow = dt.NewRow();

                    for (int j = 0; j < columns; j++)
                    {
                        excelRow[j] = test.Cells[i + 2, j + 1].Value2;

                    }
                    dt.Rows.Add(excelRow);

                }
                dataGridView1.DataSource = dt;

                DataSet ds = new DataSet();
                ds.Tables.Add(dt);
                ds.WriteXml(AppDomain.CurrentDomain.BaseDirectory + String.Format("\\XMLParserOutput{0}.xml", DateTime.Now.ToString("MM-d-yyyy")));



                DataSet reportData = new DataSet();
                reportData.ReadXml(AppDomain.CurrentDomain.BaseDirectory + String.Format("\\XMLParserOutput{0}.xml", DateTime.Now.ToString("MM-d-yyyy")));
                SqlConnection connection = new SqlConnection("Data Source=YOURCOMPUTERNAME\\SQLEXPRESS;Initial Catalog=YOURDATABASE;Integrated Security=True;Connect Timeout=0");
                connection.Open();
                SqlBulkCopy sbc = new SqlBulkCopy(connection);
                sbc.DestinationTableName = "Test";
                sbc.WriteToServer(reportData.Tables["SsmXmlTable"]);
                connection.Close();
                s.Stop();
                var duration = s.Elapsed;

                MessageBox.Show(duration.ToString() + " bulk insert way");
                MessageBox.Show(ds.Tables["SsmXmlTable"].Rows.Count.ToString());//439 rows
            }
            catch (Exception ex)
            {

                KillExcel();
                GCollector();
                MessageBox.Show(ex.ToString() + i.ToString());
            }
        }

如果没有Excel部分的读取,使用批量复制插入数据只需要几秒钟(449行0.5秒)。

1 个答案:

答案 0 :(得分:0)

对于遇到同样问题的其他人,我所做的是:

  • 将xml保存为xlsx文件
  • 使用oledb读取xlsx文件
  • 使用OleDbAdapter(Fill()方法)
  • 存储在数据集中
  • 批量插入

以下是我用来执行此操作的代码(更改连接字符串):

Stopwatch s = new Stopwatch();
s.Start();
string sSheetName = null;
string sConnection = null;
System.Data.DataTable sheetData = new System.Data.DataTable();
System.Data.DataTable dtTablesList = default(System.Data.DataTable);
OleDbConnection oleExcelConnection = default(OleDbConnection);
sConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + @"C:\Users\YOURUSERNAME\Documents\Visual Studio 2012\Projects\TestXmlParser\TestXmlParser\bin\Debug\ConsolidatedSSMFiles.xlsx" + ";Extended Properties=\"Excel 8.0;HDR=Yes;IMEX=1\"";
oleExcelConnection = new OleDbConnection(sConnection);
oleExcelConnection.Open();
dtTablesList = oleExcelConnection.GetSchema("Tables");

if (dtTablesList.Rows.Count > 0)
{
sSheetName = dtTablesList.Rows[0]["TABLE_NAME"].ToString();
}
dtTablesList.Clear();
dtTablesList.Dispose();

if (!string.IsNullOrEmpty(sSheetName))
{
OleDbDataAdapter sheetAdapter = new OleDbDataAdapter("select * from [TEST$]", oleExcelConnection);
sheetAdapter.Fill(sheetData);
} s.Stop();
var duration = s.Elapsed;


oleExcelConnection.Close();
dataGridView1.DataSource = sheetData;
MessageBox.Show(sheetData.Rows.Count.ToString()+"rows - "+ duration.ToString());

这将读取25000多行excel数据到大约数据。 1.9至2.0秒。