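I have an Excel workbook with two sheets.
One has about 700k rows and the other about 25k. The problem is that when I load the file, my memory gets eaten up and the program crashes! How can I handle large files, some of which may even exceed a million rows?
Here is the code I'm currently using: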
OleDbConnection cnn = new OleDbConnection("provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties=Excel 12.0;");
cnn.Open();
string qry = "SELECT * FROM [Detail$]";
OleDbDataAdapter odp = new OleDbDataAdapter(qry, cnn);
odp.Fill(detailTable);
DataSet tmp = new DataSet();
if (detailTable.Rows.Count > 0)
{
    Console.WriteLine("Total " + detailTable.Rows.Count + " Detail rows Loaded");
    // MessageBox.Show("Input Sheet UPLOADED !");
}
qry = "SELECT * FROM [Gallery$]";
OleDbDataAdapter odp1 = new OleDbDataAdapter(qry, cnn);
odp1.Fill(galleryTable);
if (galleryTable.Rows.Count > 0)
{
    Console.WriteLine("Total " + galleryTable.Rows.Count + " Gallery Numbers Loaded");
    // MessageBox.Show("Input Sheet UPLOADED !");
}
Answer 0 (score: 3)
OK, I can suggest using the DbDataAdapter.Fill(Int32, Int32, DataTable[]) overload of the DbDataAdapter class, which works in "chunk" mode:
public int Fill(
    int startRecord,
    int maxRecords,
    params DataTable[] dataTables
)
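With this method and my code sample, you can work on a limited number of rows at a time instead of holding the complete Excel data in memory. After each fill, dispose of the temporary DataTable object so that you avoid memory leaks.
Here is how to do it: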
// Requires: using System; using System.Data; using System.Data.OleDb;
const string fileName = "myData.xlsx";
const string excelConnString = "provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + fileName + "';Extended Properties=Excel 12.0;";
using (var cnn = new OleDbConnection(excelConnString))
{
    cnn.Open();

    // Count the rows first so the loop knows when to stop.
    int rowsCount;
    const string countQuery = "SELECT COUNT(*) FROM [Detail$]";
    using (var cmd = new OleDbCommand(countQuery, cnn))
    {
        rowsCount = Convert.ToInt32(cmd.ExecuteScalar());
    }

    const string query = "SELECT * FROM [Detail$]";
    using (var odp = new OleDbDataAdapter(query, cnn))
    {
        var recordToStartFetchFrom = 0; // zero-based record number to start with
        const int chunkSize = 100;
        // Use "<" rather than "<=": Fill treats maxRecords == 0 as "all remaining records".
        while (recordToStartFetchFrom < rowsCount)
        {
            // The last chunk may be smaller than chunkSize.
            int internalChunkSize = Math.Min(chunkSize, rowsCount - recordToStartFetchFrom);
            // A fresh table per chunk, disposed as soon as the chunk has been processed.
            using (var detailTable = new DataTable())
            {
                odp.Fill(recordToStartFetchFrom, internalChunkSize, detailTable);
                foreach (DataRow row in detailTable.Rows)
                {
                    Console.WriteLine("{1} {0}", row[0], row[1]);
                }
            }
            Console.WriteLine("--------- {0}-{1} Rows Processed ---------", recordToStartFetchFrom, recordToStartFetchFrom + internalChunkSize);
            recordToStartFetchFrom += chunkSize;
        }
    }
    Console.ReadLine();
}
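The chunk boundaries used by the loop above can be checked separately from Excel. The following is only a minimal sketch: `ChunkPlanner` and `Plan` are hypothetical names, not part of ADO.NET. It yields the `(startRecord, maxRecords)` pairs one would pass to `Fill`, and it never yields a size of 0, which matters because `Fill` interprets `maxRecords == 0` as "retrieve all records after the start record".

```csharp
using System;
using System.Collections.Generic;

public static class ChunkPlanner
{
    // Yields (Start, Size) pairs covering rowsCount rows in chunks of at most chunkSize rows.
    // Hypothetical helper for illustration; the pairs map to Fill(startRecord, maxRecords, table).
    public static IEnumerable<(int Start, int Size)> Plan(int rowsCount, int chunkSize)
    {
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (int start = 0; start < rowsCount; start += chunkSize)
        {
            // The final chunk is clipped to the rows that remain.
            yield return (start, Math.Min(chunkSize, rowsCount - start));
        }
    }
}
```

For example, `ChunkPlanner.Plan(250, 100)` yields (0, 100), (100, 100) and (200, 50), while `ChunkPlanner.Plan(200, 100)` yields exactly two chunks with no trailing empty one.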
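Answer 1 (score: 1)
Since you need to load a large amount of data into memory (i.e. 1M rows × 1 KB per row ≈ 1 GB), your only reasonable option is to build a 64-bit (x64) application, because the address-space limit of an x86 application (2 GB on a plain x86 system, up to 4 GB on an x64 system) will not let you allocate enough memory.
Note: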
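The estimate above is simple arithmetic, and whether the process really runs as 64-bit can be checked at run time. Here is a rough sketch: `MemoryEstimate` and its methods are hypothetical names, the 1 KB per row figure is the assumption taken from this answer, and the real footprint of a DataTable will be higher because of per-row and per-object overhead.

```csharp
using System;

public static class MemoryEstimate
{
    // Raw payload size in bytes: rowCount rows at bytesPerRow each (DataTable overhead ignored).
    public static long EstimateBytes(long rowCount, long bytesPerRow)
    {
        return rowCount * bytesPerRow;
    }

    // Builds a short report with the estimate and the bitness of the current process.
    public static string Describe(long rowCount, long bytesPerRow)
    {
        double gigabytes = EstimateBytes(rowCount, bytesPerRow) / (1024.0 * 1024.0 * 1024.0);
        return string.Format(
            "Estimated payload: {0:F2} GB, 64-bit process: {1}",
            gigabytes,
            Environment.Is64BitProcess);
    }
}
```

Calling `MemoryEstimate.Describe(1000000, 1024)` reports a payload just under 1 GB. If it also reports that the process is not 64-bit, the project's platform target is worth checking.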
Answer 2 (score: 1)
Select only the 5 or 6 columns you need.
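In practice that means replacing `SELECT *` with an explicit column list. A minimal sketch follows, assuming the first row of the sheet holds the column headers and that the headers contain no square brackets. `SheetQuery.Build` is a hypothetical helper, and the column names in the usage note are placeholders, not names from the question.

```csharp
using System;
using System.Linq;

public static class SheetQuery
{
    // Builds "SELECT [a], [b] FROM [Sheet$]" for the given sheet name and column headers.
    // Assumes header text without '[' or ']' characters; no escaping is attempted.
    public static string Build(string sheetName, params string[] columns)
    {
        if (columns == null || columns.Length == 0)
            throw new ArgumentException("At least one column is required.", nameof(columns));

        string columnList = string.Join(", ", columns.Select(c => "[" + c + "]"));
        return "SELECT " + columnList + " FROM [" + sheetName + "$]";
    }
}
```

For instance, `SheetQuery.Build("Detail", "Id", "Name")` returns `SELECT [Id], [Name] FROM [Detail$]`, which can be passed to `OleDbDataAdapter` in place of the `SELECT *` query.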