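I have read this topic several times: This SO Link, about comparing two XLS (Excel) files, and I have worked through and tried some small examples.
I want to write C# code with the best possible performance that reads two huge XLS files and compares the first row of file A against all rows of file B. If the first row of file A does not appear in any row of file B, list that row of A, then move on to the next row of A.xls and compare it against all rows of file B again.
Update 1:
(This is what I do:)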
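    // Load the first sheet into a DataTable; dt2 is presumably loaded the same way (not shown)
    DataTable dt1 = GetDataTableFromExcel(this.Directory, this.FirstFile, this.FirstFileSheetName);
    dtRet = getDifferentRecords(dt1, dt2);

    // The sheet itself is read through OLE DB
    var adapter = new OleDbDataAdapter("SELECT * FROM [" + strSheetName + "$]", connectionString);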
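Update 2:
My main problem occurs when the XLS contains 4000 records! (large file)
Answer 0 (score: 2)
As requested by OP, here is a VBA solution. Some details are guessed, so OP will need to adjust it to suit their specific use case.
For me, with 4000 records, this runs in under 2 seconds.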
    Sub Demo()
        Dim wb1 As Workbook, wb2 As Workbook
        Dim ws1 As Worksheet, ws2 As Worksheet
        Dim r1 As Range, r2 As Range
        Dim v1 As Variant, v2 As Variant
        Dim rw1 As Long, rw2 As Long
        Dim cl As Long
        Dim Found As Boolean
        Const NUM_COLS_COMPARE = 1 'adjust as required

        ' Get reference to, or open, workbooks
        Set wb1 = Application.Workbooks("NameOfBook1.xlsx") 'if already open
        Set wb2 = Application.Workbooks.Open("C:\Path\ToWorkbook2.xlsx") 'if not open

        ' Get reference to sheets
        Set ws1 = wb1.Worksheets("NameOfSheet1")
        Set ws2 = wb2.Worksheets("NameOfSheet2")

        ' Get reference to ranges
        ' Assumes the data in column A and row 1 spans the whole range. Adjust if necessary
        Set r1 = ws1.Range(ws1.Cells(1, ws1.Columns.Count).End(xlToLeft), _
                           ws1.Cells(ws1.Rows.Count, 1).End(xlUp))
        Set r2 = ws2.Range(ws2.Cells(1, ws2.Columns.Count).End(xlToLeft), _
                           ws2.Cells(ws2.Rows.Count, 1).End(xlUp))

        ' Get data into arrays (much faster than reading cell by cell)
        v1 = r1.Value2
        v2 = r2.Value2

        For rw1 = 1 To UBound(v1, 1)
            For rw2 = 1 To UBound(v2, 1)
                ' A row matches only if every compared column is equal
                Found = True
                For cl = 1 To NUM_COLS_COMPARE
                    If v1(rw1, cl) <> v2(rw2, cl) Then
                        Found = False
                        Exit For
                    End If
                Next cl
                If Found Then Exit For
            Next rw2

            ' List rows of book 1 that have no match in book 2
            If Not Found Then
                Debug.Print "No Match for " & rw1, v1(rw1, 1)
            End If
        Next rw1
    End Sub
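
Since the question asks for C#, the same idea can be sketched there too. The nested loops above do roughly rows(A) x rows(B) comparisons; putting the rows of B into a `HashSet<string>` first reduces this to roughly one lookup per row of A. This is only a minimal sketch under some assumptions: both sheets are already loaded into `DataTable` objects (as in Update 1), rows are compared as text on their first `columns` cells, and the separator character never occurs in the data. The names `RowDiff`, `KeyOf` and `RowsMissingFrom` are hypothetical and are not the OP's `getDifferentRecords`.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class RowDiff
{
    // Hypothetical helper: builds a text key from the first `columns` cells of a row.
    // Assumption: the unit separator '\u001F' never appears in the cell text.
    private static string KeyOf(DataRow row, int columns)
    {
        return string.Join("\u001F",
            row.ItemArray.Take(columns).Select(c => Convert.ToString(c)));
    }

    // Returns the rows of `a` whose key does not occur in any row of `b`.
    public static List<DataRow> RowsMissingFrom(DataTable a, DataTable b, int columns)
    {
        // Index every row of B once
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (DataRow row in b.Rows)
            seen.Add(KeyOf(row, columns));

        // One hash lookup per row of A instead of a scan over B
        var missing = new List<DataRow>();
        foreach (DataRow row in a.Rows)
            if (!seen.Contains(KeyOf(row, columns)))
                missing.Add(row);
        return missing;
    }
}
```

With the tables from the question, `RowDiff.RowsMissingFrom(dt1, dt2, 1)` would return the rows of `dt1` whose first column has no counterpart in `dt2`, which corresponds to `NUM_COLS_COMPARE = 1` in the VBA version. Because the keys are compared as text, a numeric cell and a text cell holding the same digits count as equal here.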