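Compare two files and output the differences

Date: 2017-09-18 16:29:22

Tags: excel-vba vba excel

Every day I receive a report of about 8,500 lines, and I have been trying to write a macro that does what I currently do by hand. The trouble with the report is that the lines do not all share the same format (e.g. line 1: number, text, text, number; line 2: text, number, number, text).

I want to compare the new file against the old one and output whatever is new. I can get the macro to run through both files, but it does not flag any line as different, even though I know some are.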

Sub test()

Dim yesterdayFile As String
Dim todayFile As String

yesterdayFile = Application.GetOpenFilename()
todayFile = Application.GetOpenFilename()
Dim yesterdayLine As String
Dim todayLine As String
Dim txt As String
Dim i, j, k, sameLine As Integer
Dim wkbTemp As Workbook
i = 1
j = 1
k = 1
sameLine = 0


Open yesterdayFile For Input As #1
Do Until EOF(1)
    sameLine = 1 'reset write operator
    Open todayFile For Input As #2
Line Input #1, yesterdayLine
    Do Until EOF(2)
        Line Input #2, todayLine
        If StrComp(yesterdayLine, todayLine) = 0 Then 'compare lines in files if same then flag write operator to 1
           sameLine = 1
        End If
        j = j + 1 'inner loop counter
    Loop
If sameLine = 0 Then 'if write operator is not active then output line
    Cells(i, 1) = yesterdayLine
    i = i + 1 'counter for cells
End If
Close #2
k = k + 1 'outer loop counter
Loop

'test line to see if its eof
Cells(1, 10) = i
Cells(2, 10) = j
Cells(3, 10) = k

Close #1



End Sub

Test files for running tests faster:

Yesterday's file:

10001,April,Apple
10002,Book,Bush
10004,Dog,Days
10006,Free,Food
10008,Happy,Help
10009,Ikky,Icing
10010,Jamming,Jupiter

Today's file:

10001,April,Apple
10002,Book,Bush
10003,Cat,Cattle
10004,Dog,Days
10005,Echo,Eggg
10006,Free,Food
10007,Good,Game
10008,Happy,Help
10009,Ikky,Icing
10010,Jamming,Jupiter

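Note: the real data has no "unique ID" field.

The counters end at the correct numbers, so I know the loops are running through both files.

Edit: I know this could be done easily in other languages, but I only have access to Excel VBA from my work terminal and cannot take the files off the network.

2 answers:

Answer 0: (score: 0)

If I understand correctly, you are working with 4 fields in the new and old workbooks, assuming each wb has the same number of rows/tasks. It isn't pretty, but you could evaluate something like: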

Dim i As Long, j As Long, k As Long, l As Long, m As Long

'j, k, l and m flag whether new column 1, 2, 3 or 4 matches any of the four old columns in row i
If wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(i, 1).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(i, 2).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(i, 3).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(i, 4).Value Then
    j = 1
End If

If wbNew.shNew.Cells(i, 2).Value = wbOld.shOld.Cells(i, 1).Value Or wbNew.shNew.Cells(i, 2).Value = wbOld.shOld.Cells(i, 2).Value Or wbNew.shNew.Cells(i, 2).Value = wbOld.shOld.Cells(i, 3).Value Or wbNew.shNew.Cells(i, 2).Value = wbOld.shOld.Cells(i, 4).Value Then
    k = 1
End If

If wbNew.shNew.Cells(i, 3).Value = wbOld.shOld.Cells(i, 1).Value Or wbNew.shNew.Cells(i, 3).Value = wbOld.shOld.Cells(i, 2).Value Or wbNew.shNew.Cells(i, 3).Value = wbOld.shOld.Cells(i, 3).Value Or wbNew.shNew.Cells(i, 3).Value = wbOld.shOld.Cells(i, 4).Value Then
    l = 1
End If

If wbNew.shNew.Cells(i, 4).Value = wbOld.shOld.Cells(i, 1).Value Or wbNew.shNew.Cells(i, 4).Value = wbOld.shOld.Cells(i, 2).Value Or wbNew.shNew.Cells(i, 4).Value = wbOld.shOld.Cells(i, 3).Value Or wbNew.shNew.Cells(i, 4).Value = wbOld.shOld.Cells(i, 4).Value Then
    m = 1
End If

'all four flags set: every field in the new row was found in the old row
If (j + k + l + m) = 4 Then
    wbNew.shNew.Rows(i).Interior.Color = 2
End If

'reset the flags before the next row
j = 0
k = 0
l = 0
m = 0

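Here wbNew and shNew stand for the newest workbook and its sheet, and wbOld and shOld for yesterday's workbook and sheet. All of this would sit inside a loop over the rows, so you will need to find the last row.

You could also take an approach that uses Find() or Match(), for example: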
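Finding the last row and looping over it could be sketched as follows. This is only an illustration: the Sub name LoopToLastRow is made up, and it assumes the new data sits on the active sheet with column A filled for every row.

```vba
Option Explicit

Sub LoopToLastRow()
    Dim shNew As Worksheet
    Dim lastRow As Long, i As Long

    ' Assumption: the new report is on the active sheet
    Set shNew = ActiveSheet

    ' Last used row in column A, found by jumping up from the bottom of the sheet
    lastRow = shNew.Cells(shNew.Rows.Count, 1).End(xlUp).Row

    For i = 1 To lastRow
        ' The per-row comparisons would go here; this just prints column A
        Debug.Print i, shNew.Cells(i, 1).Value
    Next i
End Sub
```

If column A can be empty on some rows, a column that is always populated would be a safer choice for the End(xlUp) lookup.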

z = Application.Match(wbNew.shNew.Cells(i, 1).Value, wbOld.shOld.Columns(1), 0) 'row position in old column 1, or an error value if not found; declare z As Variant
If Not IsError(z) Then
    If wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(z, 1).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(z, 2).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(z, 3).Value Or wbNew.shNew.Cells(i, 1).Value = wbOld.shOld.Cells(z, 4).Value Then
        j = 1
    End If
End If

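This works better if each row contains something unique: you can then find z and do the comparison while looping through only one workbook/sheet.

Edit

Adding an example of looping through rows and columns (nested loops) and coloring the cell interior when a condition is true: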

Dim r as long, c as Long

For r = 1 to LR
    For c = 1 to LC
        If Cells(r, c).Value = "Moo" Then
            If Cells(r, c).Interior.Color <> 2 Then
                Cells(r, c).Interior.Color=2
            End If
        End If
    Next c
Next r

It may also help to dimension a variable for the cell value you are working with, e.g.:

Dim y as Variant

y = wbNew.shNew.Cells(r, c).Value

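That makes editing easier, imo.

Answer 1: (score: 0)

So after a lot of trial and error I ended up answering my own question. Thanks for all the replies, but going from 72 million iterations to 3 billion just isn't an option.

My code ended up looking like this: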

Sub test()
'Freeze window
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

'pick the two files to compare
Dim fileA As String, fileB As String
fileA = Application.GetOpenFilename()
fileB = Application.GetOpenFilename()

'setting variables
Dim lineA As String, lineB As String, DQ As String 'read-in lines and double quote character
Dim i As Long, sameLine As Integer 'output row counter and match flag (a Boolean would also work)
Dim count As Long, counter As Long 'delimiter count and column counter
Dim newLine As Variant 'array of fields returned by Split
i = 1
DQ = Chr(34) 'character 34 is "

Open fileA For Input As #1 'open file 1 for reading
Do Until EOF(1) 'outer loop to run through file 1
    sameLine = 0 'reset write flag
    Open fileB For Input As #2 'open file 2 for reading
    Line Input #1, lineA 'read in line from file 1
    Do Until EOF(2) 'inner loop to run through file 2
        Line Input #2, lineB 'read in line from file 2
        If StrComp(lineA, lineB) = 0 Then 'if the lines are identical, set the flag to 1
            sameLine = 1
        End If
    Loop
    If sameLine = 0 Then 'no match found in file 2, so output the line
        count = Len(lineA) - Len(Replace(lineA, "|", "")) 'number of | delimiters in the line
        lineA = Replace(lineA, DQ, "") 'remove all double quotes from the line
        newLine = Split(lineA, "|") 'split the line into an array with | as delimiter
        For counter = 1 To count 'write the first count fields (the field after the last | is not written)
            Cells(i, counter) = newLine(counter - 1)
        Next counter
        i = i + 1 'next output row
    End If
    Close #2
Loop
Close #1
'unfreezing window
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub

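This works for both .txt and .csv files, and it compares the input lines directly before writing anything to a cell. My problem was unusual in that every line ended with a timestamp, which I added a few lines to handle.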
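The nested loops above reopen and reread the second file for every line of the first, which is where the tens of millions of iterations come from. One alternative, sketched here and not the method used in this answer, is to load one file into a Scripting.Dictionary used as a set and stream the other file against it, so that each file is read only once. The names LoadLines and CompareWithDictionary are hypothetical, and the sketch assumes a Windows machine where CreateObject("Scripting.Dictionary") is available. It writes each new line unsplit into column A, and it does nothing about trailing timestamps, which would have to be stripped before the lines are used as keys.

```vba
Option Explicit

' Hypothetical helper: read every line of a text file into a Dictionary used as a set
Private Function LoadLines(ByVal filePath As String) As Object
    Dim seen As Object
    Dim f As Integer
    Dim textLine As String

    Set seen = CreateObject("Scripting.Dictionary") ' assumption: Windows Scripting runtime is present
    f = FreeFile
    Open filePath For Input As #f
    Do Until EOF(f)
        Line Input #f, textLine
        If Not seen.Exists(textLine) Then seen.Add textLine, True
    Loop
    Close #f

    Set LoadLines = seen
End Function

' Hypothetical entry point: list lines of the new file that do not occur in the old file
Sub CompareWithDictionary()
    Dim oldFile As Variant, newFile As Variant
    Dim oldLines As Object
    Dim f As Integer
    Dim textLine As String
    Dim outRow As Long

    oldFile = Application.GetOpenFilename()         ' pick yesterday's file
    If VarType(oldFile) = vbBoolean Then Exit Sub   ' dialog was cancelled
    newFile = Application.GetOpenFilename()         ' pick today's file
    If VarType(newFile) = vbBoolean Then Exit Sub

    Set oldLines = LoadLines(CStr(oldFile))

    outRow = 1
    f = FreeFile
    Open CStr(newFile) For Input As #f
    Do Until EOF(f)
        Line Input #f, textLine
        ' One dictionary lookup per line instead of a full pass over the old file
        If Not oldLines.Exists(textLine) Then
            ActiveSheet.Cells(outRow, 1).Value = textLine
            outRow = outRow + 1
        End If
    Loop
    Close #f
End Sub
```

The dictionary's default comparison is binary, so matching is case-sensitive, the same as StrComp with its default arguments in a module without Option Compare Text.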