我有一个包含200多个变量(列)的Excel(2010)数据文件,以及超过1,000条记录(行),每个记录都标有唯一的ID号。但是,我怀疑这些记录中的一些是捏造的,即有人拿现有记录,复制它,只改变了几个数字,使它有点不同。因此,我需要生成一个矩阵,向我显示相同值的数量/百分比"在每个记录和所有其他记录之间(例如,记录1和记录2共享75个相等的值,记录1和记录3共享57个相等的值,记录2和记录3共享45个相等的值等)。我有一些解决方法,但它们需要数小时而且不会产生简单的矩阵。我不关心价值观之间的区别 - 只是它们是否相等。任何想法将不胜感激!
答案 0 :(得分:1)
不知道这对大型数据集的表现如何,但是:
Sub T()
Dim d, m(), nR As Long, nC As Long, r As Long, r2 As Long, c As Long
Dim v1, v2, i As Long
d = Sheet1.Range("A1").CurrentRegion.Value
nR = UBound(d, 1)
nC = UBound(d, 2)
ReDim m(1 To nR, 1 To nR)
For r = 1 To nR
For r2 = r To nR
i = 0
For c = 1 To nC
v1 = d(r, c): If IsError(v1) Then v1 = "Error!"
v2 = d(r2, c): If IsError(v2) Then v2 = "Error!"
If v1 = v2 Then i = i + 1
Next c
m(r2, r) = i
Next r2
Next r
With Sheet2
.Range("B2").Resize(nR, nR).Value = m
'assuming your id's are in the first column...
For r = 1 To nR
.Cells(1 + r, 1) = d(r, 1)
.Cells(r, r + 1) = d(r, 1)
Next r
End With
End Sub
答案 1 :(得分:0)
我对HTML有点沉闷并发帖......不是程序员,所以请宽恕一切......
Sub CalculateDuplicationBetweenRecords()
Dim myCases As Long
Dim myVariables As Long
Dim myCurrentCase As Long
Dim myComparisonCase As Long
Dim myCurrentVariable As Long
Dim myCurrentCell As Long
Dim myComparisonCell As Long
Dim myCounter As Long
' Would be nice to automate number of cases and variables...
myCases = 88
myVariables = 81
' Insert case #1 id for results matrix - cosmetic...
Worksheets("Sheet2").Cells(1, 2).Value = Worksheets("Sheet1").Cells(1, 1).Value
For myCurrentCase = 1 To myCases - 1
For myComparisonCase = myCurrentCase + 1 To myCases
myCounter = 0
For myCurrentVariable = 1 To myVariables
myCurrentCell = Worksheets("Sheet1").Cells(myCurrentCase, myCurrentVariable).Value: If IsError(myCurrentCell) Then myCurrentCell = "Error!"
myComparisonCell = Worksheets("Sheet1").Cells(myComparisonCase, myCurrentVariable).Value: If IsError(myComparisonCell) Then myComparisonCell = "Error!"
If myCurrentCell = myComparisonCell Then myCounter = myCounter + 1
Next myCurrentVariable
Worksheets("Sheet2").Cells(myCurrentCase + 1, 1).Value = Worksheets("Sheet1").Cells(myCurrentCase, 1).Value
Worksheets("Sheet2").Cells(1, myComparisonCase + 1).Value = Worksheets("Sheet1").Cells(myComparisonCase, 1).Value
Worksheets("Sheet2").Cells(myCurrentCase + 1, myComparisonCase + 1).Value = myCounter
Next myComparisonCase
Next myCurrentCase
End Sub