我试图找到一种方法来比较每个单元格中的第一个重要单词与下一个单元格中的第一个重要单词,如果第一个重要单词匹配,则删除第二个条目。例如,起始数据可能如下所示:
通用电气
通用电气公司
通用电气公司
微软
Microsoft Corporation
Microsoft服务器
任天堂
任天堂企业
结果应该看起来像这样:
通用电气
微软
任天堂
到目前为止,我已经设置了遍历数据列的代码:
Sub CompanyNameConsolidate()
Dim companyName As String
Dim companyArray() As String
Dim companyName2 As String
Dim companyArray2() As String
Dim totalArray() As String
Dim wordCount As Integer
Dim i As Integer
Dim r As Range
With Sheets("Unassigned")
Range("B1").Select
Do Until IsEmpty(ActiveCell)
companyName = Range("B" & ActiveCell.Row).Text
companyName2 = ActiveCell.Offset(1, 0).Text
companyArray = Split(companyName, " ")
companyArray2 = Split(companyName2, " ")
wordCount = UBound(companyArray) - LBound(companyArray)
For i = 0 To wordCount
If companyArray(i) = companyArray2(i) Then
[*********HELP**********]
Next
ActiveCell.Offset(1, 0).Select
Loop
End With
End Sub
基本上,上面的代码将每个单元格中的子串与下一个单元格中的子串进行比较。不幸的是,就我而言。
棘手的是,一些公司名称可以有两个单词(通用电气),其他公司名称只能有一个单词(微软)。
您可以假设列表将按字母顺序排序,因此最短的名称(我想保留的名称)将始终位于顶部。
我有超过16,000个条目要经历和修复,所以我绝对必须有一个自动化的方法来做到这一点!
答案 0 :(得分:1)
找到一个公司名称的常见恶魔,根据你的例子,这似乎只是删除了最后一个单词,如果它超过1个单词。
Dim listOfCompanies As New Collection
Dim companyName As String
Dim companyArray As Variant
Dim item As Variant
Dim i as Integer, j As Integer
'The 2 denotes column B, where i denotes the row
'You can change this outter loop to your specific needs, this one just processes the first column B1, to when it encounters a blank row
while(ThisWorkbook.Worksheets("Unassigned").Cells(i, 2).Value <> "")
companyName = ThisWorkbook.Worksheets("Unassigned").Cells(i, 2).Value
companyArray = Split(companyName, " ")
companyName = ""
'This truncates the last word off
for j = 0 to UBound(companyArray) - 1
companyName = companyName + companyArray(j) + " "
next j
'Trim off the last space character
companyName = Trim(companyName)
'Now Add your companyName string to a Dictionary Object
'VBA will throw an error if a duplicate gets added, but this is okay and we can continue processing
On Error Resume Next
listOfCompanies.Add(companyName)
On Error Goto 0 'This resets the handler in case an error occurs somewhere else unexpectedly
i = i + 1
wend
'Now we can do a ForEach and spit out the entire 'unique list'
For Each item in listOfCompanies
'Your code here
Next item
答案 1 :(得分:1)
使用行删除的第一个变体:
Sub test()
Dim Dic As Object: Set Dic = CreateObject("Scripting.Dictionary")
Dim x&, cl As Range, DataRange As Range, k1, k2
Dic.comparemode = vbTextCompare
With Sheets("Unassigned")
Set DataRange = .[B1].Resize(.Cells(Rows.Count, "B").End(xlUp).Row, 2)
x = 1
For Each cl In DataRange
If cl.Value <> "" Then
Dic.Add x, cl.Value
x = x + 1
End If
Next cl
For Each k1 In Dic
For Each k2 In Dic
If IsNumeric(k1) And IsNumeric(k2) Then
If Dic(k2) Like Dic(k1) + "*" And k2 > k1 Then
Dic.Remove (k2)
End If
If Not Dic.exists(Dic(k1)) Then Dic.Add Dic(k1), Nothing
End If
Next k2, k1
x = Split(DataRange.Address, "$")(4)
While x <> 0
If Not Dic.exists(.Cells(x, "B").Value) Then .Rows(x).Delete
x = x - 1
Wend
End With
End Sub
使用Workbook.Add
的第二个变体:
Sub test2()
Dim Dic As Object: Set Dic = CreateObject("Scripting.Dictionary")
Dim x&, cl As Range, DataRange As Range, k1, k2
Dic.comparemode = vbTextCompare
With Sheets("Unassigned")
Set DataRange = .[B1].Resize(.Cells(Rows.Count, "B").End(xlUp).Row, 2)
x = 1
For Each cl In DataRange
If cl.Value <> "" Then
Dic.Add x, cl.Value
x = x + 1
End If
Next cl
For Each k1 In Dic
For Each k2 In Dic
If Dic(k2) Like Dic(k1) + "*" And k2 > k1 Then
Dic.Remove (k2)
End If
Next k2, k1
End With
Workbooks.Add
x = 1
For Each k1 In Dic
Cells(x, 2) = Dic(k1)
x = x + 1
Next k1
End Sub
测试两种变体