我正在尝试将(90%)部分文本字符串从工作表列匹配到另一个工作表列,并将最终结果带到主工作表列。 我发现了一个VBA解决方案,但我遇到了一些问题。 1)它匹配精确的文本 2)找到匹配两个不同表格列的问题。
请帮我解决这个问题。
Sub lookup()
Dim TotalRows As Long
Dim rng As Range
Dim i As Long
'Copy lookup values from sheet1 to sheet3
Sheets("BANK STATEMENT ENTRY").Select
TotalRows = ActiveSheet.UsedRange.Rows.Count
Range("F3:F" & TotalRows).Copy Destination:=Sheets("TEST").Range("A1")
'Go to the destination sheet
Sheets("TEST").Select
For i = 1 To TotalRows
'Search for the value on sheet2
Set rng = Sheets("INFO").UsedRange.Find(Cells(i, 1).Value)
'If it is found put its value on the destination sheet
If Not rng Is Nothing Then
Cells(i, 2).Value = rng.Value
End If
Next
End Sub
答案 0 :(得分:0)
我已经完成了一个文本挖掘项目,我知道你不能使用这种方法,你必须将字符串分解为子字符串,然后分析它们。这将是一个完整的项目,但你很幸运,因为我为你做了。
让我们简化问题并说你有两个字符串范围,并且你想找到两个组之间的每个相似的字符串。此外,您希望具有最小化匹配对的容差。
假设ABCDE和12BCD00。它们共有B,C,D,BC,CD和BCD。因此,最长的公共子串是BCD,其为3个字符:3 / ABCDE(5)的长度与第一个字符串的60%相似性和3/7 = 43%的相似性。因此,如果您可以在两个范围内的所有字符串中获取所有这些常见子字符串的列表,您可以提供更好的列表来过滤并获得您想要的内容。
我写了一堆函数。要轻松使用它,只需将两组字符串复制并粘贴到一个工作表中,并在同一工作表上生成最终报告,以了解其工作原理。
函数FuzzyFind,找到所有常见的子字符串,并为您提供Group1 / range1中的第一个字符串,group2 / range2中的第二个字符串,公共子字符串以及两个字符串的相似百分比。好的是你可以告诉函数你想要的子串有多小,例如:在前面的例子中,如果你说iMinCommonSubLength = 3,它只会给你BCD,如果你说iMinCommonSubLength = 2它会给你BC,CD和BCD等等。
使用功能Main。我还包括一个测试子。
<强>功能强>
Sub TestIt()
Call Main(ActiveSheet.Range("A1:A10"), ActiveSheet.Range("B1:B10"), 4, ActiveSheet.Range("D1"))
End Sub
Sub Main(rng1 As Range, rng2 As Range, iMinCommonSubLength As Integer, Optional rngReportUpperLeftCell As Range)
Dim arr() As Variant
Dim rngReport As Range
If rngReport Is Nothing Then Set rngReport = ActiveSheet.Range("A1")
arr = FuzzyFind(rng1, rng2, iMinCommonSubLength)
Set rngReport = rngReportUpperLeftCell.Resize(UBound(arr, 1), UBound(arr, 2))
rngReport.Value = arr
rngReport.Columns(1).NumberFormat = "@"
rngReport.Columns(2).NumberFormat = "@"
rngReport.Columns(3).NumberFormat = "@"
rngReport.Columns(4).NumberFormat = "0%"
rngReport.Columns(5).NumberFormat = "0%"
End Sub
Function GetCharacters(str As String) As Variant
Dim arr() As String
ReDim arr(Len(str) - 1)
For i = 1 To Len(str)
arr(i - 1) = Mid$(UCase(str), i, 1)
Next
GetCharacters = arr
End Function
Function GetIterations(iStringLength As Integer, iSubStringLength As Integer) As Integer
If iStringLength >= iSubStringLength Then
GetIterations = iStringLength - iSubStringLength + 1
Else
GetIterations = 0
End If
End Function
Function GetSubtrings(str As String, iSubLength As Integer) As Variant
Dim i As Integer
Dim count As Integer
Dim arr() As Variant
count = GetIterations(Len(str), iSubLength)
ReDim arr(1 To count)
For i = 1 To count
arr(i) = Mid(str, i, iSubLength)
Next i
GetSubtrings = arr()
End Function
Function GetLongestCommonSubStrings(str1 As String, str2 As String, iMinCommonSubLeng As Integer)
Dim i As Integer
Dim iLongestPossible As Integer
Dim iShortest As Integer
Dim arrSubs() As Variant
Dim arr1() As Variant
Dim arr2() As Variant
ReDim arrSubs(1 To 1)
'Longest possible common substring length is the smaller string's length
iLongestPossible = IIf(Len(str1) > Len(str2), Len(str2), Len(str1))
If iLongestPossible < iMinCommonSubLeng Then
'MsgBox "Minimum common substring length is larger than the shortest string." & _
' " You have to choose a smaller common length", , "Error"
Else
'We will try to find the first match of common substrings of two given strings, exit after the first match
For i = iLongestPossible To iMinCommonSubLeng Step -1
arr1 = GetSubtrings(str1, i)
arr2 = GetSubtrings(str2, i)
ReDim arrSubs(1 To 1)
arrSubs = GetCommonElement(arr1, arr2)
If arrSubs(1) <> "" Then Exit For 'if you want JUST THE LONGEST MATCH, comment out this line
Next i
End If
GetLongestCommonSubStrings = arrSubs
End Function
Function GetCommonElement(arr1() As Variant, arr2() As Variant) As Variant
Dim i As Integer
Dim j As Integer
Dim count As Integer
Dim arr() As Variant
count = 1
ReDim arr(1 To count)
For i = 1 To UBound(arr1)
For j = 1 To UBound(arr2)
If arr1(i) = arr2(j) Then
ReDim Preserve arr(1 To count)
arr(count) = arr1(i)
count = count + 1
End If
Next j
Next i
GetCommonElement = arr
End Function
Function FuzzyFind(rng1 As Range, rng2 As Range, iMinCommonSubLength As Integer) As Variant
Dim count As Integer
Dim i As Integer
Dim arrSubs As Variant
Dim str1 As String
Dim str2 As String
Dim cell1 As Range
Dim cell2 As Range
Dim rngReport As Range
Dim arr() As Variant 'array of all cells that are partially matching, str1, str2, common string, percentage
count = 1
ReDim arr(1 To 5, 1 To count)
For Each cell1 In rng1
str1 = UCase(CStr(cell1.Value))
If str1 <> "" Then
For Each cell2 In rng2
str2 = UCase(CStr(cell2.Value))
If str2 <> "" Then
ReDim arrSubs(1 To 1)
arrSubs = GetLongestCommonSubStrings(str1, str2, iMinCommonSubLength)
If arrSubs(1) <> "" Then
For i = 1 To UBound(arrSubs)
arr(1, count) = cell1.Value
arr(2, count) = cell2.Value
arr(3, count) = arrSubs(i)
arr(4, count) = Len(arrSubs(i)) / Len(str1)
arr(5, count) = Len(arrSubs(i)) / Len(str2)
count = count + 1
ReDim Preserve arr(1 To 5, 1 To count)
Next i
End If
End If
Next cell2
End If
Next cell1
FuzzyFind = TransposeArray(arr)
End Function
Function TransposeArray(arr As Variant) As Variant
Dim arrTemp() As Variant
ReDim arrTemp(LBound(arr, 2) To UBound(arr, 2), LBound(arr, 1) To UBound(arr, 1))
For a = LBound(arr, 2) To UBound(arr, 2)
For b = LBound(arr, 1) To UBound(arr, 1)
arrTemp(a, b) = arr(b, a)
Next b
Next a
TransposeArray = arrTemp
End Function
在生成新报告之前,请不要忘记清除工作表。插入一张桌子并使用其自动过滤器轻松过滤您的东西。
最后但同样重要的是,不要忘记点击复选标记以宣布这是您问题的答案。