Word frequency across the cells in a range

Time: 2016-07-21 17:50:03

Tags: excel vba text

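I have a column of about 50 cells. Each cell contains a block of text, anywhere from 3 to 8 sentences.

I would like to build a list of the words being used and get their frequency across the entire range (A1:A50).

I have tried adapting code I found in other posts, but it seems to be written for cells that contain a single word rather than several words.

Here is the code I found and am trying to use.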

Sub Ftable()
Dim BigString As String, I As Long, J As Long, K As Long
Dim Selection As Range

Set Selection = ThisWorkbook.Sheets("Sheet1").Columns("A")
BigString = ""
For Each r In Selection
     BigString = BigString & " " & r.Value
Next r
BigString = Trim(BigString)
ary = Split(BigString, " ")
Dim cl As Collection
Set cl = New Collection
For Each a In ary
    On Error Resume Next
    cl.Add a, CStr(a)
Next a

For I = 1 To cl.Count
    v = cl(I)
    ThisWorkbook.Sheets("Sheet2").Cells(I, "B").Value = v
    J = 0
    For Each a In ary
        If a = v Then J = J + 1
    Next a
    ThisWorkbook.Sheets("Sheet2").Cells(I, "C") = J
Next I
End Sub

2 Answers:

Answer 0: (score: 1)

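I think a dictionary is the best way to handle this, because you can test whether the dictionary already contains an item. Post back if there is anything you don't follow.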

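Sub CountWords()

' Give every variable its own type: "Dim a, b As Long" leaves a as a Variant
Dim dictionary As Object
Dim sentence() As String
Dim lastRow As Long, rowCounter As Long
Dim x As Long, y As Long
Dim ws As Worksheet, destination As Worksheet
Dim word As Variant

Set ws = Sheets("Put the source sheet name here")
Set destination = Sheets("Put the destination sheet name here")

' Output starts in row 2 of the destination sheet
rowCounter = 2
' Last used row in column A of the source sheet
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

' Late binding, so no library reference is required
Set dictionary = CreateObject("Scripting.Dictionary")

' Starts at row 2, which skips row 1 (e.g. a header); use 1 if the text begins in A1
For x = 2 To lastRow
    ' Break the cell text into space-separated words
    sentence = Split(ws.Cells(x, 1).Value, " ")
    For y = 0 To UBound(sentence)
        If Not dictionary.Exists(sentence(y)) Then
            ' First time this word is seen
            dictionary.Add sentence(y), 1
        Else
            dictionary.Item(sentence(y)) = dictionary.Item(sentence(y)) + 1
        End If
    Next y
Next x

' Write each word to column A and its count to column B
For Each word In dictionary.Keys
    destination.Cells(rowCounter, 1) = word
    destination.Cells(rowCounter, 2) = dictionary.Item(word)
    rowCounter = rowCounter + 1
Next word

End Sub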

Answer 1: (score: 0)

Try this (it worked for me on some long blocks of Lorem Ipsum text):

Sub Ftable()
Dim BigString As String, I As Long, J As Long, K As Long
Dim countRange As Range

Set countRange = ThisWorkbook.Sheets("Sheet1").Range("A1:A50")
BigString = ""
For Each r In countRange.Cells
     BigString = BigString & " " & r.Value
Next r
BigString = Trim(BigString)
ary = Split(BigString, " ")
Dim cl As Collection
Set cl = New Collection
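' Adding a key that already exists raises an error; ignore it so each word is kept once
On Error Resume Next
For Each a In ary
    cl.Add a, CStr(a)
Next a
' Restore normal error handling for the rest of the procedure
On Error GoTo 0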

For I = 1 To cl.Count
    v = cl(I)
    ThisWorkbook.Sheets("Sheet2").Cells(I, "B").Value = v
    J = 0
    For Each a In ary
        If a = v Then J = J + 1
    Next a
    ThisWorkbook.Sheets("Sheet2").Cells(I, "C") = J
Next I
End Sub

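I trimmed it down to look only at the 50 cells that hold your data, rather than all of the more than 1 million cells in the column. I also fixed an issue where r ended up as a one-element array instead of a single-cell Range; looping over countRange.Cells gives one cell at a time. I renamed "Selection" to "countRange" because Selection is already defined by the application, so it was a poor variable name.

Also, note that your code reads from "Sheet1" and writes to columns B and C of "Sheet2". Make sure you rename your worksheets or edit these values, otherwise you will get errors or overwrite data.

This is how I would solve the problem: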

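Sub Ftable()
' Early binding: needs a reference to Microsoft Scripting Runtime
Dim wordDict As New Dictionary
Dim r As Range
Dim countRange As Range
Dim str As Variant
Dim strArray() As String

Set countRange = ThisWorkbook.Sheets("Sheet1").Range("A1:A50")

' Tally every space-separated word in every cell
For Each r In countRange.Cells
    strArray = Split(Trim(r.Value), " ")

    For Each str In strArray
        ' Count "The" and "the" as the same word
        str = LCase(str)
        If Len(str) = 0 Then
            ' Consecutive spaces produce empty strings; ignore them
        ElseIf wordDict.Exists(str) Then
            wordDict(str) = wordDict(str) + 1
        Else
            wordDict.Add str, 1
        End If
    Next str
Next r

' Write each word to column B and its count to column C, starting at B1
Set r = ThisWorkbook.Sheets("Sheet2").Range("B1")
For Each str In wordDict.Keys()
    r.Value = str
    r.Offset(0, 1).Value = wordDict(str)
    Set r = r.Offset(1, 0)
Next str

Set wordDict = Nothing
End Sub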

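It uses a Dictionary, so make sure you add a reference to the library (Tools > References > Microsoft Scripting Runtime). It also forces everything to lowercase. A big problem with the old code was that it did not correctly count the capitalized and uncapitalized versions of a word, which meant it missed a lot of words. If you don't want that, remove str = LCase(str).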

Bonus: this approach ran about 8 times faster on my test sheet.
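
If you would rather not add the library reference mentioned above, a late-bound dictionary should work as well. The following is only a sketch under a few assumptions: the name FtableLateBound is made up for illustration, the sheet names and the A1:A50 range are the same ones used earlier, and case-insensitive matching is done by setting the dictionary's CompareMode instead of calling LCase. It has not been timed against the version above.

```vba
' Hypothetical late-bound variant; no Tools > References entry is needed.
Sub FtableLateBound()
    Dim wordDict As Object
    Dim r As Range
    Dim parts As Variant
    Dim word As Variant

    Set wordDict = CreateObject("Scripting.Dictionary")
    ' Treat "The" and "the" as the same key.
    ' CompareMode can only be changed while the dictionary is still empty.
    wordDict.CompareMode = vbTextCompare

    ' Assumed source range, same as in the code above
    For Each r In ThisWorkbook.Sheets("Sheet1").Range("A1:A50").Cells
        parts = Split(Trim(r.Value), " ")
        For Each word In parts
            If Len(word) > 0 Then          ' skip blanks left by double spaces
                If wordDict.Exists(word) Then
                    wordDict.Item(word) = wordDict.Item(word) + 1
                Else
                    wordDict.Add word, 1
                End If
            End If
        Next word
    Next r

    ' Assumed output location: columns B and C of Sheet2, starting at B1
    Set r = ThisWorkbook.Sheets("Sheet2").Range("B1")
    For Each word In wordDict.Keys
        r.Value = word
        r.Offset(0, 1).Value = wordDict.Item(word)
        Set r = r.Offset(1, 0)
    Next word
End Sub
```

With CompareMode set this way, each word is listed with the capitalization of its first occurrence rather than forced to lowercase. Punctuation is still attached to words in this sketch, so "end." and "end" are counted separately, just as in the versions above.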