I have a list in Excel, a subset of which looks like this:
Food and Human Nutrition
Food and Human Nutrition with Placement
Food and Nutrition with Professional Experience
Food Marketing and Nutrition
Food Marketing and Nutrition with Placement
Food, Nutrition and Health
I want to find the n most common words in this list. I tried to find the single most common word with this formula:
=INDEX(rng,MODE(MATCH(rng,rng,0)))
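The problem is that it treats each cell as a single string, and since each of the 6 rows is different, it cannot find a most common word. What I would like it to do is output 'Food', 'Nutrition' and 'and' as the most common words, followed by 'Marketing', 'Placement', 'with' and so on.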
Answer 0 (score: 1)
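If you know VBA and are willing to use it, this is a fairly simple task. A custom formula such as =MostCommonWords(Range;Optional WordsNumber) could give you the result you are after.
Here is the code behind the formula: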
Public Function MostCommonWords(inputRange As Range, _
                                Optional NumberOfWords As Long = 1) As String

    Dim myCell As Range
    Dim inputString As String, tempString As String, myResult As String
    Dim myArr As Variant, myKey As Variant
    Dim cnt As Long, topNumber As Long
    Dim myColl As Object

    Set myColl = CreateObject("Scripting.Dictionary")

    ' Join all cells into one lower-case string, dropping commas
    For Each myCell In inputRange
        tempString = LCase(Replace(myCell, ",", ""))
        inputString = inputString & " " & tempString
    Next myCell

    ' Count how often each word occurs
    myArr = Split(inputString)
    For cnt = LBound(myArr) To UBound(myArr)
        If Len(myArr(cnt)) > 0 Then     ' skip blanks left by extra spaces
            If myColl.exists(myArr(cnt)) Then
                myColl(myArr(cnt)) = myColl(myArr(cnt)) + 1
            Else
                myColl.Add myArr(cnt), 1
            End If
        End If
    Next cnt

    ' Pick the most frequent remaining word, NumberOfWords times
    For cnt = 1 To NumberOfWords
        If myColl.Count = 0 Then Exit For   ' fewer distinct words than requested
        topNumber = 0
        myResult = vbNullString
        For Each myKey In myColl
            If topNumber < myColl(myKey) Then
                topNumber = myColl(myKey)
                myResult = myKey
            End If
        Next myKey
        MostCommonWords = MostCommonWords & " " & myResult
        myColl.Remove myResult
    Next cnt

    MostCommonWords = Trim$(MostCommonWords)   ' drop the leading space
End Function
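How does it work?
It joins the values of every cell in the selected range into a single string called inputString. That string is then split into an array of words, and each word is counted in a dictionary. Finally, a loop runs once per requested word: it picks the word with the highest count, appends it to the result, and removes it from the dictionary with myColl.Remove myResult so that the next pass finds the next most common word.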
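The counting step described above can be tried on its own, which may help when debugging the function. The following is a minimal sketch rather than part of the answer's code: CountWords and DemoCountWords are hypothetical names, and the sample text is made up from two of the rows in the question.

```vba
Option Explicit

' Hypothetical helper (not part of the answer's code): counts how often each
' word occurs in a piece of text, ignoring case and commas.
Public Function CountWords(ByVal sourceText As String) As Object
    Dim counts As Object
    Dim token As Variant

    Set counts = CreateObject("Scripting.Dictionary")

    For Each token In Split(LCase$(Replace(sourceText, ",", "")), " ")
        ' Split returns empty strings for leading or repeated spaces; skip them
        If Len(token) > 0 Then
            ' Reading a missing key adds it as Empty, and Empty + 1 evaluates to 1
            counts(token) = counts(token) + 1
        End If
    Next token

    Set CountWords = counts
End Function

Public Sub DemoCountWords()
    Dim counts As Object
    Set counts = CountWords("Food and Human Nutrition Food, Nutrition and Health")
    ' Prints the counts to the Immediate window
    Debug.Print counts("food"), counts("nutrition"), counts("health")
End Sub
```

With this sample text, "food" and "nutrition" are each counted twice and "health" once.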
Answer 1 (score: 1)
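Here is a VBA macro that provides what you want.
Read the comments in the code carefully to understand the assumptions being made. Note also that a reference needs to be set, as stated in the first comment of the code.
Also note that punctuation can cause the same word to be counted in different categories. If that is likely to be a problem, we just need to split the source data differently: either remove all punctuation before splitting on spaces, or use regular expressions to do the splitting.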
'Set Reference to Microsoft Scripting Runtime
Option Explicit

Sub UniqueWordCounts()
    Dim wsSrc As Worksheet, wsRes As Worksheet
    Dim rSrc As Range, rRes As Range
    Dim vSrc As Variant, vRes As Variant
    Dim vWords As Variant
    Dim dWords As Dictionary
    Dim I As Long, J As Long
    Dim V As Variant, vKey As Variant

    'Assume source data is in column 1, starting at A1
    '  Could easily be anyplace
    Set wsSrc = Worksheets("sheet2")
    With wsSrc
        Set rSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp))
    End With

    'Results to go a few columns over
    Set wsRes = Worksheets("sheet2")
    Set rRes = rSrc(1, 1).Offset(0, 2)

    'Read source data into vba array (for processing speed)
    vSrc = rSrc

    'Collect individual words and counts into dictionary
    Set dWords = New Dictionary
    dWords.CompareMode = TextCompare

    For I = 1 To UBound(vSrc, 1)
        'Split the sentence into individual words
        For Each vKey In Split(vSrc(I, 1))
            If Not dWords.Exists(vKey) Then
                dWords.Add Key:=vKey, Item:=1
            Else
                dWords(vKey) = dWords(vKey) + 1
            End If
        Next vKey
    Next I

    'Size results array
    ReDim vRes(0 To dWords.Count, 1 To 2)

    'Column headers
    vRes(0, 1) = "Word"
    vRes(0, 2) = "Count"

    'Populate the columns
    I = 0
    For Each V In dWords.Keys
        I = I + 1
        vRes(I, 1) = V
        vRes(I, 2) = dWords(V)
    Next V

    'Size results range
    Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))

    'Populate, format and sort the Results range
    With rRes
        .EntireColumn.Clear
        .Value = vRes
        With .Rows(1)
            .Font.Bold = True
            .HorizontalAlignment = xlCenter
        End With
        .EntireColumn.AutoFit
        .Sort key1:=.Columns(2), order1:=xlDescending, _
              key2:=.Columns(1), order2:=xlAscending, _
              MatchCase:=False, Header:=xlYes
    End With
End Sub
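As an illustration of the regular-expression option mentioned in the note on punctuation, here is a minimal sketch. ExtractWords and DemoExtractWords are hypothetical names, not part of the macro, and the pattern rests on the assumption that a word is a run of letters or apostrophes; adjust it if your data contains digits or hyphenated terms. It uses late binding, so no extra reference is needed for the RegExp object.

```vba
Option Explicit

' Hypothetical helper: returns the words found in sourceText, punctuation dropped.
Public Function ExtractWords(ByVal sourceText As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim words As New Collection

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True            ' find every match, not just the first
    re.Pattern = "[A-Za-z']+"   ' assumption: a word is a run of letters or apostrophes

    For Each m In re.Execute(sourceText)
        words.Add m.Value
    Next m

    Set ExtractWords = words
End Function

Public Sub DemoExtractWords()
    Dim w As Variant
    ' Prints each word on its own line in the Immediate window
    For Each w In ExtractWords("Food, Nutrition and Health")
        Debug.Print w
    Next w
End Sub
```

For "Food, Nutrition and Health" this yields the four words Food, Nutrition, and, Health, with the comma no longer attached to Food. In the macro, the inner loop could then iterate over ExtractWords(vSrc(I, 1)) instead of Split(vSrc(I, 1)), provided the source cells contain text rather than error values.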
Answer 2 (score: 0)
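The simplest way is probably to use a concordance program (for example with Word), but it is also easy to convert the text to a single list in Word and then pivot that list in Excel. If the list is split on spaces only, "Food," and "Food" will show up as different words, so it is advisable to remove punctuation first (Find/Replace).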
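If you would rather build the one-word-per-row list inside Excel than in Word, a short macro can do it. This is a sketch that swaps VBA in for the Word step, not what the answer describes: ListWordsForPivot is a hypothetical name, and it assumes the source text is in column A of the active sheet, that column C is free to be overwritten, and that commas are the only punctuation to remove.

```vba
Option Explicit

' Hypothetical macro: writes every word from column A of the active sheet
' into column C, one word per row, ready to be used as PivotTable source data.
Public Sub ListWordsForPivot()
    Dim ws As Worksheet
    Dim cell As Range
    Dim token As Variant
    Dim cleaned As String
    Dim outRow As Long

    Set ws = ActiveSheet
    ws.Range("C1").Value = "Word"     ' a PivotTable needs a header row
    outRow = 2

    For Each cell In ws.Range("A1", ws.Cells(ws.Rows.Count, 1).End(xlUp))
        ' Assumption: cells hold plain text; commas are the only punctuation
        cleaned = Replace(CStr(cell.Value), ",", "")
        For Each token In Split(cleaned, " ")
            If Len(token) > 0 Then    ' skip blanks left by repeated spaces
                ws.Cells(outRow, 3).Value = token
                outRow = outRow + 1
            End If
        Next token
    Next cell
End Sub
```

A PivotTable built on column C, with Word in both the Rows and Values areas (Values summarised by count), then gives the frequency of each word.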