I want to return all of the words that occur most often in a given string. I expected the following code to do it:
t1 = "This is a really really really cool experiment cool really "
frequency = Hash.new(0)
words = t1.split
words.each { |word| frequency[word.downcase] += 1 }
frequency = frequency.map.max_by { |k, v| v }
puts "The words with the most frequencies is '#{frequency[0]}' with
a frequency of #{frequency[1]}."
The output is:
The words with the most frequencies is 'really' with
a frequency of 4.
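However, this does not work when two words are tied for the maximum. For example, if I add more occurrences of cool to the text so that cool also appears four times, it still returns the same output.
It would also be great if you could tell me whether these methods work on an array instead of a string.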
Answer 0 (score: 3)
Try this.
t1 = "This is a really really really cool cool cool"
Step 1: Split your string into words
words = t1.split
#=> ["This", "is", "a", "really", "really", "really", "cool", "cool", "cool"]
Step 2: Compute the frequency hash
frequency = Hash.new(0)
words.each { |word| frequency[word.downcase] += 1 }
frequency
#=> {"this"=>1, "is"=>1, "a"=>1, "really"=>3, "cool"=>3}
Step 3: Determine the maximum frequency
arr = frequency.max_by { |k, v| v }
#=> ["really", 3]
max_frequency = arr.last
#=> 3
Step 4: Create an array containing the words whose frequency is max_frequency
arr = frequency.select { |k, v| v == max_frequency }
#=> {"really"=>3, "cool"=>3}
arr.map { |k, v| k }
#=> ["really", "cool"]
A more conventional way of writing this in Ruby
words = t1.split
#=> ["This", "is", "a", "really", "really", "really", "cool", "cool", "cool"]
frequency = words.each_with_object(Hash.new(0)) do |word, f|
f[word.downcase] += 1
end
#=> {"this"=>1, "is"=>1, "a"=>1, "really"=>3, "cool"=>3}
max_frequency = frequency.max_by(&:last).last
#=> 3
frequency.select { |k, v| v == max_frequency }.map(&:first)
#=> ["really", "cool"]
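The question also asks whether this works on an array instead of a string. The counting steps only need an enumerable of words, so a minimal sketch might wrap them in a method that accepts either. The method name words_with_max_frequency and the empty-input return value are assumptions made for illustration, not part of the answer above.

```ruby
# Hypothetical helper (the name is an assumption, not from the answer):
# returns every word tied for the highest count, together with that count.
def words_with_max_frequency(input)
  # Accept a String (split on whitespace) or an Array of words.
  words = input.is_a?(String) ? input.split : input

  frequency = words.each_with_object(Hash.new(0)) do |word, f|
    f[word.downcase] += 1
  end
  # Assumed convention for empty input: no words, count of zero.
  return [[], 0] if frequency.empty?

  max_frequency = frequency.max_by(&:last).last
  [frequency.select { |_, v| v == max_frequency }.map(&:first), max_frequency]
end

words, count = words_with_max_frequency("This is a really really really cool cool cool")
puts "#{words.join(', ')} (#{count} times)"
# prints: really, cool (3 times)

words, count = words_with_max_frequency(%w[b a B c a])
puts "#{words.join(', ')} (#{count} times)"
# prints: b, a (2 times)
```

Passing an array simply skips the split step; everything after that is identical.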
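Notes
Called without a block, map returns an enumerator:
e = [1,2,3].map #=> #<Enumerator: [1, 2, 3]:map>
This tells us that frequency.map.max_by { |k,v| v } is the same as frequency.max_by { |k,v| v }.
In frequency = frequency.map.max_by { |k, v| v }, the frequency on the right is a hash; the frequency on the left is an array. Reusing a variable in this way is generally considered poor practice.
frequency.max_by { |k,v| v } is often written frequency.max_by { |_,v| v } or frequency.max_by { |_k,v| v }, mainly to signal to the reader that the first block variable is not used in the block calculation. (As shown above, this statement would generally be written frequency.max_by(&:last).) Note that _ is a valid local variable.
frequency.max_by { |k, v| v }.last could instead be written frequency.map { |k, v| v }.max, but this has the disadvantage that map produces an intermediate array of frequency.size elements, whereas the former produces an intermediate array of two elements.
Answer 1 (score: 0)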
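You have already found the greatest frequency (max_by returns a [word, count] pair, so .last extracts the count):
greatest_frequency = frequency.max_by { |_, v| v }.last
Let's use it to find all the words that have this frequency: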
most_frequent_words = frequency.select { |_, v| v == greatest_frequency }.keys
puts "The words with the most frequencies are #{most_frequent_words.join(', ')} with a frequency of #{greatest_frequency}."
Answer 2 (score: 0)
string = 'This is is a really a really a really cool cool experiment a cool cool really'
1) Split the string into an array of words
words = string.split.map(&:downcase)
2) Compute the maximum frequency over the unique words
max_frequency = words.uniq.map { |i| words.count(i) }.max
3) Build the word-to-frequency combinations
combos = words.group_by { |e| e }.map { |k, v| [k, v.size] }.to_h
4) Select the most frequent words
most_frequent_words = combos.select { |_, v| v == max_frequency }.keys
Result
puts "The words with the most frequencies are '#{most_frequent_words.join(', ')}' with a frequency of #{max_frequency}."
#=> The words with the most frequencies are 'a, really, cool' with a frequency of 4.
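Step 2 above calls words.count once per unique word, which rescans the whole array each time. As a hedged alternative (not part of this answer), on Ruby 2.7 or later Enumerable#tally builds the same word-to-count hash in a single pass, and the maximum can then be read from its values:

```ruby
string = 'This is is a really a really a really cool cool experiment a cool cool really'

# Enumerable#tally (Ruby 2.7+) returns a hash of element => occurrence count.
combos = string.split.map(&:downcase).tally

# The highest count, then every word that reaches it.
max_frequency = combos.values.max
most_frequent_words = combos.select { |_, v| v == max_frequency }.keys

puts "The words with the most frequencies are '#{most_frequent_words.join(', ')}' with a frequency of #{max_frequency}."
# prints: The words with the most frequencies are 'a, really, cool' with a frequency of 4.
```

On older Ruby versions, the group_by approach from step 3 produces the same hash.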