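Counting occurrences of a string in a string/array

Date: 2019-01-14 22:38:56

Tags: ruby string

I want to return all of the words that occur most often in a given string. I expected the following code to do that: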

t1 = "This is a really really really cool experiment cool really "

frequency = Hash.new(0)
words = t1.split
words.each { |word| frequency[word.downcase] += 1 }
frequency = frequency.map.max_by { |k, v| v }
puts "The words with the most frequencies is '#{frequency[0]}' with 
 a frequency of #{frequency[1]}."

The output is:

The words with the most frequencies is 'really' with 
a frequency of 4.

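However, this does not work when two strings tie for the maximum. For example, if I add more occurrences of cool to the text so that cool also appears four times, the code still returns the same output.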

It would also be great if you could tell me whether these methods work on an array instead of a string.

3 Answers:

Answer 0: (score: 3)

Try this.

t1 = "This is a really really really cool cool cool"

Step 1: Split your string into words

words = t1.split
  #=> ["This", "is", "a", "really", "really", "really", "cool", "cool", "cool"] 

Step 2: Compute the frequency hash

frequency = Hash.new(0) 
words.each { |word| frequency[word.downcase] += 1 } 
frequency
  #=> {"this"=>1, "is"=>1, "a"=>1, "really"=>3, "cool"=>3} 

Step 3: Determine the maximum frequency

arr = frequency.max_by { |k, v| v }
  #=> ["really", 3]
max_frequency = arr.last
  #=> 3

Step 4: Create an array containing the words whose frequency is max_frequency

arr = frequency.select { |k, v| v == max_frequency }
  #=> {"really"=>3, "cool"=>3} 
arr.map { |k, v| k }
  #=> ["really", "cool"] 

The conventional way of writing this in Ruby

words = t1.split
  #=> ["This", "is", "a", "really", "really", "really", "cool", "cool", "cool"] 
frequency = words.each_with_object(Hash.new(0)) do |word, f|
   f[word.downcase] += 1
end
  #=> {"this"=>1, "is"=>1, "a"=>1, "really"=>3, "cool"=>3} 
max_frequency = frequency.max_by(&:last).last
  #=> 3 
frequency.select { |k, v| v == max_frequency }.map(&:first)
  #=> ["really", "cool"]
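
On the question of arrays: only the split step is specific to strings, so the remaining steps should apply unchanged to an array of words. Below is a minimal sketch under that assumption. The method name most_frequent_words and its return shape are hypothetical, chosen here for illustration, and are not part of the answer above.

```ruby
# Hypothetical helper (not from the original answer): returns every word
# tied for the highest frequency, together with that frequency.
def most_frequent_words(input)
  # Accept either a String (split on whitespace) or an Array of strings.
  words = input.is_a?(String) ? input.split : input

  # Build the case-insensitive frequency hash, as in the answer above.
  frequency = words.each_with_object(Hash.new(0)) do |word, f|
    f[word.downcase] += 1
  end
  return [[], 0] if frequency.empty?

  max_frequency = frequency.max_by(&:last).last
  [frequency.select { |_, v| v == max_frequency }.map(&:first), max_frequency]
end

from_string = most_frequent_words("This is a really really really cool cool cool")
from_array  = most_frequent_words(%w[This is a really really really cool cool cool])
p from_string #=> [["really", "cool"], 3]
p from_array  #=> [["really", "cool"], 3]
```

Returning the words and the count together is just one possible design; returning only the array of words would work equally well.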

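Notes

  1. e = [1,2,3].map #=> #<Enumerator: [1, 2, 3]:map>. Without a block, map returns an enumerator over the same elements, which tells us that frequency.map.max_by { |k,v| v } is the same as frequency.max_by { |k,v| v }.
  2. In frequency = frequency.map.max_by {|k, v| v }, frequency on the right is a hash; frequency on the left is an array. It is generally considered bad practice to reuse a variable in this way.
  3. frequency.max_by { |k,v| v } is often written frequency.max_by { |_,v| v } or frequency.max_by { |_k,v| v }, mainly to signal to the reader that the first block variable is not used in the block calculation. (As shown above, this statement would generally be written frequency.max_by(&:last).) Note that _ is a valid local variable.
  4. frequency.max_by { |k, v| v }.last could instead be written frequency.map { |k, v| v }.max, but that has the disadvantage that map produces an intermediate array of frequency.size elements, whereas the former produces an intermediate array of two elements.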
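
The notes above can be checked with a short sketch. The sample hash is made up for illustration, and the variable names via_enumerator and direct are hypothetical.

```ruby
frequency = { "this" => 1, "really" => 3, "cool" => 3 }

# Note 1: map without a block returns an Enumerator over the same pairs,
# so max_by gives the same result with or without the extra .map.
via_enumerator = frequency.map.max_by { |_k, v| v }
direct         = frequency.max_by { |_k, v| v }
p frequency.map.class       # Enumerator
p via_enumerator == direct  # true

# Note 2: the result is a two-element [word, count] array, not a hash,
# and it holds only one of the tied entries.
p direct.class              # Array

# Note 4: both expressions yield the same maximum count.
max_via_max_by = frequency.max_by(&:last).last
max_via_map    = frequency.map { |_k, v| v }.max
p max_via_max_by == max_via_map  # true
```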

Answer 1: (score: 0)

You have already found the greatest frequency

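# frequency must still be the hash here; max_by returns a [word, count] pair, so take the count
greatest_frequency = frequency.max_by { |_, v| v }.last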

Let's use it to find all the words that have this frequency

most_frequent_words = frequency.select { |_, v| v == greatest_frequency }.keys
puts "The words with the most frequencies are #{most_frequent_words.join(', ')} with a frequency of #{greatest_frequency}."

Answer 2: (score: 0)

string = 'This is is a really a really a really cool cool experiment a cool cool really'

1). Split the string into an array of words

words = string.split.map(&:downcase)

2). Compute the maximum frequency over the unique words

max_frequency = words.uniq.map { |i| words.count(i) }.max

3). Build the word and frequency pairs

combos = words.group_by { |e| e }.map { |k, v| [k, v.size] }.to_h

4). Select the most frequent words

most_frequent_words = combos.select { |_, v| v == max_frequency }.keys

Result

puts "The words with the most frequencies are '#{most_frequent_words.join(', ')}' with a frequency of #{max_frequency}."
#=> The words with the most frequencies are 'a, really, cool' with a frequency of 4.
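
As a variation on the steps above, calling words.count(i) once per unique word rescans the whole array each time. The sketch below swaps in Enumerable#tally, which builds the counts in a single pass. It assumes Ruby 2.7 or later, where tally was introduced, and is an alternative technique rather than what the answer itself uses.

```ruby
string = 'This is is a really a really a really cool cool experiment a cool cool really'

# Enumerable#tally (Ruby 2.7+) builds the word => count hash in one pass.
counts = string.split.map(&:downcase).tally

# The maximum count, then every word that reaches it.
max_frequency = counts.values.max
most_frequent_words = counts.select { |_, v| v == max_frequency }.keys

p most_frequent_words #=> ["a", "really", "cool"]
p max_frequency       #=> 4
```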