Word count that returns an array (of [word, count] pairs) giving the frequency of each word

Time: 2014-09-30 00:32:06

Tags: ruby

str = 'put returns between paragraph put returns between paragraph put returns between paragraph'

def word_count(string)
  # Tally each whitespace-separated word; missing keys start at 0.
  string.split.inject(Hash.new(0)) { |h, v| h[v] += 1; h }
end

def parse_word(word)
  word.gsub!(/[^a-zA-Z0-9]/, " ")
  word.downcase!
  @yoo = word
end

result = word_count(str)
print result, "\n\n"
res2 = result.select { |pair| pair[1] > 1 }  # the error is raised on this line

This is the output I am getting:

{"put"=>3, "returns"=>3, "between"=>3, "paragraph"=>3} 

This is the output I need:

{"put"=>3, "returns"=>3, "between"=>3, "paragraph"=>3}  

put: 3
returns: 3
between: 3

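But the main problem is that he gave us code to do this, and I cannot understand it.

I don't know what this code does. Can anyone help me... and modify it so that it works?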

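The following processes the first paragraph of the "put returns" text... Note that ss is the array of those words that occur at least twice in this paragraph.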

nect = ss.select { |p| p[1] > 1 }
nect.sort.each do |key, count|
  puts "#{key}: #{count}"
end

1 Answer:

Answer 0 (score: 0)

module WordCount
    # Count the words in string s; returns an array of [word, count] pairs.
    def self.word_count(s)
        count_frequency(words_from_string(s))
    end

    def self.word_count_from_file(filename)
        s = File.open(filename) { |file| file.read }
        word_count(s)
    end

    # Lowercase the text and extract runs of word characters and apostrophes.
    def self.words_from_string(s)
        s.downcase.scan(/[\w']+/)
    end

    def self.count_frequency(words)
        counts = Hash.new(0)
        for word in words
            counts[word] += 1
        end
        # counts.to_a.sort {|a,b| b[1] <=> a[1]}
        # sort by decreasing count, then lexicographically
        counts.to_a.sort do |a,b|
            [b[1],a[0]] <=> [a[1],b[0]]
        end
    end
end

def word_count(s)
    WordCount.word_count(s)
end
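
To get the "word: count" lines the question asks for, the [word, count] pairs still have to be filtered and printed. Below is a minimal, self-contained sketch of that step. The helper names word_pairs and repeated_word_lines are hypothetical (they are not part of the code above), and the sketch assumes that "at least twice" means a count greater than 1. Note that Hash#select passes the key and the value to the block as two arguments, so a block written as { |pair| pair[1] > 1 } most likely ends up indexing into the key string instead of reading the count; naming both block parameters avoids that, and it works the same way on an array of pairs.

```ruby
# Hypothetical helper (not from the original post): counts words and
# returns [word, count] pairs in order of first appearance.
def word_pairs(text)
  counts = Hash.new(0)
  text.downcase.scan(/[\w']+/).each { |word| counts[word] += 1 }
  counts.to_a
end

# Hypothetical helper: keeps the pairs whose count is greater than `min`,
# sorts them alphabetically by word, and formats each as "word: count".
def repeated_word_lines(pairs, min = 1)
  pairs.select { |_word, count| count > min }
       .sort
       .map { |word, count| "#{word}: #{count}" }
end

str = 'put returns between paragraph put returns between paragraph put returns between paragraph'
puts repeated_word_lines(word_pairs(str))
```

Because the pairs are sorted before printing, the lines come out in alphabetical order (between, paragraph, put, returns) rather than in order of first appearance; drop the .sort call if the original order is preferred.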