Question

I'm currently working on a logic challenge as part of a project, and I've spent several hours trying to solve it. I have:

data = [
  ["this is a list of words", "2"],
  ["another list of words", "2"]
]

I'd like to return:

data = [
  ["this", "2"],
  ["is", "2"],
  ["a", "2"],
  ["list", "4"],
  ["of", "4"],
  ["another", "2"],
  ["words", "4"]
]

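Basically, the string of words at index [0] of each row is split into individual words and duplicates are removed, but whenever a word is duplicated, the values at index [1] are added together.

I've tried many things: splitting, using permutation, and countless iterations, but everything seems to hit a dead end. I'm sure there's a simple solution.

Here is my most recent attempt: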

  #Loop through each data item
  data.each do |obj| 
    # create each obj to an array and save to var
    newObj = obj.permutation(1).to_a 
    # loop through array of words and split storing the count
    split_query = newObj[0].each do |e| 
     query_count = e.split(' ').count
     print e.split(' ')
    end
  end

Answer 1

You can use a hash:

hash = Hash.new {0}
data.each do |v|
  x = v[1].to_i
  v[0].split.each do |word|
    hash[word] += x
  end
end
result = hash.map {|k,v| [k, v.to_s]}

This yields:

result
=> [["this", "2"],
    ["is", "2"],
    ["a", "2"],
    ["list", "4"],
    ["of", "4"],
    ["words", "4"],
    ["another", "2"]]
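
As a side note, the block form `Hash.new {0}` used above and the argument form `Hash.new(0)` should behave the same for an integer default like this one. The sketch below is only an illustration of that point; the helper name `sum_words` is hypothetical and not part of the answer.

```ruby
data = [
  ["this is a list of words", "2"],
  ["another list of words", "2"]
]

# Hypothetical helper: adds each row's count to the running total of
# every word in that row, using whichever hash it is given.
def sum_words(data, totals)
  data.each do |phrase, count|
    phrase.split.each { |word| totals[word] += count.to_i }
  end
  totals
end

with_block    = sum_words(data, Hash.new { 0 })  # default computed by a block
with_argument = sum_words(data, Hash.new(0))     # default passed as an argument

p with_block == with_argument
p with_block.map { |word, total| [word, total.to_s] }
```

Either form works here because the default is an immutable integer; which one to use is mostly a matter of taste.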

Answer 2

You can do it like this.

Code

def tally(data)
  data.flat_map { |str,val| str.split.product([val.to_i]) }.
       group_by(&:first).
       map { |_,arr| [arr.first.first, arr.reduce(0) { |t,(_,val)| t+val }.to_s] }
end

Example

data = [
  ["this is a list of words", "2"],
  ["another list of words", "2"],
  ["yet one more list", "3"],
  ["and a final one", "4"]
]

tally data
  #=> [["this", "2"], ["is", "2"], ["a", "6"], ["list", "7"],
  #    ["of", "4"], ["words", "4"], ["another", "2"], ["yet", "3"],
  #    ["one", "7"], ["more", "3"], ["and", "4"], ["final", "4"]]

It would probably be more useful to return pairs whose counts are integers rather than strings.
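
A minimal sketch of what such an integer-returning variant might look like, assuming Ruby 2.4 or later for `Array#sum`. The method name `tally_ints` is made up for this illustration and is not part of the answer.

```ruby
# Hypothetical variant of the method above that leaves the totals as integers.
def tally_ints(data)
  data.flat_map { |str, val| str.split.product([val.to_i]) }  # [word, count] pairs
      .group_by(&:first)                                      # word => its pairs
      .map { |word, pairs| [word, pairs.sum { |_, n| n }] }   # word with summed count
end

data = [
  ["this is a list of words", "2"],
  ["another list of words", "2"]
]

p tally_ints(data)
```

Leaving the totals as integers lets the caller sort or compare them directly and convert with `to_s` only when displaying them.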

Explanation

For the example above, the steps are as follows:

a = data.flat_map { |str,val| str.split.product([val.to_i]) }
  #=> [["this", 2], ["is", 2], ["a", 2], ["list", 2], ["of", 2],
  #    ["words", 2], ["another", 2], ["list", 2], ["of", 2],
  #    ["words", 2], ["yet", 3], ["one", 3], ["more", 3], ["list", 3],
  #    ["and", 4], ["a", 4], ["final", 4], ["one", 4]]
b = a.group_by(&:first)
  #=> {"this"=>[["this", 2]],
  #    "is"=>[["is", 2]],
  #    "a"=>[["a", 2], ["a", 4]],
  #    "list"=>[["list", 2], ["list", 2], ["list", 3]],
  #    "of"=>[["of", 2], ["of", 2]],
  #    "words"=>[["words", 2], ["words", 2]],
  #    "another"=>[["another", 2]],
  #    "yet"=>[["yet", 3]],
  #    "one"=>[["one", 3], ["one", 4]],
  #    "more"=>[["more", 3]],
  #    "and"=>[["and", 4]],
  #    "final"=>[["final", 4]]}
b.map { |_,arr| [arr.first.first, arr.reduce(0) { |t,(_,val)| t+val }.to_s] }
  #=> (the result for the example shown above)

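Hash alternative

It is more natural to use a hash here, with integers as values. To do that, we use Hash::new to define a hash with a default value of zero: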

def tally(data)
  data.each_with_object(Hash.new(0)) do |(str,val),h|
    str.split.each { |word| h[word] += val.to_i }
  end
end

h = tally(data)
  #=> {"this"=>2, "is"=>2, "a"=>6, "list"=>7, "of"=>4, "words"=>4,
  #    "another"=>2, "yet"=>3, "one"=>7, "more"=>3, "and"=>4, "final"=>4}

If you want the keys ordered by decreasing value:

sorted_keys = h.keys.sort_by { |k| -h[k] }
  #=> ["one", "list", "a", "of", "and", "words", "final", "yet",
  #    "more", "another", "is", "this"]
sorted_keys.zip(h.values_at(*sorted_keys)).to_h
  #=> {"one"=>7, "list"=>7, "a"=>6, "of"=>4, "and"=>4, "words"=>4,
  #    "final"=>4, "yet"=>3, "more"=>3, "another"=>2, "is"=>2, "this"=>2}
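
Since `sort_by` is not guaranteed to be stable, the relative order of keys with equal values may vary. A possible alternative, shown only as a sketch, sorts the key-value pairs directly and breaks ties alphabetically by key. The tie-break rule is an assumption added here and is not in the answer, and the hash `h` is written out literally so the snippet runs on its own.

```ruby
# Totals from the example above, written out literally.
h = { "this" => 2, "is" => 2, "a" => 6, "list" => 7, "of" => 4, "words" => 4,
      "another" => 2, "yet" => 3, "one" => 7, "more" => 3, "and" => 4, "final" => 4 }

# Sort by descending value; for equal values, fall back to the key itself.
sorted = h.sort_by { |word, total| [-total, word] }.to_h

p sorted
```

Because the sort key is the pair `[-total, word]`, the resulting order is fully determined by the data.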

Hash.new(0) is often called a "counting hash". If:

h = Hash.new(0)

then:

h[:a] += 1

is equivalent to:

h[:a] = h[:a] + 1

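If h does not have the key :a (as is the case when h is empty), h[:a] on the right-hand side of the equality evaluates to the hash's default value, which is given by the argument to new, here zero. Therefore: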

h[:a] = h[:a] + 1
#     = 0 + 1
#     = 1
h #=> { :a => 1 }

The next time we encounter the key :a:

h[:a] += 1
  #=> h[:a] = h[:a] + 1
  #=>       = 1 + 1
  #=>       = 2
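
The behavior described above can be checked with a short sketch. The variable names `missing` and `stored` are only illustrative.

```ruby
h = Hash.new(0)

missing = h[:a]        # reading a missing key returns the default value...
stored  = h.key?(:a)   # ...but does not add the key to the hash

h[:a] += 1             # expands to h[:a] = h[:a] + 1, which does store the key
h[:a] += 1             # now uses the stored value rather than the default

p missing
p stored
p h
```

The key only appears in the hash once an assignment has been made; merely looking it up leaves the hash empty.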

Summing a table of word occurrences
