Ruby: combining duplicates in a string

Time: 2018-03-01 16:40:44

Tags: ruby string duplicates concatenation

Suppose I have a string like this:

str =<<END
7312357006,1.121
3214058234,3456
7312357006,1234
1324958723,232.1
3214058234,43.2
3214173443,234.1
6134513494,23.2
7312357006,11.1
END

If a number in the first column appears more than once, I want to add its second values together, so the final string looks like this:

7312357006,1246.221
3214058234,3499.2
1324958723,232.1
3214173443,234.1
6134513494,23.2

An array as the final output would be fine too.

4 answers:

Answer 0 (score: 4)

There are many ways to do this in Ruby. One particularly concise way is to use String#scan:

str = <<END
7312357006,1.121
3214058234,3456
7312357006,1234
1324958723,232.1
3214058234,43.2
3214173443,234.1
6134513494,23.2
7312357006,11.1
END

data = Hash.new(0)
str.scan(/(\d+),([\d.]+)/) {|k,v| data[k] += v.to_f }
p data
# => { "7312357006" => 1246.221,
#      "3214058234" => 3499.2,
#      "1324958723" => 232.1,
#      "3214173443" => 234.1,
#      "6134513494" => 23.2 }

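This uses the regular expression /(\d+),([\d.]+)/ to extract the two values from each line. They are passed to the block as arguments, and the block adds them into the hash.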
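To make the two modes of scan concrete, here is a minimal sketch on a shortened input (the variable names sample, pairs and totals are only for illustration and are not part of the answer above). It shows what scan returns without a block, and how the same captures arrive as block arguments.

```ruby
# Shortened sample: the first three lines of the question's input.
sample = "7312357006,1.121\n3214058234,3456\n7312357006,1234\n"

# Without a block, scan returns one [key, value] pair of strings per match.
pairs = sample.scan(/(\d+),([\d.]+)/)
p pairs
# => [["7312357006", "1.121"], ["3214058234", "3456"], ["7312357006", "1234"]]

# With a block, each pair of captures is yielded as two block arguments.
totals = Hash.new(0)
sample.scan(/(\d+),([\d.]+)/) { |k, v| totals[k] += v.to_f }
p totals.keys
# => ["7312357006", "3214058234"]
```

Note that the keys stay strings, so long numeric identifiers are never converted to floats.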

This can also be written as a single expression using each_with_object:

data = str.scan(/(\d+),([\d.]+)/)
         .each_with_object(Hash.new(0)) {|(k,v), hsh| hsh[k] += v.to_f }
# => (same as above)

There are many ways to print the result, but here are a couple that I like:

puts data.map {|kv| kv.join(",") }.join("\n")
# => 7312357006,1246.221
#    3214058234,3499.2
#    1324958723,232.1
#    3214173443,234.1
#    6134513494,23.2

# or:
puts data.map {|k,v| "#{k},#{v}\n" }.join
# => (same as above)

You can see all of these in action on repl.it.

Edit: For the sake of readability I wouldn't recommend either of these, but here are a couple more just for kicks (requires Ruby 2.4+):

data = str.lines.group_by {|s| s.slice!(/(\d+),/); $1 }
         .transform_values {|a| a.sum(&:to_f) }

...or, going straight to a string:

puts str.lines.group_by {|s| s.slice!(/(\d+),/); $1 }
       .map {|k,vs| "#{k},#{vs.sum(&:to_f)}\n" }.join

Since repl.it is stuck on Ruby 2.3: Try it online!
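The two "for kicks" versions rely on slice! mutating each line in place and on the $1 global. As a hedged alternative, the same grouping idea can be sketched without mutation by splitting each line first (still Ruby 2.4+ for transform_values and Array#sum; the shortened input is only for illustration):

```ruby
# Shortened sample: the first three lines of the question's input.
str = "7312357006,1.121\n3214058234,3456\n7312357006,1234\n"

data = str.lines
          .map { |line| line.chomp.split(",") }   # [["7312357006", "1.121"], ...]
          .group_by(&:first)                      # key => list of [key, value] pairs
          .transform_values { |pairs| pairs.sum { |_key, value| value.to_f } }

p data.keys
# => ["7312357006", "3214058234"]
```

This trades a little brevity for not depending on side effects inside the group_by block.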

Answer 1 (score: 1)

You can achieve this using each_with_object, as follows:

str = "7312357006,1.121
       3214058234,3456
       7312357006,1234
       1324958723,232.1
       3214058234,43.2
       3214173443,234.1
       6134513494,23.2
       7312357006,11.1"

# convert the string into nested pairs of floats
# to briefly summarise the steps: split entries by newline, strip whitespace, split by comma, convert to floats
arr = str.split("\n").map(&:strip).map { |el| el.split(",").map(&:to_f) }

result = arr.each_with_object(Hash.new(0)) do |el, hash| 
  hash[el.first] += el.last
end

# => {7312357006.0=>1246.221, 3214058234.0=>3499.2, 1324958723.0=>232.1, 3214173443.0=>234.1, 6134513494.0=>23.2}

# You can then call `to_a` on result if you want:
result.to_a

# => [[7312357006.0, 1246.221], [3214058234.0, 3499.2], [1324958723.0, 232.1], [3214173443.0, 234.1], [6134513494.0, 23.2]]

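each_with_object iterates over each pair of data, giving the block access to an accumulator (here, a hash). With this approach we can add each entry to the hash and, if a key appears more than once, sum its values into a running total.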
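One side effect of converting both columns with to_f is that the keys become floats (7312357006.0 and so on). If the identifiers should stay as written, a small variation, sketched here on a shortened input with illustrative variable names, converts only the second column:

```ruby
# Shortened sample with the same indentation style as the answer's string.
str = "7312357006,1.121
       3214058234,3456
       7312357006,1234"

# Split into [key, value] string pairs; only the value is converted to a float.
pairs = str.split("\n").map(&:strip).reject(&:empty?).map { |el| el.split(",") }

result = pairs.each_with_object(Hash.new(0)) do |(key, value), hash|
  hash[key] += value.to_f
end

p result.keys
# => ["7312357006", "3214058234"]
```

Keeping the keys as strings also avoids any precision concerns with very long identifiers.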

Hope that helps - let me know if you have any questions.

Answer 2 (score: 0)

def combine(str)
  str.each_line.with_object(Hash.new(0)) do |s,h|
    k,v = s.split(',')        
    h.update(k=>v.to_f) { |k,o,n| o+n }
  end.reduce('') { |s,kv_pair| s << "%s,%g\n" % kv_pair }
end 

puts combine str
7312357006,1246.22
3214058234,3499.2
1324958723,232.1
3214173443,234.1
6134513494,23.2
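The first total above prints as 1246.22 rather than the 1246.221 the question asks for. That is because %g formats with six significant digits by default. A minimal sketch of the difference, and of two possible ways around it (a wider precision, or plain Float#to_s):

```ruby
total = 1246.221

short  = "%g" % total      # default precision: 6 significant digits
wider  = "%.10g" % total   # up to 10 significant digits, trailing zeros dropped
plain  = total.to_s        # shortest representation that round-trips

puts short   # prints 1246.22
puts wider   # prints 1246.221
puts plain   # prints 1246.221
```

Which of these is appropriate depends on how many digits the real data carries; the precision of 10 here is an arbitrary choice for illustration.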

Notes:

  • using String#each_line is preferable to str.split("\n") as the former returns an enumerator whereas the latter returns a temporary array. Each element generated by the enumerator is a line of str that (unlike the elements of str.split("\n")) ends with a newline character, but that is of no concern here because String#to_f ignores it.
  • see Hash::new, specifically when a default value (here 0) is used. If a hash has been defined as h = Hash.new(0) and h does not have a key k, h[k] returns the default value, zero (h is not changed). When Ruby encounters the expression h[k] += 1, the first thing she does is expand it to h[k] = h[k] + 1. If h has been defined with a default value of zero, and h does not have a key k, h[k] on the right of the assignment (syntactic sugar¹ for h.[](k)) returns zero, so h[k] is set to 1.
  • see Hash#update (aka merge!). h.update(k=>v.to_f) is syntactic sugar for h.update({ k=>v.to_f })
  • see Kernel#sprintf for explanations of the formatting directives %s and %g.
  • the receiver of the expression reduce('') { |s,kv_pair| s << "%s,%g\n" % kv_pair } (in the penultimate line of combine) is the following hash:

   {"7312357006"=>1246.221, "3214058234"=>3499.2, "1324958723"=>232.1,
    "3214173443"=>234.1, "6134513494"=>23.2}

¹ Syntactic sugar is a shortcut allowed by Ruby.
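The notes on Hash::new and Hash#update can be checked with a short sketch. The hash h and its keys are made up for illustration; the block parameter names are arbitrary.

```ruby
h = Hash.new(0)

default_read = h["missing"]        # returns the default, 0
stored       = h.key?("missing")   # false: reading the default does not store the key
p default_read   # => 0
p stored         # => false

h["a"] += 1      # expands to h["a"] = h["a"] + 1, i.e. 0 + 1
p h["a"]         # => 1

# update (alias merge!) calls the block only when the key already exists.
h.update("a" => 5) { |_key, old, incoming| old + incoming }   # "a" exists: 1 + 5
h.update("b" => 2) { |_key, old, incoming| old + incoming }   # "b" is new: block not called
p h["a"]         # => 6
p h["b"]         # => 2
```

Because the block is skipped for new keys, update with a summing block behaves the same here whether or not the hash has a default value.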

Answer 3 (score: -1)

I went with this solution because hashes were giving me some trouble:

# str is the input string from the question.
# d is a flat array of alternating keys and totals: [key, total, key, total, ...]
d = []
str.each_line do |line|
  key, value = line.strip.split(",")
  next if key.nil? || value.nil?   # skip blank or malformed lines

  # Search even positions only, so a total can never be mistaken for a key.
  q = (0...d.size).step(2).find { |i| d[i] == key }
  if q
    d[q + 1] += value.to_f         # duplicate key: add to its running total
  else
    d << key << value.to_f         # first occurrence: append key and value
  end
end

# Build the output string, one "key,total" line per pair, rounded to 2 decimals.
s = ""
d.each_slice(2) { |key, total| s << "#{key},#{total.round(2)}\n" }

puts(s)