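I am trying to search two multidimensional arrays for any elements in common within a given subarray, and then put the results in a third array in which the entire subarrays with similar elements are grouped together (not just the similar elements).
The data is imported from two CSVs: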
require 'csv'
array = CSV.read('primary_csv.csv')
#=> [["account_num", "account_name", "primary_phone", "second_phone", "status"],
#=> ["11111", "John Smith", "8675309", " ", "active"],
#=> ["11112", "Tina F.", "5551234", "5555678" , "disconnected"],
#=> ["11113", "Troy P.", "9874321", " ", "active"]]
# and so on...
second_array = CSV.read('customer_service.csv')
#=> [["date", "name", "agent", "call_length", "phone", "second_phone", "complaint"],
#=> ["3/1/15", "Mary ?", "Bob X", "5:00", "5551234", " ", "rude"],
#=> ["3/2/15", "Mrs. Smith", "Stew", "1:45", "9995678", "8675309" , "says shes not a customer"]]
# and so on...
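If any number exists as an element in a subarray of both primary.csv and customer_service.csv, I want the entire subarrays (not just the common elements) put into a third array, results_array. The desired output based on the sample above is:
results_array = [["11111", "John Smith", "8675309", " ", "active"],
["3/2/15", "Mrs. Smith", "Stew", "1:45", "9995678", "8675309" , "says shes not a customer"]] # and so on...
I then want to export the array to a new CSV in which each subarray is its own CSV row. I intend to iterate through each subarray, joining it with "," so that it is comma delimited, and then put the results into a new CSV: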
results_array.each do {|j| j.join(",")}
File.open("results.csv", "w") {|f| f.puts results_array}
#=> 11111,John Smith,8675309, ,active
#=> 3/2/15,Mrs. Smith,Stew,1:45,9995678,8675309,says shes not a customer
# and so on...
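How can I achieve the desired output? I know the final product will look messy, because similar data (phone numbers, for example) will end up in different columns. But I need to find a way to group the data together.
Answer 0 (score: 0)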
Suppose a1 and a2 are the two arrays (excluding the header rows).
Code
def combine(a1, a2)
  h2 = a2.each_with_index
         .with_object(Hash.new { |h,k| h[k] = [] }) { |(arr,i),h|
           arr.each { |e| es = e.strip; h[es] << i if number?(es) } }
  a1.each_with_object([]) do |arr, b|
    d = arr.each_with_object([]) do |str, d|
      s = str.strip
      d.concat(a2.values_at(*h2[s])) if number?(s) && h2.key?(s)
    end
    b << d.uniq.unshift(arr) if d.any?
  end
end

def number?(str)
  str =~ /^\d+$/
end
Example
Here is your example, slightly modified:
a1 = [
["11111", "John Smith", "8675309", "", "active" ],
["11112", "Tina F.", "5551234", "5555678", "disconnected"],
["11113", "Troy P.", "9874321", "", "active" ]
]
a2 = [
["3/1/15", "Mary ?", "Bob X", "5:00", "5551234", "", "rude"],
["3/2/15", "Mrs. Smith", "Stew", "1:45", "9995678", "8675309", "surly"],
["3/7/15", "Cher", "Sonny", "7:45", "9874321", "8675309", "Hey Jude"]
]
combine(a1, a2)
#=> [[["11111", "John Smith", "8675309", "",
# "active"],
# ["3/2/15", "Mrs. Smith", "Stew", "1:45",
# "9995678", "8675309", "surly"],
# ["3/7/15", "Cher", "Sonny", "7:45",
# "9874321", "8675309", "Hey Jude"]
# ],
# [["11112", "Tina F.", "5551234", "5555678",
# "disconnected"],
# ["3/1/15", "Mary ?", "Bob X", "5:00",
# "5551234", "", "rude"]
# ],
# [["11113", "Troy P.", "9874321", "",
# "active"],
# ["3/7/15", "Cher", "Sonny", "7:45",
# "9874321", "8675309", "Hey Jude"]
# ]
# ]
Explanation
First we define a helper:
def number?(str)
  str =~ /^\d+$/
end
For example:
number?("8675309") #=> 0 (truthy)
number?("3/1/15") #=> nil
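One caveat about the helper: in Ruby, ^ and $ anchor at line boundaries rather than at the ends of the string, so a field containing an embedded newline could pass the test by accident. A stricter variant, sketched here under the hypothetical name strict_number? (it is not part of the code above), uses \A and \z instead:

```ruby
# Hypothetical stricter variant of number? (the name is an assumption for illustration).
# \A and \z anchor to the start and end of the whole string, not of each line.
def strict_number?(str)
  str.match?(/\A\d+\z/)
end

strict_number?("8675309")        # true
strict_number?("3/1/15")         # false
strict_number?("note\n8675309")  # false, whereas /^\d+$/ would match the second line
```

For clean CSV fields like the ones in the example, both versions should behave the same.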
Now index a2 by the values that represent numbers:
h2 = a2.each_with_index
.with_object(Hash.new { |h,k| h[k] = [] }) { |(arr,i),h|
arr.each { |e| es = e.strip; h[es] << i if number?(es) } }
#=> {"5551234"=>[0], "9995678"=>[1], "8675309"=>[1, 2], "9874321"=>[2]}
This says, for example, that the "numeric" field "8675309" is contained in the elements of a2 at offsets 1 and 2 (that is, the rows for Mrs. Smith and Cher).
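If the chained each_with_index.with_object call is hard to read, the same index can probably be built with a plain nested loop. The sketch below is an equivalent rewrite rather than the code used above, and the method name index_numeric_fields is made up for illustration:

```ruby
# Hypothetical helper: maps each all-digit field to the offsets of the rows containing it.
def index_numeric_fields(rows)
  index = Hash.new { |h, k| h[k] = [] }   # missing keys start out as an empty array
  rows.each_with_index do |row, i|
    row.each do |field|
      key = field.strip
      index[key] << i if key =~ /^\d+$/   # same test as number? above
    end
  end
  index
end

a2 = [
  ["3/1/15", "Mary ?", "Bob X", "5:00", "5551234", "", "rude"],
  ["3/2/15", "Mrs. Smith", "Stew", "1:45", "9995678", "8675309", "surly"],
  ["3/7/15", "Cher", "Sonny", "7:45", "9874321", "8675309", "Hey Jude"]
]

index_numeric_fields(a2)
#=> {"5551234"=>[0], "9995678"=>[1], "8675309"=>[1, 2], "9874321"=>[2]}
```

Building the index once means each field of a1 can be looked up with a single hash access instead of rescanning a2.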
We can now simply step through the elements of a1 looking for matches.
The code:
arr.each_with_object([]) do |str, d|
s = str.strip
d.concat(a2.values_at(*h2[s])) if number?(s) && h2.key?(s)
end
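This steps through the elements of arr, assigning each one to the block variable str. For example, if arr holds the first element of a1, str will in turn equal "11111", "John Smith", and so on. After s = str.strip, the next line says that if s represents a number and there is a matching key in h2, the (initially empty) array d is concatenated with the elements of a2 at the offsets given by h2[s].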
After completing this loop, we check whether d contains any elements of a2:
b << d.uniq.unshift(arr) if d.any?
If it does, we remove duplicates, prepend arr to the array, and save the result to b.
Note that this allows one element of a2 to match multiple elements of a1.
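The question also asks about writing the result to a CSV. Joining fields with "," by hand breaks as soon as a field itself contains a comma, so it is probably safer to let the csv library handle the quoting. A minimal sketch, assuming the nested structure returned by combine above; groups_to_csv is a hypothetical name:

```ruby
require 'csv'

# Hypothetical helper: flattens an array of groups (each group an array of rows)
# into CSV text, one subarray per row, keeping the rows of each group together.
def groups_to_csv(groups)
  CSV.generate do |csv|
    groups.each do |group|
      group.each { |row| csv << row }
    end
  end
end

sample = [[["11111", "Smith, John", "8675309"],
           ["3/2/15", "Mrs. Smith", "8675309"]]]
puts groups_to_csv(sample)
# 11111,"Smith, John",8675309
# 3/2/15,Mrs. Smith,8675309
```

Something like File.write("results.csv", groups_to_csv(combine(a1, a2))) should then produce the file described in the question.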