I have an array of arrays, each containing a key and a timestamp.
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 22:00:51 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 22:00:32 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:58:33 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:58:01 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:58:51 CEST +02:00],
["3wyadsrrdxtgieyxx_lgka", Sat, 13 May 2017 01:09:01 CEST +02:00],
["y-5he42vlloggjb_whm8jw", Sat, 22 Apr 2017 22:48:31 CEST +02:00],
["oaxej30u9we17onlug4orw", Sun, 23 Apr 2017 01:46:48 CEST +02:00],
["oaxej30u9we17onlug4orw", Sun, 23 Apr 2017 02:06:56 CEST +02:00],
["rqjwg1ka43mvri0dmrdxvg", Sun, 23 Apr 2017 17:23:34 CEST +02:00],
["ok8nq6tg-kor9jglsuhoyw", Tue, 25 Apr 2017 13:02:16 CEST +02:00],
["riwfm0m-0rmbb6e9kyug2g", Sat, 06 May 2017 06:12:27 CEST +02:00],
["riwfm0m-0rmbb6e9kyug2g", Sat, 06 May 2017 06:17:01 CEST +02:00],
["riwfm0m-0rmbb6e9kyug2g", Sat, 06 May 2017 06:18:04 CEST +02:00],
["gbqfn3_d_tritqoey5khjw", Sat, 06 May 2017 14:14:55 CEST +02:00],
["j___x1oap-veh0u1fo_oua", Sun, 07 May 2017 14:22:37 CEST +02:00],
...
I got this list from ActiveRecord:
MyModel.all.pluck(:token, :created_at)
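The model contains some unique tokens and some duplicates. The duplicates are the interesting ones.
I want to group the timestamps by key and find the first and last timestamp for each key. So I grouped the array like this: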
grp = arr.group_by { |key, ts| key}
Now I get a list like this:
"vwfv8n5obwqmaw8r9fj-yq"=>[
["vwfv8n5obwqmaw8r9fj-yq", Thu, 11 May 2017 10:24:42 CEST +02:00]
],
"kacec6ybetpjdzlfgnnxya"=> [
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 22:00:31 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 22:01:43 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:58:17 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:59:05 CEST +02:00],
["kacec6ybetpjdzlfgnnxya", Fri, 12 May 2017 21:59:59 CEST +02:00]
],
...
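Is it possible to sort the dates so that I can easily get the first and last one? Am I overcomplicating this? I think there should be a simpler way to work with the raw data.
Answer 0 (score: 1)
Get a hash with the tokens as keys and the timestamps as values: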
# this gives the same MIN and MAX if there is only one created_at in the group
# (newer Rails versions may require wrapping the raw SQL string in Arel.sql)
rows = MyModel.group(:token)
.pluck("token, MIN(created_at), MAX(created_at)")
# loop through rows and create a hash
rows.each_with_object({}) do |(token, *t), hash|
hash[token] = t.uniq # removes dupes
end
{
"rqjwg1ka43mvri0dmrdxvg"=>[2017-04-23 15:23:34 UTC],
"riwfm0m-0rmbb6e9kyug2g"=>[2017-05-06 04:12:27 UTC, 2017-05-06 04:18:04 UTC]
# ...
}
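To see what the `each_with_object` step does without a database, here is a minimal sketch in plain Ruby. The `rows` array is made-up sample data standing in for what the grouped `pluck` might return (token, minimum, maximum), and the name `first_and_last` is only illustrative.

```ruby
# Hypothetical rows, shaped like the result of the grouped pluck:
# [token, MIN(created_at), MAX(created_at)]
rows = [
  ["rqjwg1ka43mvri0dmrdxvg", Time.utc(2017, 4, 23, 15, 23, 34), Time.utc(2017, 4, 23, 15, 23, 34)],
  ["riwfm0m-0rmbb6e9kyug2g", Time.utc(2017, 5, 6, 4, 12, 27), Time.utc(2017, 5, 6, 4, 18, 4)]
]

# (token, *t) splits each row into the token and the remaining two timestamps.
first_and_last = rows.each_with_object({}) do |(token, *t), hash|
  hash[token] = t.uniq # identical MIN and MAX collapse to a one-element array
end

first_and_last.each { |token, times| puts "#{token}: #{times.inspect}" }
```

A token that occurs only once ends up with a single timestamp, so the length of each value tells you whether the token had duplicates.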
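If you are only looking for records that have duplicates, you can use a WHERE clause with a subquery that counts the records per token (`things` here stands for the model's table name):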
MyModel.where("(SELECT COUNT(*) FROM things t WHERE t.token = things.token) > 1")
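If the pairs are already loaded into memory, the same filtering can be approximated in plain Ruby. This is only a sketch under that assumption; `arr` holds made-up sample data and `duplicated` is an illustrative name.

```ruby
# Hypothetical sample of [token, created_at] pairs.
arr = [
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 20, 0, 51)],
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 19, 58, 1)],
  ["3wyadsrrdxtgieyxx_lgka", Time.utc(2017, 5, 12, 23, 9, 1)]
]

# Group by token, then keep only the tokens that occur more than once.
duplicated = arr.group_by(&:first).select { |_token, pairs| pairs.size > 1 }

puts duplicated.keys.inspect
```

For large tables the SQL version is likely preferable, since it avoids loading every row into Ruby.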
Answer 1 (score: 0)
Try something like this:
MyModel.order(:created_at)
       .pluck(:token, :created_at)
       .group_by { |key, _ts| key }
       .transform_values { |pairs| [pairs.first.last, pairs.last.last] } # [first, last] timestamp per token
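The idea of ordering first and then taking the first and last entry of each group can be tried on an in-memory array. A minimal sketch, assuming Ruby 2.4+ for `transform_values`; the sample data and the names `sorted` and `first_last` are made up for illustration.

```ruby
# Hypothetical sample of [token, created_at] pairs, deliberately unsorted.
arr = [
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 20, 0, 51)],
  ["oaxej30u9we17onlug4orw", Time.utc(2017, 4, 23, 0, 6, 56)],
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 19, 58, 1)],
  ["oaxej30u9we17onlug4orw", Time.utc(2017, 4, 22, 23, 46, 48)],
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 19, 58, 51)]
]

# Sort by timestamp; group_by keeps that order inside each group.
sorted = arr.sort_by { |_token, ts| ts }

# Each group is a list of [token, ts] pairs, so .last picks the timestamp.
first_last = sorted.group_by { |token, _ts| token }
                   .transform_values { |pairs| [pairs.first.last, pairs.last.last] }

first_last.each { |token, (first, last)| puts "#{token}: #{first} .. #{last}" }
```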
Answer 2 (score: 0)
You can do this:
# you already have this bit
grp = arr.group_by { |key, ts| key}
# get the minmax timestamps for each group (map(&:last) drops the repeated key)
grp.map { |k, values_array| { k => values_array.map(&:last).minmax } }.reduce Hash.new, :merge
This should produce something that looks like:
{
"vwfv8n5obwqmaw8r9fj-yq"=>[Thu, 11 May 2017 10:24:42 CEST +02:00, Thu, 11 May 2017 10:24:42 CEST +02:00],
"kacec6ybetpjdzlfgnnxya"=>[Fri, 12 May 2017 21:58:17 CEST +02:00, Fri, 12 May 2017 22:01:43 CEST +02:00],
...
}
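A self-contained sketch of the `minmax` idea, taking `minmax` over just the timestamps of each group so no prior sorting is needed. It assumes Ruby 2.4+ for `transform_values`; the sample data and the name `ranges` are made up for illustration.

```ruby
# Hypothetical sample of [token, created_at] pairs.
arr = [
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 20, 0, 31)],
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 19, 58, 17)],
  ["kacec6ybetpjdzlfgnnxya", Time.utc(2017, 5, 12, 20, 1, 43)],
  ["vwfv8n5obwqmaw8r9fj-yq", Time.utc(2017, 5, 11, 8, 24, 42)]
]

# Group by token, strip the token from each pair, then take [min, max].
ranges = arr.group_by(&:first)
            .transform_values { |pairs| pairs.map(&:last).minmax }

ranges.each { |token, (min, max)| puts "#{token}: #{min} .. #{max}" }
```

A token with a single timestamp gets the same value as both minimum and maximum.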