Question

I'm having trouble coming up with a good way to compile some statistics for a CSV.

I have a stats table with a session_id column, a created_at column, and a few other belongs_to associations.

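What I'm after is a nicely formatted result that counts the unique session_ids (a session sometimes appears more than once, and I don't want to count those duplicates) and then groups those counts by the hour in which they occurred.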

Currently, if I do this:

Stat.group("date_format(created_at, '%H')").count

it returns this:

=> {"00"=>100, "01"=>77, "02"=>80, "03"=>88, "04"=>96, "05"=>98, "06"=>104, "07"=>87, "08"=>80, "09"=>67, "10"=>92, "11"=>67, "12"=>83, "13"=>91, "14"=>72, "15"=>64, "16"=>61, "17"=>169, "18"=>90, "19"=>83, "20"=>119, "21"=>94, "22"=>95, "23"=>82}

That works well for returning the count of all rows per hour, which I do need.

But I also need to figure out how to combine it with something like this, which returns only the unique sessions:

Stat.select(:session_id).map(&:session_id).uniq

I messed around in MySQL and came up with this, which looks like what I need. But I can't figure out how to make it play nicely with Active Record.

SELECT COUNT(*) AS count_all, date_format(created_at, '%H') AS date_format_created_at_h,COUNT(DISTINCT session_id) AS session FROM my_db.stats GROUP BY date_format(created_at, '%H')

Can anyone shed some light on how to achieve this?

Thanks in advance.

Answer 1

Your query:

2.0.0-p195 :156 > Stat.group("strftime('%S',created_at)").count
(3.4ms)  SELECT COUNT(*) AS count_all, strftime('%S',created_at) AS strftime_s_created_at FROM "stats" GROUP BY strftime('%S',created_at)
=> {"24"=>105, "25"=>80, "26"=>88, "27"=>83, "28"=>86, "29"=>80, "30"=>84, "31"=>70, "32"=>68, "33"=>73, "34"=>123, "35"=>84, "36"=>74, "37"=>59, "38"=>80, "39"=>77, "40"=>82, "41"=>79, "42"=>88, "43"=>73, "44"=>64, "45"=>82, "46"=>86, "47"=>87, "48"=>37}

Use select with COUNT(DISTINCT(session_id)) to get the unique count:

2.0.0-p195 :157 > Stat.select("strftime('%S',created_at) as time").select('COUNT(DISTINCT(session_id)) as uniq').group("strftime('%S',created_at)").map{|x| {x.time => x.uniq}}.reduce(:merge)
Stat Load (5.4ms)  SELECT strftime('%S',created_at) as time, COUNT(DISTINCT(session_id)) as uniq FROM "stats" GROUP BY strftime('%S',created_at)
=> {"24"=>44, "25"=>53, "26"=>52, "27"=>59, "28"=>61, "29"=>61, "30"=>64, "31"=>52, "32"=>49, "33"=>57, "34"=>102, "35"=>69, "36"=>55, "37"=>42, "38"=>65, "39"=>59, "40"=>65, "41"=>63, "42"=>69, "43"=>56, "44"=>47, "45"=>67, "46"=>70, "47"=>69, "48"=>22}
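
The console session above appears to have been run against SQLite and groups by the seconds field ('%S'); with the MySQL setup from the question, the grouping expression would presumably be date_format(created_at, '%H') instead. To make clear what the grouped COUNT(DISTINCT session_id) computes, here is a minimal plain-Ruby sketch over in-memory rows. StatRow, ROWS, and unique_sessions_by_hour are hypothetical names used only for illustration; they are not part of the Stat model.

```ruby
# Hypothetical in-memory stand-in for rows of the stats table.
StatRow = Struct.new(:session_id, :created_at)

ROWS = [
  StatRow.new('a', Time.utc(2013, 6, 1, 0, 5)),
  StatRow.new('a', Time.utc(2013, 6, 1, 0, 40)), # same session again in hour 00
  StatRow.new('b', Time.utc(2013, 6, 1, 0, 59)),
  StatRow.new('a', Time.utc(2013, 6, 1, 17, 1)),
  StatRow.new('c', Time.utc(2013, 6, 1, 17, 2)),
  StatRow.new('c', Time.utc(2013, 6, 1, 17, 3))  # same session again in hour 17
].freeze

# Plain-Ruby equivalent of GROUP BY hour with COUNT(DISTINCT session_id):
# bucket rows by two-digit hour, then count distinct session ids per bucket.
def unique_sessions_by_hour(rows)
  rows.group_by { |row| row.created_at.strftime('%H') }
      .map { |hour, group| [hour, group.map(&:session_id).uniq.size] }
      .sort
      .to_h
end

p unique_sessions_by_hour(ROWS)
# => {"00"=>2, "17"=>2}
```

This pulls every row into Ruby, so it is only meant to show the logic; the SQL version does the same work in the database. On Rails 4 and later, Stat.group("date_format(created_at, '%H')").distinct.count(:session_id) should return the same kind of hash directly, and on Rails 3 the equivalent was count(:session_id, distinct: true). I have not run either against your schema.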

Stats model: get unique session IDs, then group by hour
