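I'm building a query that counts the rows (records) from the last 31 days, based on a timestamp field, and also counts the rows from the 31 days before that period. I want a single query that returns both counts. This is what I have so far: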
SELECT
COUNT(*) OVER(ORDER BY datetime DESC RANGE BETWEEN 2678400000 PRECEDING AND CURRENT ROW) AS rolling_avg_31_days,
COUNT(*) OVER(ORDER BY datetime DESC RANGE BETWEEN 5356800000 PRECEDING AND CURRENT ROW) AS rolling_avg_62_days
FROM `p`
ORDER BY rolling_avg_31_days DESC LIMIT 1
It returns some data, but not the data I was hoping for:
rolling_avg_31_days | rolling_avg_62_days
8,422,783 | 9,790,304
If I query the same table directly (rolling 62 days):
SELECT COUNT(*) FROM `p`
WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-5356800000 AND datetime < UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000
I get a value of 6,192,920.
I'm not sure what I'm doing wrong. Any help is much appreciated!
Answer 0 (score: 1)
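So the first query is correct: it gives you rolling counts (31 and 62 days) based on the timestamp field. However, because of ORDER BY rolling_avg_31_days DESC and LIMIT 1, you get the row with the largest rolling_avg_31_days, which is not necessarily the row with the most recent datetime.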
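The second query only counts the rows that fall between 62 and 31 days before the current timestamp. As explained above, that is not what the first query produces, hence the discrepancy.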
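To troubleshoot further, or to understand the difference, change ORDER BY rolling_avg_31_days DESC LIMIT 1 to ORDER BY datetime DESC LIMIT 1 and add datetime to the SELECT list. That lets you see whether the returned row falls on or near the current timestamp, so that the two results are comparable.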
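As a rough sketch of that troubleshooting step, the query below returns the most recent row together with its datetime, rather than the row with the largest count. It assumes, as in the question, that datetime holds Unix epoch milliseconds (2678400000 ms = 31 days). One change here goes beyond the suggestion and is my own assumption: the window is ordered ascending. In standard SQL window semantics, as far as I understand them, a descending window ORDER BY makes PRECEDING reach toward later timestamps, so ascending order is used to make the frame look back in time.

```sql
-- Sketch only: report the rolling counts for the most recent row.
-- Assumes `p`.datetime is an INT64 holding Unix epoch milliseconds.
SELECT
  datetime,
  -- Rows whose datetime lies within the 31 days up to and including this row.
  COUNT(*) OVER (
    ORDER BY datetime
    RANGE BETWEEN 2678400000 PRECEDING AND CURRENT ROW
  ) AS rolling_avg_31_days,
  -- Rows whose datetime lies within the 62 days up to and including this row.
  COUNT(*) OVER (
    ORDER BY datetime
    RANGE BETWEEN 5356800000 PRECEDING AND CURRENT ROW
  ) AS rolling_avg_62_days
FROM `p`
-- Pick the latest row instead of the row with the largest 31-day count.
ORDER BY datetime DESC
LIMIT 1
```

If the returned datetime is close to the current time, rolling_avg_62_days minus rolling_avg_31_days should be roughly comparable to the count from the plain WHERE query.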
Answer 1 (score: 0)
Regardless of the above, I decided to change the query to something simpler:
SELECT
(SELECT COUNT(DISTINCT(wasabi_user_id)) FROM `p` WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-5356800000 AND datetime < UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000) as _62days,
(SELECT COUNT(DISTINCT(wasabi_user_id)) FROM `p` WHERE datetime > UNIX_MILLIS(CURRENT_TIMESTAMP)-2678400000) AS _31days
FROM `mycujoo_kafka_public.v_web_event_pageviews` LIMIT 1
Thanks anyway, @Mikhail!
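For comparison, the same two distinct-user counts can probably be computed in a single pass over `p`, without the outer FROM ... LIMIT 1. This is a minimal sketch, not a tested replacement: the CTE name bounds and the column names start_31 and start_62 are hypothetical, and datetime is again assumed to be Unix epoch milliseconds.

```sql
-- Sketch only: both periods in one scan of `p`.
WITH bounds AS (
  SELECT
    UNIX_MILLIS(CURRENT_TIMESTAMP()) - 2678400000 AS start_31,  -- 31 days ago
    UNIX_MILLIS(CURRENT_TIMESTAMP()) - 5356800000 AS start_62   -- 62 days ago
)
SELECT
  -- Distinct users seen between 62 and 31 days ago (same strict bounds as above).
  COUNT(DISTINCT IF(datetime > start_62 AND datetime < start_31, wasabi_user_id, NULL)) AS _62days,
  -- Distinct users seen in the last 31 days.
  COUNT(DISTINCT IF(datetime > start_31, wasabi_user_id, NULL)) AS _31days
FROM `p`
CROSS JOIN bounds
-- Skip rows older than 62 days; they cannot contribute to either count.
WHERE datetime > start_62
```

IF returns NULL for rows outside a period, and COUNT(DISTINCT ...) ignores NULLs, so each column only counts users active in its own window.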