Question

I have data like this:

Date             User ID
2012-10-11         a
2012-10-11         b
2012-10-12         c
2012-10-12         d 
2012-10-13         e
2012-10-14         b
2012-10-14         e

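What I want to do is group by a trailing two-day range for each day (in my real query it will be 7 days) and get the count of distinct user IDs.

For example, I would like the result to look like this: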

Date             count(distinct userIDs)
2012-10-12         4
2012-10-13         3
2012-10-14         2

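For example, for 2012-10-12 I get 4 because I have 'a', 'b', 'c', and 'd': 'a' and 'b' come from the previous day, while 'c' and 'd' come from the same day, 2012-10-12.

Similarly, for 2012-10-13 I am looking at 2012-10-13 and 2012-10-12, and I get 'c', 'd', and 'e'.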
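To make the expected numbers concrete, here is a small Python sketch (not Teradata SQL) that applies the definition directly to the sample rows: for each date, collect the distinct users seen on that date or the day before. The name `rolling_distinct_users` and its `window_days` parameter are made up for illustration. Note that it also reports 2012-10-11, which has no previous day in the data and so is not in the desired output above.

```python
from datetime import date, timedelta

# Sample rows from the question: (date, user_id)
rows = [
    (date(2012, 10, 11), "a"),
    (date(2012, 10, 11), "b"),
    (date(2012, 10, 12), "c"),
    (date(2012, 10, 12), "d"),
    (date(2012, 10, 13), "e"),
    (date(2012, 10, 14), "b"),
    (date(2012, 10, 14), "e"),
]


def rolling_distinct_users(rows, window_days=2):
    """For each date present in the data, count the distinct users seen
    on that date or on the preceding (window_days - 1) days."""
    result = {}
    for day in sorted({d for d, _ in rows}):
        start = day - timedelta(days=window_days - 1)
        users = {u for d, u in rows if start <= d <= day}
        result[day] = len(users)
    return result


counts = rolling_distinct_users(rows)
for day, n in counts.items():
    print(day, n)
```

Setting `window_days=7` would correspond to the 7-day case mentioned for the real query.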

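The data type of the date column is DATE. I am using Teradata.

I have been trying to research this but could not find a straightforward answer that applies to my situation. :-/ Sorry if this is a duplicate. Your help is much appreciated. Thanks!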

Answer 1

I'm not completely familiar with Teradata syntax, so I'll use redbrick to show you the logic.

select date, count(distinct userid) records
from yourtable
where date >= dateadd(day, -2, current_date)
group by date
order by date

Edit starts here

On further review, if you replace

where date >= dateadd(day, -2, current_date)

with

where date >= current_date - 2

then you should be good to go.

Answer 2

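To do what you want, you effectively need to "multiply" the data, because each row can be included in two of the dates in the final aggregation.

I think the simplest way is the union all approach: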

select date, count(distinct userId)
from ((select date, UserId
       from t
      ) union all
      (select date + 1, UserId     -- combine with yesterday's data
       from t
      )
     ) t
group by date;

Because you are dealing with 7 days, here is another approach:

select (t.date + n), count(distinct t.UserId)
from t cross join
     (select 0 as n union all select 1 union all select 2 union all select 3 union all
      select 4 union all select 5 union all select 6
     ) n
group by t.date + n;
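As a rough illustration of the same "multiply the rows" idea outside the database, the Python sketch below copies every (date, user) row onto the dates `date + 0` through `date + window_days - 1` and then counts distinct users per resulting date. The function name `fan_out_counts` and the `sample` list are hypothetical, and this mirrors the logic of the queries rather than Teradata's execution. One thing it makes visible: the fan-out also produces dates past the last day in the data (here 2012-10-15), so you may want to filter those out.

```python
from collections import defaultdict
from datetime import date, timedelta


def fan_out_counts(rows, window_days):
    """Copy each (date, user) row onto date + 0 .. date + window_days - 1,
    then count the distinct users that land on each resulting date."""
    users_by_day = defaultdict(set)
    for d, user in rows:
        for n in range(window_days):
            users_by_day[d + timedelta(days=n)].add(user)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


# Sample data from the question
sample = [
    (date(2012, 10, 11), "a"),
    (date(2012, 10, 11), "b"),
    (date(2012, 10, 12), "c"),
    (date(2012, 10, 12), "d"),
    (date(2012, 10, 13), "e"),
    (date(2012, 10, 14), "b"),
    (date(2012, 10, 14), "e"),
]

fanned = fan_out_counts(sample, window_days=2)
for day, n in fanned.items():
    print(day, n)
```

With `window_days=2` this corresponds to the union all query; with `window_days=7` it corresponds to the cross join against the values 0 through 6.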

Group by date range (Teradata)
