Count grouped columns and group them by another column

Time: 2016-11-21 09:10:42

Tags: sql sql-server

I have a small problem. The data:

2016-11-09  0536B088-D3DE-4C0E-903F-C2463D0AAB7E
2016-11-09  866D70EC-93FD-4C30-BC54-C7B954F255BE
2016-11-09  6C090D6B-9842-4CB0-9E10-F9B941C8D3A1
2016-11-09  FB1DD63E-F098-4191-B8F4-BEA4F9776B54
2016-11-09  FB1DD63E-F098-4191-B8F4-BEA4F9776B54
2016-11-10  0536B088-D3DE-4C0E-903F-C2463D0AAB7E
2016-11-10  NULL
2016-11-10  0536B088-D3DE-4C0E-903F-C2463D0AAB7E
2016-11-11  0536B088-D3DE-4C0E-903F-C2463D0AAB7E
2016-11-11  0536B088-D3DE-4C0E-903F-C2463D0AAB7E

From this I want to count the UserId values, grouped by date. The result should look like this:

Date | Unique | Returning | New
..09  | 4      | 1         | 3
..10  | 2      | 1         | 1
..11  | 1      | 1         | 0

How can I do this? This is the query I have so far:

select 
    cast(EventTime as date) as 'Date', 
    count(distinct UserId) + count(distinct case when UserId is null then 1 end) as 'Unique users',
    0 as 'Returning users',
    0 as 'New users'
from 
    TelemetryData 
where 
    DiscountId = '5F8851DD-DF77-46DC-885E-46ECA93F021C' and EventName = 'DiscountClick'
group by 
    cast(EventTime as date)

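Unique users = distinct users, where NULL also counts as one!

Returning users = user IDs that clicked more than once: isnull(sum(case when UserId(here should be count) > 1 then 1 else 0 end), 1)

New users = users that clicked only once: isnull(sum(case when UserId(count also) = 1 then 1 else 0 end), 1)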

@EDIT: OK, both of your results are perfect. But now I need to integrate this with another query:

SELECT
    '5F8851DD-DF77-46DC-885E-46ECA93F021C',
    cast([dbo].[TelemetryData].[EventTime] as date) as 'Date',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountLike' then 1 else 0 end) as 'Likes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountDislike' then 1 else 0 end) as 'Dis likes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountSharing' then 1 else 0 end) as 'Shares',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end) as 'Views',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClick' then 1 else 0 end) as 'Clicks',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCode' then 1 else 0 end) as 'Downloaded codes',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountSave' then 1 else 0 end) as 'Saves',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClickWWW' then 1 else 0 end) as 'Page redirections',
    round(
        cast(sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountClick' then 1 else 0 end) as float)
        / cast(case when sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end) = 0
                    then 1
                    else sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountView' then 1 else 0 end)
               end as float)
        * 100, 2) as 'Average CTR',
    0 as 'Unique users',
    0 as 'New users',
    0 as 'Returning users',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCommentPositive' then 1 else 0 end) as 'Positive comments',
    sum(case when [dbo].[TelemetryData].[EventName] = 'DiscountCommentNegative' then 1 else 0 end) as 'Negative comments'
from
    [dbo].[TelemetryData]
where
    [dbo].[TelemetryData].[DiscountId] = '5F8851DD-DF77-46DC-885E-46ECA93F021C'
    and ([dbo].[TelemetryData].[EventName] = 'DiscountView'
      or [dbo].[TelemetryData].[EventName] = 'DiscountClick'
      or [dbo].[TelemetryData].[EventName] = 'DiscountDislike'
      or [dbo].[TelemetryData].[EventName] = 'DiscountCode'
      or [dbo].[TelemetryData].[EventName] = 'DiscountLike'
      or [dbo].[TelemetryData].[EventName] = 'DiscountSharing'
      or [dbo].[TelemetryData].[EventName] = 'DiscountClickWWW'
      or [dbo].[TelemetryData].[EventName] = 'DiscountSave'
      or [dbo].[TelemetryData].[EventName] = 'DiscountCommentPositive'
      or [dbo].[TelemetryData].[EventName] = 'DiscountCommentNegative')
group by
    cast([dbo].[TelemetryData].[EventTime] as date)
order by
    cast([dbo].[TelemetryData].[EventTime] as date) asc

Now it gets hard...

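5 answers:

Answer 0 (score: 1)

You want aggregated per-user information in the result. An obvious and simple solution is to group by date and user first, which gives you the click count per user and date, and only afterwards group by date alone.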

select 
  eventdate,
  count(*) as unique_users,
  count(case when cnt > 1 then 1 end) as returning_users,
  count(case when cnt = 1 then 1 end) as new_users
from
(
  select cast(eventtime as date) as eventdate, userid, count(*) as cnt
  from telemetrydata
  where ...
  group by cast(eventtime as date), userid
) date_user
group by eventdate;
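
The edit in the question asks how to combine these per-user counts with a wider per-day event summary. One possible approach, sketched below, is to compute each aggregate in its own common table expression and join them on the date. This is only a sketch under assumptions: the table variable `@TelemetryData`, its column types, the times of day, and the `DiscountView` rows are invented for illustration, and only two of the question's event columns are shown.

```sql
-- Hypothetical sample table: column names follow the question, types are assumed.
DECLARE @TelemetryData TABLE (
    EventTime  datetime,
    UserId     uniqueidentifier NULL,
    EventName  varchar(50),
    DiscountId uniqueidentifier
);
DECLARE @DiscountId uniqueidentifier = '5F8851DD-DF77-46DC-885E-46ECA93F021C';

-- Dates and user IDs come from the question; times and DiscountView rows are invented.
INSERT INTO @TelemetryData (EventTime, UserId, EventName, DiscountId) VALUES
('2016-11-10T09:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E', 'DiscountClick', @DiscountId),
('2016-11-10T10:00:00', NULL,                                   'DiscountClick', @DiscountId),
('2016-11-10T11:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E', 'DiscountClick', @DiscountId),
('2016-11-10T12:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E', 'DiscountView',  @DiscountId),
('2016-11-11T09:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E', 'DiscountClick', @DiscountId),
('2016-11-11T10:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E', 'DiscountClick', @DiscountId),
('2016-11-12T09:00:00', NULL,                                   'DiscountView',  @DiscountId);

WITH ClicksPerUser AS (
    -- One row per (day, user); rows with a NULL UserId form a single group.
    SELECT CAST(EventTime AS date) AS EventDate, UserId, COUNT(*) AS cnt
    FROM @TelemetryData
    WHERE DiscountId = @DiscountId AND EventName = 'DiscountClick'
    GROUP BY CAST(EventTime AS date), UserId
),
UsersPerDay AS (
    -- Collapse the per-user rows into per-day user counts.
    SELECT EventDate,
           COUNT(*)                            AS UniqueUsers,
           COUNT(CASE WHEN cnt > 1 THEN 1 END) AS ReturningUsers,
           COUNT(CASE WHEN cnt = 1 THEN 1 END) AS NewUsers
    FROM ClicksPerUser
    GROUP BY EventDate
),
EventsPerDay AS (
    -- Per-day event totals, abbreviated to two of the question's columns.
    SELECT CAST(EventTime AS date) AS EventDate,
           SUM(CASE WHEN EventName = 'DiscountView'  THEN 1 ELSE 0 END) AS Views,
           SUM(CASE WHEN EventName = 'DiscountClick' THEN 1 ELSE 0 END) AS Clicks
    FROM @TelemetryData
    WHERE DiscountId = @DiscountId
    GROUP BY CAST(EventTime AS date)
)
SELECT e.EventDate,
       e.Views,
       e.Clicks,
       COALESCE(u.UniqueUsers, 0)    AS [Unique users],
       COALESCE(u.NewUsers, 0)       AS [New users],
       COALESCE(u.ReturningUsers, 0) AS [Returning users]
FROM EventsPerDay e
LEFT JOIN UsersPerDay u ON u.EventDate = e.EventDate
ORDER BY e.EventDate;
```

The LEFT JOIN with COALESCE is a design choice: a day that has events but no clicks (2016-11-12 in this sample) still appears in the result, with zeros in the three user columns.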

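Answer 1 (score: 0)

Maybe I have not understood your question, but looking at your data it seems you need

select 
     cast(EventTime as date) as [date]
   , count(*) as [unique]                                  -- all rows for the day
   , (count(*) - count(distinct UserId)) as [returning]    -- rows beyond each user's first
   , count(distinct UserId) as [new]                       -- distinct non-NULL users
from TelemetryData
where UserId is not null
group by cast(EventTime as date)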

Answer 2 (score: 0)

Try the following query

select Date, uniques, returning, uniques-returning as new
from (    
    select Date, 
           sum(case when row_num = 1 then 1 else 0 end) uniques, 
           sum(case when row_num = 2 then 1 else 0 end) returning
    from(    
        select cast(EventTime as date) as Date, 
               ROW_NUMBER() over(partition by cast(EventTime as date), userid order by EventTime) row_num
        from TelemetryData) cte1    
    group by Date)cte2

Hope this helps.
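
To see why `row_num = 1` counts unique users and `row_num = 2` counts returning ones, it can help to run the numbering step on its own. A minimal sketch on the question's rows for 2016-11-10, assuming a hypothetical table variable `@TelemetryData` with a `datetime` column (the times of day are invented) and partitioning by calendar day:

```sql
-- Hypothetical sample table; dates and user IDs come from the question, times are invented.
DECLARE @TelemetryData TABLE (
    EventTime datetime,
    UserId    uniqueidentifier NULL
);

INSERT INTO @TelemetryData (EventTime, UserId) VALUES
('2016-11-10T09:00:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
('2016-11-10T10:30:00', NULL),
('2016-11-10T17:45:00', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E');

-- Number each user's clicks within a day: 1 for the first click, 2 for the second, and so on.
SELECT CAST(EventTime AS date) AS EventDate,
       UserId,
       ROW_NUMBER() OVER (PARTITION BY CAST(EventTime AS date), UserId
                          ORDER BY EventTime) AS row_num
FROM @TelemetryData;
```

Each (day, user) pair contributes exactly one row with `row_num = 1`, and only pairs with at least two clicks contribute a row with `row_num = 2`. Here the user 0536B088... gets rows numbered 1 and 2 and the NULL user gets a single row numbered 1, which gives 2 unique users and 1 returning user for that day.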

Answer 3 (score: 0)

Try it with a common table expression:

Setup

CREATE TABLE #TelemetryData
(
   EventTime  Date,
   UserId UNIQUEIDENTIFIER NULL
   )


INSERT INTO #TelemetryData
VALUES
('2016-11-09', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
('2016-11-09', '866D70EC-93FD-4C30-BC54-C7B954F255BE'),
('2016-11-09', '6C090D6B-9842-4CB0-9E10-F9B941C8D3A1'),
('2016-11-09', 'FB1DD63E-F098-4191-B8F4-BEA4F9776B54'),
('2016-11-09', 'FB1DD63E-F098-4191-B8F4-BEA4F9776B54'),
('2016-11-10', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
('2016-11-10',  NULL),
('2016-11-10', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
('2016-11-11', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E'),
('2016-11-11', '0536B088-D3DE-4C0E-903F-C2463D0AAB7E')

Query

;WITH CTE
AS
(
    SELECT EventTime, 
           UserId, 
           COUNT(*) cnt, 
           ROW_NUMBER() OVER (PARTITION BY EventTime ORDER BY EventTime) RN
    FROM #TelemetryData
    GROUP BY EventTime, UserId
)

SELECT EventTime, 
       MAX(RN) AS [Unique],
       SUM(CASE WHEN cnt > 1 THEN 1 ELSE 0 END) AS Returning, 
       SUM(CASE WHEN cnt = 1 THEN 1 ELSE 0 END) AS New
FROM CTE
GROUP BY EventTime

Result

EventTime   Unique  Returning  New
2016-11-09  4       1          3
2016-11-10  2       1          1
2016-11-11  1       1          0

Answer 4 (score: 0)

The following query should work:

select EventTime,
    max(DistinctRank) [Unique],
    -- count users with repeated clicks, not the number of extra clicks
    sum(case when CountOfDistinct > 1 then 1 else 0 end) Returning,
    max(DistinctRank) - sum(case when CountOfDistinct > 1 then 1 else 0 end) New
from
    (select distinct EventTime,
        UserId,
        -- dense_rank, so duplicate rows of one user leave no gaps in the numbering
        dense_rank() over (partition by EventTime order by UserId) DistinctRank,
        count(1) over (partition by EventTime, UserId) CountOfDistinct
    from TelemetryData) sub
group by EventTime

The subquery (run it on its own and see for yourself) returns the unique combinations of EventTime and UserId, together with the rank of each unique UserId within its date and the number of rows for each EventTime/UserId combination:

EventTime               UserId                               DistinctRank  CountOfDistinct
2016-11-09 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E    1             1
2016-11-09 00:00:00.000 6C090D6B-9842-4CB0-9E10-F9B941C8D3A1    2             1
2016-11-09 00:00:00.000 866D70EC-93FD-4C30-BC54-C7B954F255BE    3             1
2016-11-09 00:00:00.000 FB1DD63E-F098-4191-B8F4-BEA4F9776B54    4             2
2016-11-10 00:00:00.000 NULL                                    1             1
2016-11-10 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E    2             2
2016-11-11 00:00:00.000 0536B088-D3DE-4C0E-903F-C2463D0AAB7E    1             2

The outer query then takes the maximum DistinctRank per EventTime, which is the number of unique UserIds on that date, and counts the subquery rows whose CountOfDistinct is greater than 1, which is the number of returning users. The New column is simply the difference between Unique and Returning. The result is:

EventTime                 Unique  Returning  New
2016-11-09 00:00:00.000   4       1          3
2016-11-10 00:00:00.000   2       1          1
2016-11-11 00:00:00.000   1       1          0