Retention rate of the top 20% of spenders

Time: 2013-08-05 15:12:18

Tags: sql sql-server sql-server-2008

So, first, here is how we find the top 20% of spenders in 2010:

select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2010-01-01'
    and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc

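From this, I need to find the retention rate of those top 20% spenders over the following two years.

Meaning: which of the top 20% in 2010 stayed in the top 20% in 2011, and then again in 2012? Note: I only need the counts, first for 2010, then for 2011, then for 2012.

I know this would be easier if I could create another table, or pull from an Excel sheet, that lists only the top buyers. However, I don't have write access to the database, so I have to do everything in nested queries, or in whatever way you suggest. I'm still a beginner, so I don't know the best approach.

Thanks!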

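5 Answers:

Answer 0: (score: 4)

You have an interesting problem. Fundamentally, it is about migration between spending quintiles from one year to the next. I would solve it by looking at all the quintiles for the three years and seeing where people move.

First comes a summary of the data by year and email. The key function is ntile(). To be honest, I often use row_number() and count() to do the calculation myself, which is why those are in the CTE (but are not used afterwards):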

with YearSummary as (
      select year(OrderDate) as yr, o.BillEmail, SUM(o.total) as TotalSpent,
             count(o.OrderID) as TotalOrders,
             row_number() over (partition by year(OrderDate) order by sum(o.Total) desc) as seqnum,
             count(*) over (partition by year(OrderDate)) as NumInYear,
             ntile(5) over (partition by year(OrderDate) order by sum(o.Total) desc) as Quintile
      from dbo.tblOrder o with (nolock)
      where o.DomainProjectID=13 and o.BillEmail not like ''
      group by o.BillEmail, year(OrderDate)
     )
select q2010, q2011, q2012,
       count(*) as NumEmails,
       min(BillEmail), max(BillEmail)
from (select BillEmail,
             max(case when yr = 2010 then Quintile end) as q2010,
             max(case when yr = 2011 then Quintile end) as q2011,
             max(case when yr = 2012 then Quintile end) as q2012
      from YearSummary
      group by BillEmail
     ) ys
group by q2010, q2011, q2012
order by 1, 2, 3;

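The final step takes the multiple rows for each email and combines them into counts. Note that some emails will have no spending in some years, so their corresponding Quintile will be NULL. (For emails that spent in 2010, this can produce up to 180 rows, 5 * 6 * 6, rather than 125 rows, 5 * 5 * 5; emails with no orders in 2010 add further rows where q2010 is NULL.)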

I also included sample emails (min() and max()) in the final result, so that you can see a sample from each group.
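As for the row_number() and count() columns: they can stand in for ntile() when you want an explicit cutoff. Here is a minimal sketch, assuming the same table and filters as in the question. The InTop20Pct alias is made up for illustration. The integer comparison keeps the first floor(N / 5) emails per year, which can differ by one row from what ntile(5) or TOP (20) PERCENT would return.

```sql
-- Sketch only: flags the top fifth of spenders per year without ntile().
with YearSummary as (
      select year(o.OrderDate) as yr, o.BillEmail, sum(o.total) as TotalSpent,
             -- rank of this email within its year, biggest spender first
             row_number() over (partition by year(o.OrderDate)
                                order by sum(o.total) desc) as seqnum,
             -- number of grouped rows (distinct emails) in that year
             count(*) over (partition by year(o.OrderDate)) as NumInYear
      from dbo.tblOrder o
      where o.DomainProjectID = 13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate <  '2013-01-01'
      group by o.BillEmail, year(o.OrderDate)
     )
select yr, BillEmail, TotalSpent, seqnum, NumInYear,
       -- InTop20Pct is a hypothetical alias: 1 when the rank is within the first fifth
       case when seqnum * 5 <= NumInYear then 1 else 0 end as InTop20Pct
from YearSummary
order by yr, seqnum;
```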

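Note: for the retention rate, take the ratio of the NumEmails count in the (1, 1, 1) row, that is, the top tile in all three years, to the sum of NumEmails over all rows where q2010 = 1.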
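That ratio can also be computed directly in SQL. The following is only a sketch under the same assumptions as the query above (table dbo.tblOrder, DomainProjectID = 13). The PerEmail CTE and the output aliases Top2010, RetainedIn2011, RetainedIn2011And2012 and RetentionPct are names invented here, not part of the original schema.

```sql
-- Sketch only: counts the 2010 top quintile and how many of them stay on top.
with YearSummary as (
      select year(o.OrderDate) as yr, o.BillEmail,
             ntile(5) over (partition by year(o.OrderDate)
                            order by sum(o.total) desc) as Quintile
      from dbo.tblOrder o
      where o.DomainProjectID = 13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate <  '2013-01-01'
      group by o.BillEmail, year(o.OrderDate)
     ),
     PerEmail as (  -- hypothetical helper CTE: one row per email
      select BillEmail,
             max(case when yr = 2010 then Quintile end) as q2010,
             max(case when yr = 2011 then Quintile end) as q2011,
             max(case when yr = 2012 then Quintile end) as q2012
      from YearSummary
      group by BillEmail
     )
select count(*) as Top2010,
       sum(case when q2011 = 1 then 1 else 0 end) as RetainedIn2011,
       sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end) as RetainedIn2011And2012,
       -- nullif avoids a divide-by-zero if nobody qualifies in 2010
       100.0 * sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end)
             / nullif(count(*), 0) as RetentionPct
from PerEmail
where q2010 = 1;
```

A NULL quintile (no spending that year) fails the `= 1` test, so those emails count as not retained.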

Answer 1: (score: 0)

Try this:

;WITH top_2010 AS 
(
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by TotalSpent desc -- without ORDER BY, TOP returns an arbitrary 20%
), 
top_2011 AS 
(
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2011-01-01'
        and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by TotalSpent desc
), 
top_2012 AS 
(
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2012-01-01'
        and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by TotalSpent desc
)
SELECT top_2010.*, 
    ISNULL(top_2011.TotalSpent, 0) AS [TotalSpent_2011],ISNULL(top_2011.TotalOrders, 0) AS [TotalOrders_2011] ,
    ISNULL(top_2012.TotalSpent, 0) AS [TotalSpent_2012],ISNULL(top_2012.TotalOrders, 0) AS [TotalOrders_2012]
FROM top_2010
LEFT JOIN top_2011 ON top_2010.BillEmail = top_2011.BillEmail 
LEFT JOIN top_2012 ON top_2010.BillEmail = top_2012.BillEmail 
WHERE top_2011.BillEmail IS NOT NULL OR top_2012.BillEmail IS NOT NULL
order by top_2010.TotalSpent desc

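Note that I am using LEFT JOIN together with the WHERE filter, so you see everyone from the 2010 list who is also in the 2011 OR the 2012 list.

If you only need those who are in both 2011 AND 2012, you can change to INNER JOIN.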

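Answer 2: (score: 0)

You can do this with common table expressions. Create a top-20% list for each year, then inner join the lists to find out which customers were in the top quintile in all three years. Because you only want records that appear in all three years, you should not use left joins.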

WITH Top2010 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2010-01-01'
    and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2011 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2011-01-01'
    and o.OrderDate < '2012-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2012 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2012-01-01'
    and o.OrderDate < '2013-01-01'
group by o.BillEmail
order by TotalSpent desc
)

SELECT Top2010.BillEmail -- plus whatever other columns you want
FROM Top2010
INNER JOIN Top2011 ON Top2010.BillEmail = Top2011.BillEmail
INNER JOIN Top2012 ON Top2012.BillEmail = Top2011.BillEmail

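Answer 3: (score: 0)

Personally, I would use several CTEs, one per year. I would also name them more generically (rather than embedding the year in the names everywhere). Once we have the result sets, we can use EXISTS to check for people who appear in all 3 periods.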

-- Get the 1st Jan in the  current year
DECLARE @current_year date = DateAdd(yy, DateDiff(yy, 0, Current_Timestamp), 0);

; WITH highest_spenders_2_years_ago AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy, -2, @current_year)
  AND    o.orderDate <  DateAdd(yy, -1, @current_year)
)
, highest_spenders_last_year AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy, -1, @current_year)
  AND    o.orderDate <  DateAdd(yy,  0, @current_year)
)
, highest_spenders_this_year AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy,  0, @current_year)
  AND    o.orderDate <  DateAdd(yy,  1, @current_year)
)
SELECT *
FROM   highest_spenders_this_year
WHERE  EXISTS (
         SELECT *
         FROM   highest_spenders_last_year
         WHERE  BillEmail = highest_spenders_this_year.BillEmail
       )
AND    EXISTS (
         SELECT *
         FROM   highest_spenders_2_years_ago
         WHERE  BillEmail = highest_spenders_this_year.BillEmail
       )
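The `<your_query>` placeholders are schematic: with the query from the question, the date filter has to sit inside each query's WHERE clause, before GROUP BY. A possible expansion, assuming the question's table, columns and DomainProjectID = 13 filter, might look like this (a sketch, not a tested implementation):

```sql
-- Sketch only: the placeholders filled in with the question's query.
-- 1st Jan of the current year
DECLARE @current_year date = DateAdd(yy, DateDiff(yy, 0, Current_Timestamp), 0);

; WITH highest_spenders_2_years_ago AS (
  SELECT TOP (20) PERCENT o.BillEmail, SUM(o.total) AS TotalSpent
  FROM   dbo.tblOrder o
  WHERE  o.DomainProjectID = 13
  AND    o.BillEmail NOT LIKE ''
  AND    o.OrderDate >= DateAdd(yy, -2, @current_year)
  AND    o.OrderDate <  DateAdd(yy, -1, @current_year)
  GROUP BY o.BillEmail
  ORDER BY TotalSpent DESC  -- TOP needs ORDER BY to pick the highest spenders
)
, highest_spenders_last_year AS (
  SELECT TOP (20) PERCENT o.BillEmail, SUM(o.total) AS TotalSpent
  FROM   dbo.tblOrder o
  WHERE  o.DomainProjectID = 13
  AND    o.BillEmail NOT LIKE ''
  AND    o.OrderDate >= DateAdd(yy, -1, @current_year)
  AND    o.OrderDate <  @current_year
  GROUP BY o.BillEmail
  ORDER BY TotalSpent DESC
)
, highest_spenders_this_year AS (
  SELECT TOP (20) PERCENT o.BillEmail, SUM(o.total) AS TotalSpent
  FROM   dbo.tblOrder o
  WHERE  o.DomainProjectID = 13
  AND    o.BillEmail NOT LIKE ''
  AND    o.OrderDate >= @current_year
  AND    o.OrderDate <  DateAdd(yy, 1, @current_year)
  GROUP BY o.BillEmail
  ORDER BY TotalSpent DESC
)
-- Emails that are in the top 20% in all three periods
SELECT t.BillEmail, t.TotalSpent
FROM   highest_spenders_this_year t
WHERE  EXISTS (SELECT * FROM highest_spenders_last_year l
               WHERE  l.BillEmail = t.BillEmail)
AND    EXISTS (SELECT * FROM highest_spenders_2_years_ago a
               WHERE  a.BillEmail = t.BillEmail);
```

Because the windows are relative to the current date, this covers 2010 to 2012 only when run during 2012; for a fixed period, literal dates as in the question are simpler.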

Answer 4: (score: 0)

SELECT Base2010.BillEmail, 
    -- an unmatched LEFT JOIN yields NULL, not '', so test with IS NULL
    CASE WHEN retention2011.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2011, 
    CASE WHEN retention2012.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2012
FROM 
    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by TotalSpent desc -- without ORDER BY, TOP returns an arbitrary 20%
    ) AS Base2010

LEFT JOIN
    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2011-01-01'
        and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by TotalSpent desc
    ) AS retention2011
ON Base2010.BillEmail = retention2011.BillEmail

LEFT JOIN
    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2012-01-01'
        and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by TotalSpent desc
    ) AS retention2012
ON Base2010.BillEmail = retention2012.BillEmail