So, first, this is how we find the top 20% of spenders in 2010:
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2010-01-01'
and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc
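From this, I need to find the retention rate of the top 20% of spenders over the following two years.
That is, which of the 2010 top 20% stayed in the top 20% in 2011, and then stayed there again in 2012? Note: I need the count for 2010, then for 2011, then for 2012.
I know this would be easier if I could create another table, or pull from an Excel sheet that lists only the top buyers. However, I don't have write access to the database, so I have to do everything in nested queries, or in whatever other way you suggest. I'm still a beginner, so I don't know the best approach.
Thanks!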
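Answer 0 (score: 4)
You have an interesting problem. Fundamentally, it is about how customers migrate between spending quintiles from one year to the next. I would approach it by computing the quintiles for all three years and seeing where people move.
The first step is a summary of the data by year and email. The key function is ntile(). To be honest, I often do this calculation myself with row_number() and count(), which is why those appear in the CTE (even though they are not used afterwards):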
with YearSummary as (
select year(OrderDate) as yr, o.BillEmail, SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders,
row_number() over (partition by year(OrderDate) order by sum(o.Total) desc) as seqnum,
count(*) over (partition by year(OrderDate)) as NumInYear,
ntile(5) over (partition by year(OrderDate) order by sum(o.Total) desc) as Quintile
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13 and o.BillEmail not like ''
group by o.BillEmail, year(OrderDate)
)
select q2010, q2011, q2012,
count(*) as NumEmails,
min(BillEmail), max(BillEmail)
from (select BillEmail,
max(case when yr = 2010 then Quintile end) as q2010,
max(case when yr = 2011 then Quintile end) as q2011,
max(case when yr = 2012 then Quintile end) as q2012
from YearSummary
group by BillEmail
) ys
group by q2010, q2011, q2012
order by 1, 2, 3;
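The final step takes the multiple rows for each email and combines them into counts. Note that some emails will have no spending in some years, so the corresponding Quintile will be NULL (this should actually produce more like 180 rows, 5 * 6 * 6, rather than 125 rows, 5 * 5 * 5).
I also added sample emails to the final result (min() and max()) so you can see examples from each group.
Note: for the retention rate, take the count for (1, 1, 1), that is, emails in the top tile in all three years, and divide it by the total count for the top quintile in 2010.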
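As a rough sketch of that ratio, the query below reuses the same quintile logic and reports the counts and the retention rate directly. The three-year date filter and the output aliases (Top2010, RetainedIn2011, RetainedThrough2012, RetentionRate) are assumptions of mine rather than part of the query above, and it has not been run against the real table:

```sql
with YearSummary as (
    -- One row per email per year, with its spending quintile within that year
    select year(o.OrderDate) as yr, o.BillEmail,
           ntile(5) over (partition by year(o.OrderDate)
                          order by sum(o.total) desc) as Quintile
    from dbo.tblOrder o
    where o.DomainProjectID = 13
      and o.BillEmail not like ''
      and o.OrderDate >= '2010-01-01'   -- assumption: only 2010-2012 are of interest
      and o.OrderDate <  '2013-01-01'
    group by o.BillEmail, year(o.OrderDate)
),
PerEmail as (
    -- Pivot to one row per email; a year without orders yields NULL
    select BillEmail,
           max(case when yr = 2010 then Quintile end) as q2010,
           max(case when yr = 2011 then Quintile end) as q2011,
           max(case when yr = 2012 then Quintile end) as q2012
    from YearSummary
    group by BillEmail
)
select count(*) as Top2010,
       sum(case when q2011 = 1 then 1 else 0 end) as RetainedIn2011,
       sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end) as RetainedThrough2012,
       -- 1.0 forces decimal division; nullif avoids dividing by zero
       1.0 * sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end)
           / nullif(count(*), 0) as RetentionRate
from PerEmail
where q2010 = 1;
```

Filtering on q2010 = 1 keeps only the 2010 top quintile, so every count is relative to that starting group.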
Answer 1 (score: 0)
Try this:
;WITH top_2010 AS
(
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2010-01-01'
and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc -- without ORDER BY, TOP returns an arbitrary 20%
),
top_2011 AS
(
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2011-01-01'
and o.OrderDate < '2012-01-01'
group by o.BillEmail
order by TotalSpent desc -- without ORDER BY, TOP returns an arbitrary 20%
),
top_2012 AS
(
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2012-01-01'
and o.OrderDate < '2013-01-01'
group by o.BillEmail
order by TotalSpent desc -- without ORDER BY, TOP returns an arbitrary 20%
)
SELECT top_2010.*,
ISNULL(top_2011.TotalSpent, 0) AS [TotalSpent_2011],ISNULL(top_2011.TotalOrders, 0) AS [TotalOrders_2011] ,
ISNULL(top_2012.TotalSpent, 0) AS [TotalSpent_2012],ISNULL(top_2012.TotalOrders, 0) AS [TotalOrders_2012]
FROM top_2010
LEFT JOIN top_2011 ON top_2010.BillEmail = top_2011.BillEmail
LEFT JOIN top_2012 ON top_2010.BillEmail = top_2012.BillEmail
WHERE top_2011.BillEmail IS NOT NULL OR top_2012.BillEmail IS NOT NULL
order by top_2010.TotalSpent desc
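Note that I am using LEFT JOIN, so you see the customers who were also in the top 20% in 2011 OR 2012.
If you need the ones who were in the top 20% in 2011 AND 2012, you can switch to INNER JOIN.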
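If counts are needed rather than a list of customers, the same three CTEs can feed a single aggregate. This is only a sketch: it assumes each CTE is ordered by TotalSpent desc so that TOP picks the highest spenders, and the output aliases (Top2010, AlsoTop2011, TopAllThreeYears) are illustrative names, not something from the original query:

```sql
;WITH top_2010 AS (
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2010-01-01' and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by TotalSpent desc
),
top_2011 AS (
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2011-01-01' and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by TotalSpent desc
),
top_2012 AS (
    select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2012-01-01' and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by TotalSpent desc
)
SELECT COUNT(*) AS Top2010,                      -- every 2010 top spender
       COUNT(top_2011.BillEmail) AS AlsoTop2011, -- COUNT(column) skips unmatched (NULL) rows
       SUM(CASE WHEN top_2011.BillEmail IS NOT NULL
                 AND top_2012.BillEmail IS NOT NULL
                THEN 1 ELSE 0 END) AS TopAllThreeYears
FROM top_2010
LEFT JOIN top_2011 ON top_2010.BillEmail = top_2011.BillEmail
LEFT JOIN top_2012 ON top_2010.BillEmail = top_2012.BillEmail;
```

Because each CTE groups by BillEmail, every email appears at most once per year, so the joins do not inflate the counts.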
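Answer 2 (score: 0)
You can do this with common table expressions. Create a top-20% list for each year, then inner join the lists to find which customers were in the top quintile in all three years. Because you only want records that appear in all three years, you should not use a left join.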
WITH Top2010 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2010-01-01'
and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2011 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2011-01-01'
and o.OrderDate < '2012-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2012 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2012-01-01'
and o.OrderDate < '2013-01-01'
group by o.BillEmail
order by TotalSpent desc
)
SELECT Top2010.BillEmail -- plus whatever other columns you want
FROM Top2010
INNER JOIN Top2011 ON Top2010.BillEmail = Top2011.BillEmail
INNER JOIN Top2012 ON Top2012.BillEmail = Top2011.BillEmail
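Answer 3 (score: 0)
Personally, I would use several CTEs, one per year. I would also name them more generally (rather than embedding the year everywhere). Once you have the result sets, you can use EXISTS to check for people who appear in all 3 periods.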
-- Get the 1st Jan in the current year
DECLARE @current_year date = DateAdd(yy, DateDiff(yy, 0, Current_Timestamp), 0);
; WITH highest_spenders_2_years_ago AS (
<your_query>
WHERE o.orderDate >= DateAdd(yy, -2, @current_year)
AND o.orderDate < DateAdd(yy, -1, @current_year)
)
, highest_spenders_last_year AS (
<your_query>
WHERE o.orderDate >= DateAdd(yy, -1, @current_year)
AND o.orderDate < DateAdd(yy, 0, @current_year)
)
, highest_spenders_this_year AS (
<your_query>
WHERE o.orderDate >= DateAdd(yy, 0, @current_year)
AND o.orderDate < DateAdd(yy, 1, @current_year)
)
SELECT *
FROM highest_spenders_this_year
WHERE EXISTS (
SELECT *
FROM highest_spenders_last_year
WHERE BillEmail = highest_spenders_this_year.BillEmail
)
AND EXISTS (
SELECT *
FROM highest_spenders_2_years_ago
WHERE BillEmail = highest_spenders_this_year.BillEmail
)
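For reference, one of the <your_query> placeholders filled in with the query from the question might look like the sketch below. The relative date range replaces the hard-coded 2010 bounds, and the ORDER BY inside the CTE is my own addition so that TOP selects the highest spenders; the other two CTEs would follow the same pattern with their own offsets:

```sql
-- Get the 1st Jan in the current year
DECLARE @current_year date = DateAdd(yy, DateDiff(yy, 0, Current_Timestamp), 0);

; WITH highest_spenders_last_year AS (
    SELECT TOP (20) PERCENT o.BillEmail,
           SUM(o.total) AS TotalSpent,
           COUNT(o.OrderID) AS TotalOrders
    FROM dbo.tblOrder o
    WHERE o.DomainProjectID = 13
      AND o.BillEmail NOT LIKE ''
      -- the date filter joins the existing WHERE clause, before GROUP BY
      AND o.OrderDate >= DateAdd(yy, -1, @current_year)
      AND o.OrderDate <  @current_year
    GROUP BY o.BillEmail
    ORDER BY TotalSpent DESC
)
SELECT *
FROM highest_spenders_last_year;
```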
Answer 4 (score: 0)
SELECT Base2010.BillEmail,
-- an unmatched LEFT JOIN row yields NULL, not an empty string
CASE WHEN retention2011.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2011,
CASE WHEN retention2012.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2012
FROM
(select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2010-01-01'
and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc
) AS Base2010
LEFT JOIN
(select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2011-01-01'
and o.OrderDate < '2012-01-01'
group by o.BillEmail
order by TotalSpent desc
) AS retention2011
ON Base2010.BillEmail = retention2011.BillEmail
LEFT JOIN
(select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
and o.BillEmail not like ''
and o.OrderDate >= '2012-01-01'
and o.OrderDate < '2013-01-01'
group by o.BillEmail
order by TotalSpent desc
) AS retention2012
ON Base2010.BillEmail = retention2012.BillEmail