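Hello, I have a SQL query that I run regularly, but it returns far too much data.
For context, we carry about 3,000 items across 30 product categories and 50 sub-categories (a parent-child relationship). We sell them in thousands of stores, and our database captures weekly sales for every product in every store. We store several years of data.
Currently my query returns all records, whereas I want to limit it to the top 10 selling items based on the sum of unit sales over the most recent 52 weeks (my WHERE clause specifies the 52 weeks, but I still need the weekly detail that I pull).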
SELECT
store.store_id,
store.sales_rep,
store.sales_rep_manager,
prod.category,
prod.sub_category,
prod.item,
sales.week_id,
sum(sales.units) as "UNITS SOLD",
sum(sales.dollars) as "DOLLARS SOLD"
...
GROUP BY
store.store_id,
store.sales_rep,
store.sales_rep_manager,
prod.category,
prod.sub_category,
prod.item,
sales.week_id
ORDER BY
7 desc
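I think I should be using a TOP statement, but everything I have tried limits the whole pull to the overall top 10.
What I would like to see is the top 10 items by unit velocity over the selected date range, but for each store and sub-category:
Store 1  Category  Sub-category  Top-selling item #1  Top-selling item #2  Top-selling item #3  ...  Top-selling item #10
For now I have connected my query to Excel and told my pivot table to filter to the top ten items only.
My problem with this solution is that I am pulling in a ton more data than I need, which makes the file sluggish and too large, and the query also takes a long time to complete.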
Answer 0 (score: 1)
You can quite easily limit the results to the top items by total sales across the result set:
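-- note: SQL Server does not allow ORDER BY inside a CTE, so drop it from the wrapped query
with q as (<your query here>)
select q.*
from (select q.*,
             -- rank the highest total first
             dense_rank() over (order by TotalUS desc) as rnk
      from (select q.*,
                   -- total units per item, repeated on every weekly row
                   sum("UNITS SOLD") over (partition by item) as TotalUS
            from q
           ) q
     ) q
where rnk <= 10;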
Getting it for the last year is a bit trickier:
with q as (<your query here>)
select q.*
from (select q.*,
             -- rank the highest 52-week total first
             dense_rank() over (order by TotalUS desc) as rnk
      from (select q.*,
                   sum(last_52weeks) over (partition by item) as TotalUS
            from (select q.*,
                         -- count units only for the 52 most recent weeks of each item
                         (case when dense_rank() over (partition by item order by week_id desc) <= 52
                               then "UNITS SOLD" else 0
                          end) as last_52weeks
                  from q
                 ) q
           ) q
     ) q
where rnk <= 10;
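Both queries above rank items across the whole result set. Since the question asks for the top 10 within each store and sub-category, one possible adaptation is to add those columns to the window partitions. The following is only a sketch: the inline sample rows and the column names store_id, sub_category, item, week_id and units_sold are assumptions standing in for the real query, and the cutoff is 2 rather than 10 so that the tiny sample shows the filter at work.

```sql
-- Hypothetical sample rows standing in for the weekly result set q
with q as (
    select *
    from (values
        (1, 'A1', 'widget', 1, 10),
        (1, 'A1', 'widget', 2, 12),
        (1, 'A1', 'gadget', 1, 30),
        (1, 'A1', 'gadget', 2,  1),
        (1, 'A1', 'gizmo',  1,  3),
        (2, 'A1', 'gizmo',  1, 50),
        (2, 'A1', 'widget', 1,  4)
    ) as v (store_id, sub_category, item, week_id, units_sold)
)
select r.*
from (select t.*,
             -- the ranking restarts for every store / sub-category pair
             dense_rank() over (partition by store_id, sub_category
                                order by total_units desc) as rnk
      from (select q.*,
                   -- total per item within its store and sub-category,
                   -- repeated on every weekly row so the detail is kept
                   sum(units_sold) over (partition by store_id, sub_category, item) as total_units
            from q
           ) t
     ) r
where rnk <= 2   -- use 10 against the real data
order by store_id, sub_category, rnk, item, week_id;
```

Because the total is computed with a window function instead of a GROUP BY, every weekly row of a qualifying item survives the filter, which is what the pivot table needs.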
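Answer 1 (score: 0)
I often reach for cascading CTEs to solve these kinds of problems. I find that this format can handle fairly complex problems while still staying readable. The data flows from one CTE to the next, and a SELECT statement is issued at the end.
Here is an example I put together for this case. The data itself is a bit questionable, since store_id 5 is really the only one that sold more than 10 items, but it is only a demonstration, so hopefully you can still get the big picture. Obviously your data structure is quite different, but you should be able to adapt this as needed to work with your actual setup: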
--===================================================================
-- Create and populate a table for demonstration purposes only:
--===================================================================
IF OBJECT_ID('tempdb..#Sales') IS NOT NULL DROP TABLE #Sales;
CREATE TABLE #Sales (
item_id INT,
category VARCHAR(10),
sub_category VARCHAR(10),
store_id INT,
week_id INT,
units INT,
dollars MONEY
);
INSERT INTO #Sales
VALUES (1, 'A', 'A1', 1, 1, 10, 50),
(1, 'A', 'A1', 2, 1, 10, 50),
(1, 'A', 'A1', 3, 1, 10, 50),
(1, 'A', 'A1', 4, 1, 10, 50),
(1, 'A', 'A1', 5, 1, 20, 50),
(2, 'B', 'B1', 1, 1, 20, 50),
(2, 'B', 'B1', 2, 1, 20, 50),
(2, 'B', 'B1', 3, 1, 20, 50),
(2, 'B', 'B1', 4, 1, 20, 50),
(2, 'B', 'B1', 5, 1, 20, 50),
(3, 'A', 'A1', 5, 1, 40, 50),
(4, 'A', 'A1', 5, 1, 10, 50),
(5, 'A', 'A1', 5, 1, 5, 50),
(6, 'A', 'A1', 5, 1, 100, 50),
(7, 'A', 'A1', 5, 1, 95, 50),
(8, 'A', 'A1', 5, 1, 35, 50),
(9, 'A', 'A1', 5, 1, 15, 50),
(10, 'A', 'A1', 5, 1, 11, 50),
(11, 'A', 'A1', 5, 1, 12, 50),
(12, 'A', 'A1', 5, 1, 49, 50),
(12, 'A', 'A1', 5, 1, 150, 50);
--===================================================================
-- The actual query starts here:
-- (note that the following is a single statement)
--===================================================================
WITH AggregatedSales AS (
-- This CTE will give you the totals for each store, for each item and category / sub-category:
SELECT
s.store_id,
s.category,
s.sub_category,
s.item_id,
--s.week_id, -- If you want to see the combined data for the entire date range, don't include week here
SUM(s.units) [total_units_sold],
SUM(s.dollars) [total_dollars_sold]
FROM #Sales s
WHERE s.week_id BETWEEN 1 AND 52 -- Adjust these to match your actual range
GROUP BY
s.store_id,
s.category,
s.sub_category,
s.item_id
--s.week_id
),
RankedSales AS (
-- This will assign a ranking to each of the records from the previous CTE.
-- The ranking is reset for each store, and ranks higher number of units sold toward the top.
SELECT
a.*,
DENSE_RANK() OVER (
PARTITION BY a.store_id
ORDER BY a.total_units_sold DESC
) [ranking]
FROM AggregatedSales a
)
-- Now we just select all of the "TOP 10" ranked items here:
-- (the WHERE clause is doing all the work in this case, so we don't need an actual TOP)
SELECT
rs.*
FROM RankedSales rs
WHERE rs.ranking <= 10
ORDER BY
rs.store_id,
rs.ranking;
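The query above ranks per store and returns one row per item for the whole date range. If the weekly detail is still needed and the ranking should restart for each store and sub-category, one option is to rank the aggregated totals and then join the ranked items back to the weekly rows. This is a rough sketch under assumptions: #WeeklySales and its rows are hypothetical demo data, and the cutoff is 2 instead of 10 to keep the sample small.

```sql
-- Hypothetical demo rows: one store, one sub-category, three items, two weeks
IF OBJECT_ID('tempdb..#WeeklySales') IS NOT NULL DROP TABLE #WeeklySales;
CREATE TABLE #WeeklySales (
    store_id     INT,
    sub_category VARCHAR(10),
    item_id      INT,
    week_id      INT,
    units        INT
);
INSERT INTO #WeeklySales
VALUES (1, 'A1', 1, 1, 10), (1, 'A1', 1, 2, 15),
       (1, 'A1', 2, 1, 30), (1, 'A1', 2, 2,  5),
       (1, 'A1', 3, 1,  2), (1, 'A1', 3, 2,  3);

WITH ItemTotals AS (
    -- Totals per store, sub-category and item over the date range
    SELECT store_id, sub_category, item_id, SUM(units) AS total_units
    FROM #WeeklySales
    WHERE week_id BETWEEN 1 AND 52
    GROUP BY store_id, sub_category, item_id
),
RankedItems AS (
    -- Ranking restarts for each store / sub-category pair
    SELECT t.*,
           DENSE_RANK() OVER (
               PARTITION BY t.store_id, t.sub_category
               ORDER BY t.total_units DESC
           ) AS ranking
    FROM ItemTotals t
)
-- Join back to the weekly rows so the detail is kept for the top items only
SELECT w.store_id, w.sub_category, w.item_id, w.week_id, w.units,
       r.total_units, r.ranking
FROM #WeeklySales w
JOIN RankedItems r
  ON  r.store_id     = w.store_id
  AND r.sub_category = w.sub_category
  AND r.item_id      = w.item_id
WHERE r.ranking <= 2              -- use 10 against the real data
  AND w.week_id BETWEEN 1 AND 52
ORDER BY w.store_id, w.sub_category, r.ranking, w.item_id, w.week_id;
```

With these rows, item 2 (35 units) and item 1 (25 units) keep both of their weekly rows, while item 3 (5 units) is filtered out before the data ever reaches Excel.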