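I am building a simple data warehouse that lets me query tables to observe periodic (say, weekly) changes in the data, as well as changes in those changes (e.g. the week-over-week change in the weekly sales difference).
For simplicity, I will present a heavily simplified (almost trivial) version of the tables I am using. The sales data table is a view with the following structure: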
CREATE TABLE sales_data (
    sales_time date NOT NULL,
    sales_amt double NOT NULL
)
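For the purposes of this question, I have left out other fields you would expect to see, such as product_id and sales_person_id, because they are not directly relevant. AFAICT, the only fields the queries will use are sales_time and sales_amt (unless I am mistaken).
I also have a date dimension table with the following structure: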
CREATE TABLE date_dimension (
    id integer NOT NULL,
    datestamp date NOT NULL,
    day_part integer NOT NULL,
    week_part integer NOT NULL,
    month_part integer NOT NULL,
    qtr_part integer NOT NULL,
    year_part integer NOT NULL
);
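which partitions dates into reporting ranges.
I need to write queries that will allow me to do the following:
1. Return the period-over-period change in sales_amt for a specified period. For example, the change between today's sales and the sales N days ago, where N is a positive integer (N == 7 in this case).
2. Return the change in the change of sales_amt for a specified period. In (1) we calculated the week-over-week change. Now we want to know how that change differs from the (week-over-week) change calculated the previous week.
I am stuck at this point, as SQL is my weakest skill. I would be grateful if an SQL master could explain how to write these queries in a DB-agnostic way (i.e. using ANSI SQL).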
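Answer 0 (score: 5)
As noted in the comments above, I may not understand your model, so here is a simple starting point.
Now, if I wanted sales by week for calendar year 2010: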
select
      CalendarYearWeek
    , sum(SalesAmount)
from factSales as f
join dimDate as d on d.DateKey = f.DateKey
where Year = 2010
group by CalendarYearWeek
CalendarYearWeek is a column in dimDate, varchar(8), e.g. '2010-w03'; Year is also a column in dimDate, an integer.
Not sure if this is close to what you are looking for, but it may be a start.
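EDIT
dimDate also has the following columns:
WeekNumberInEpoch, integer - increments from some epoch date in the past. All rows in dimDate that fall in the same week have the same WeekNumberInEpoch.
DayOfWeek, varchar(10) - 'sunday', 'monday', ...
DayNumberInWeek, integer - 1-7
This uses CTEs and should work with recent versions of PostgreSQL, SQL Server, Oracle, and DB2. For other systems, you can wrap the CTE (q_00) in a subquery.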
-- for week to previous week
with
q_00 as (
    select
          WeekNumberInEpoch
        , sum(SalesAmount) as Amount
    from factSale as f
    join dimDate as d on d.DateKey = f.DateKey
    where CalendarYear = 2010
    group by WeekNumberInEpoch
)
select
      a.WeekNumberInEpoch
    , a.Amount as ThisWeekSales
    , b.Amount as LastWeekSales
    , a.Amount - b.Amount as Difference
from q_00 as a
join q_00 as b on b.WeekNumberInEpoch = a.WeekNumberInEpoch - 1
order by a.WeekNumberInEpoch desc ;
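For engines without CTE support, the week-to-previous-week query above can be rewritten with q_00 inlined as a derived table on both sides of the join. The sketch below runs that form through Python's sqlite3 module purely so it can be tried end to end. The miniature factSale / dimDate tables and the sample rows are assumptions made up for illustration, not the real model.

```python
import sqlite3

# Hypothetical miniature of the factSale / dimDate tables, just enough
# columns for the week-to-previous-week query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table dimDate (
        DateKey           integer primary key,
        CalendarYear      integer not null,
        WeekNumberInEpoch integer not null
    );
    create table factSale (
        DateKey     integer not null,
        SalesAmount integer not null
    );
    -- made-up sample data: two days in week 100, one each in 101 and 102
    insert into dimDate values (1, 2010, 100), (2, 2010, 100),
                               (3, 2010, 101), (4, 2010, 102);
    insert into factSale values (1, 10), (2, 15), (3, 40), (4, 30);
""")

# The body of the CTE q_00, kept as text so it can be inlined twice.
weekly_totals = """
    select d.WeekNumberInEpoch, sum(f.SalesAmount) as Amount
    from factSale as f
    join dimDate as d on d.DateKey = f.DateKey
    where d.CalendarYear = 2010
    group by d.WeekNumberInEpoch
"""

# Same join as the CTE version, with q_00 replaced by derived tables.
query = f"""
    select
          a.WeekNumberInEpoch
        , a.Amount as ThisWeekSales
        , b.Amount as LastWeekSales
        , a.Amount - b.Amount as Difference
    from ({weekly_totals}) as a
    join ({weekly_totals}) as b
      on b.WeekNumberInEpoch = a.WeekNumberInEpoch - 1
    order by a.WeekNumberInEpoch desc
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The derived table is evaluated once per reference, so on large fact tables a view or a temporary table holding the weekly totals may be the better trade-off.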
-- for day of week to day of previous week
-- monday to monday, tuesday to tuesday, ...
with
q_00 as (
    select
          WeekNumberInEpoch
        , DayOfWeek
        , sum(SalesAmount) as Amount
    from factSale as f
    join dimDate as d on d.DateKey = f.DateKey
    where CalendarYear = 2010
    group by WeekNumberInEpoch, DayOfWeek
)
select
      a.WeekNumberInEpoch
    , a.DayOfWeek
    , a.Amount as ThisWeekSales
    , b.Amount as LastWeekSales
    , a.Amount - b.Amount as Difference
from q_00 as a
join q_00 as b on (b.WeekNumberInEpoch = a.WeekNumberInEpoch - 1
               and b.DayOfWeek = a.DayOfWeek)
order by a.WeekNumberInEpoch desc, a.DayOfWeek ;
-- Sliding by day and day difference (= 7)
with
q_00 as (
    select
          DayNumberInEpoch
        , FullDate
        , DayOfWeek
        , sum(SalesAmount) as Amount
    from factSale as f
    join dimDate as d on d.DateKey = f.DateKey
    where CalendarYear = 2010
    group by DayNumberInEpoch, FullDate, DayOfWeek
)
select
      a.FullDate as ThisDay
    , a.DayOfWeek as ThisDayName
    , a.Amount as ThisDaySales
    , b.FullDate as PreviousPeriodDay
    , b.DayOfWeek as PreviousDayName
    , b.Amount as PreviousPeriodDaySales
    , a.Amount - b.Amount as Difference
from q_00 as a
join q_00 as b on b.DayNumberInEpoch = a.DayNumberInEpoch - 7
order by a.FullDate desc ;
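The queries above cover the period-over-period change. For the second request in the question, the change in that change, one possible approach is to stack a second CTE on top of the weekly differences and self-join it the same way. This is only a sketch: the CTE name q_01, the ChangeOfChange column, the cut-down tables, and the sample rows are all assumptions introduced for illustration, and it is run through Python's sqlite3 module just so it can be tried.

```python
import sqlite3

# Hypothetical cut-down factSale / dimDate tables with made-up rows:
# one sales day per week for weeks 100..103.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table dimDate (
        DateKey           integer primary key,
        CalendarYear      integer not null,
        WeekNumberInEpoch integer not null
    );
    create table factSale (
        DateKey     integer not null,
        SalesAmount integer not null
    );
    insert into dimDate values (1, 2010, 100), (2, 2010, 101),
                               (3, 2010, 102), (4, 2010, 103);
    insert into factSale values (1, 25), (2, 40), (3, 30), (4, 50);
""")

query = """
    with
    q_00 as (   -- weekly totals, as in the queries above
        select d.WeekNumberInEpoch, sum(f.SalesAmount) as Amount
        from factSale as f
        join dimDate as d on d.DateKey = f.DateKey
        where d.CalendarYear = 2010
        group by d.WeekNumberInEpoch
    ),
    q_01 as (   -- week-over-week difference (assumed helper CTE)
        select a.WeekNumberInEpoch, a.Amount - b.Amount as Difference
        from q_00 as a
        join q_00 as b on b.WeekNumberInEpoch = a.WeekNumberInEpoch - 1
    )
    select
          c.WeekNumberInEpoch
        , c.Difference as ThisWeekChange
        , p.Difference as LastWeekChange
        , c.Difference - p.Difference as ChangeOfChange
    from q_01 as c
    join q_01 as p on p.WeekNumberInEpoch = c.WeekNumberInEpoch - 1
    order by c.WeekNumberInEpoch desc
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because both steps use inner joins, a week only appears if the two preceding weeks also have sales in the filtered range; left joins from dimDate would be needed if gaps should show up as zero or NULL.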
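Answer 1 (score: 2)
I suggest you build a separate dimension table for "time" (one row per day, holding information about the recurring time periods: day, week, month, quarter) so that you can easily join on or select that kind of information.
Your queries for (1.) and (2.) can then be built on top of it.
Yes, most SQL dialects let you derive that information with time/date functions, but doing so is slow(er) and more complicated than using a dimension table.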
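As a rough illustration of that suggestion, the sketch below fills the question's date_dimension table with one row per day and then computes the change between a day's sales and the sales N days earlier (N = 7) by joining through it. Several things here are assumptions rather than anything stated in the thread: that id increases by exactly one per calendar day, that week_part holds the ISO week number, and the sample sales rows. Python's sqlite3 module is used only to make the SQL runnable.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table date_dimension (
        id         integer not null primary key,
        datestamp  date    not null,
        day_part   integer not null,
        week_part  integer not null,
        month_part integer not null,
        qtr_part   integer not null,
        year_part  integer not null
    );
    create table sales_data (
        sales_time date   not null,
        sales_amt  double not null
    );
""")

# One row per day of 2010. Assumption: id is a consecutive day counter,
# so "N days ago" becomes plain integer arithmetic on id.
start = date(2010, 1, 1)
dim_rows = []
for offset in range(365):
    day = start + timedelta(days=offset)
    iso_week = day.isocalendar()[1]  # ISO week; may belong to the previous ISO year in early January
    quarter = (day.month - 1) // 3 + 1
    dim_rows.append((offset + 1, day.isoformat(), day.day, iso_week,
                     day.month, quarter, day.year))
conn.executemany("insert into date_dimension values (?, ?, ?, ?, ?, ?, ?)", dim_rows)

# Made-up sales: two sales on one Monday, one sale on the following Monday.
conn.executemany(
    "insert into sales_data values (?, ?)",
    [("2010-03-01", 100.0), ("2010-03-01", 50.0), ("2010-03-08", 175.0)],
)

n_days = 7
rows = conn.execute("""
    with daily as (
        select d.id, d.datestamp, sum(s.sales_amt) as amount
        from sales_data as s
        join date_dimension as d on d.datestamp = s.sales_time
        group by d.id, d.datestamp
    )
    select
          a.datestamp as this_day
        , a.amount as this_day_sales
        , b.amount as previous_day_sales
        , a.amount - b.amount as difference
    from daily as a
    join daily as b on b.id = a.id - ?
    order by a.datestamp desc
""", (n_days,)).fetchall()

for row in rows:
    print(row)
```

The dimension table turns the date arithmetic into an integer comparison, which is the portability benefit the answer points at: no dialect-specific date functions are needed in the query itself.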