Oracle analytics query

Time: 2015-05-13 05:34:43

Tags: sql oracle analytics

I have these three tables:

create table CUSTOMER(
                id integer not null primary key,
                name varchar(255) not null
);

create table PRODUCT(
                id integer not null primary key,
                name varchar(255)  not null
);

create table INVOICE(
                invoice_number varchar(20) not null primary key,
                invoice_date date not null,
                customer_id integer not null,
                product_id integer not null,
                quantity integer,
                summa numeric(13,2)
);

ALTER TABLE invoice ADD CONSTRAINT FK_invoice_customer FOREIGN KEY (customer_id)
REFERENCES customer(id);
ALTER TABLE invoice ADD CONSTRAINT FK_invoice_product FOREIGN KEY (product_id)
REFERENCES product(id);

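I need to write two queries against these tables:

Query one:

Names of customers who bought an item in 2015/01, 2015/02 and 2015/03 (at least once in each month) but did not buy the same item in 2015/04.

What I tried is this: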

SELECT * 
FROM invoice i, customer c, product p
 WHERE i.customer_id = c.id 
AND i.product_id = p.id
AND invoice_date BETWEEN '01/01/2015' AND '03/01/2015'
MINUS
SELECT * 
FROM invoice i, customer c, product p 
WHERE i.customer_id = c.id 
AND i.product_id = p.id AND invoice_date BETWEEN '01/04/2015' AND '04/30/2015';

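This tries to find all customers who bought an item between 01/01/2015 and 03/01/2015 (the date format used is mm/dd/yyyy) and then removes those who bought an item between 04/01/2015 and 04/30/2015. As far as I can tell this should at least be heading in the right direction, but written like this I cannot check whether a customer bought an item in every month or only once somewhere in the three months.

Query two:

Find customers with similar behaviour: customers who buy the same quantity of the same item in a given period (for example a month). The quantity may differ by 5% (+-5%).

Thanks, everyone.

2 answers:

Answer 0: (score: 1)

This returns customers who bought the same item in each of the three months, but not in the fourth month: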

SELECT *
FROM 
 ( SELECT customer_id, product_id
   FROM invoice
   WHERE invoice_date BETWEEN DATE '2015-01-01' AND DATE '2015-04-30' -- data for 4 months
   GROUP BY customer_id, product_id
   HAVING COUNT(DISTINCT EXTRACT(MONTH FROM invoice_date)) = 3        -- at least one per month
     AND MAX(invoice_date) < DATE '2015-04-01'                        -- none in april 
 ) i 
JOIN customer c 
  ON i.customer_id = c.id
JOIN product p
  ON i.product_id = p.id;

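Based on your comment, this is not the right answer. A customer may have bought any combination of items in the first three months and must not have bought any of those items in the fourth month (other items are fine). This should return the correct answer: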

WITH cte AS 
 ( SELECT customer_id, product_id, 
      -- number of months with buys per customer
      COUNT(DISTINCT EXTRACT(MONTH FROM invoice_date))
      OVER (PARTITION BY customer_id) AS cnt
    FROM invoice
    WHERE invoice_date BETWEEN DATE '2015-01-01' AND DATE '2015-03-31'
 )
SELECT DISTINCT customer_id 
FROM cte
WHERE cnt = 3  -- at least one buy per month
AND NOT EXISTS -- product wasn't bought by customer in april
 ( SELECT * FROM invoice i
   WHERE i.invoice_date BETWEEN DATE '2015-04-01' AND DATE '2015-04-30'
   AND i.customer_id = cte.customer_id
   AND i.product_id = cte.product_id
 )

You could use TRUNC(invoice_date, 'mon') instead of EXTRACT(MONTH FROM invoice_date), but I prefer standard SQL syntax.
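As a rough sketch of the TRUNC variant mentioned above: counting distinct values of TRUNC(invoice_date, 'MM') keeps the year in each value, so it would also work for a period that crosses a year boundary. The half-open date ranges and the join to customer are my own assumptions, not part of the answer. Oracle DATE values carry a time of day, so if invoice_date stores times, BETWEEN ... AND DATE '2015-03-31' would miss invoices made later on March 31.

```sql
WITH cte AS
 ( SELECT customer_id, product_id,
          -- number of distinct month starts with a purchase, per customer
          COUNT(DISTINCT TRUNC(invoice_date, 'MM'))
            OVER (PARTITION BY customer_id) AS cnt
   FROM invoice
   WHERE invoice_date >= DATE '2015-01-01'
     AND invoice_date <  DATE '2015-04-01'   -- assumption: half-open range, includes all of March 31
 )
SELECT DISTINCT c.id, c.name
FROM cte
JOIN customer c
  ON c.id = cte.customer_id
WHERE cte.cnt = 3                            -- at least one purchase per month
  AND NOT EXISTS                             -- this product was not bought again in April
   ( SELECT 1
     FROM invoice i
     WHERE i.invoice_date >= DATE '2015-04-01'
       AND i.invoice_date <  DATE '2015-05-01'
       AND i.customer_id = cte.customer_id
       AND i.product_id  = cte.product_id
   );
```

If invoice_date never carries a time portion, the BETWEEN form in the answer gives the same rows.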

This returns your second result:

WITH cte AS
 ( -- data from one month
   SELECT *
   FROM invoice 
   WHERE invoice_date BETWEEN DATE '2015-02-01' AND DATE '2015-02-28'
 )
SELECT DISTINCT t1.customer_id, t2.customer_id, t1.product_id -- need DISTINCT because there might be multiple rows per product/customer
FROM cte t1 JOIN cte t2
  ON t1.product_id = t2.product_id    -- same product
  AND t1.customer_id < t2.customer_id -- different customers
WHERE t1.quantity BETWEEN t2.quantity / 1.05 AND t2.quantity * 1.05

You need to join this result back to customer and product to get more details.
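A minimal sketch of that join, under one assumption that goes beyond the answer: it compares each customer's total quantity per product for the month (SUM(quantity)) rather than individual invoice rows, which is one way to read "the same quantity in a given period". The totals CTE name and the column aliases are illustrative only.

```sql
WITH totals AS
 ( -- one row per customer and product: total quantity bought in February 2015
   SELECT customer_id, product_id, SUM(quantity) AS qty
   FROM invoice
   WHERE invoice_date >= DATE '2015-02-01'
     AND invoice_date <  DATE '2015-03-01'
   GROUP BY customer_id, product_id
 )
SELECT c1.name AS customer_1,
       c2.name AS customer_2,
       p.name  AS product,
       t1.qty  AS qty_1,
       t2.qty  AS qty_2
FROM totals t1
JOIN totals t2
  ON  t1.product_id  = t2.product_id      -- same product
  AND t1.customer_id < t2.customer_id     -- different customers, each pair listed once
JOIN customer c1 ON c1.id = t1.customer_id
JOIN customer c2 ON c2.id = t2.customer_id
JOIN product  p  ON p.id  = t1.product_id
WHERE t1.qty BETWEEN t2.qty / 1.05 AND t2.qty * 1.05;
```

Because each customer and product pair appears only once in totals, DISTINCT is no longer needed here.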

Answer 1: (score: 0)

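The thing to know is that we can apply TRUNC() to Oracle dates; depending on the format mask, we get the first day of the year or the first day of the month. This query uses both tricks to generate a list of months and join it to the invoices. It also uses a cross join with the customers to produce a customer-by-month matrix.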
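To make the two tricks concrete, here is a small sketch. The fixed date literals are my substitution: the answer's queries anchor on trunc(sysdate, 'yyyy'), which yields the 2015 months only when run during 2015.

```sql
-- TRUNC with a format mask: first day of the year and of the month
SELECT TRUNC(DATE '2015-05-13', 'YYYY') AS year_start,   -- 1 January 2015
       TRUNC(DATE '2015-05-13', 'MM')   AS month_start   -- 1 May 2015
FROM dual;

-- row generator: four consecutive month starts, January to April 2015
SELECT ADD_MONTHS(DATE '2015-01-01', LEVEL - 1) AS mm
FROM dual
CONNECT BY LEVEL <= 4;
```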

Now we know whether each customer bought something in each month:

select months.mm
       , c.id as customer_id
       , nvl2(max(i.invoice_number), 'Y', 'N')  as bought_something
from ( select add_months(trunc(sysdate, 'yyyy'), level-1) as mm
                 from dual
                 connect by level <= 4 ) months
     cross join customer c
     left outer join invoice i
     on months.mm = trunc(i.invoice_date, 'MM')
        and c.id = i.customer_id
group by months.mm, c.id 

We can feed this result into another query:

with mtrx as (
    select months.mm
           , c.id as customer_id
           , nvl2(max(i.invoice_number), 'Y', 'N')  as bought_something
    from ( select add_months(trunc(sysdate, 'yyyy'), level-1) as mm
                     from dual
                     connect by level <= 4 ) months
         cross join customer c
         left outer join invoice i
         on months.mm = trunc(i.invoice_date, 'MM')
            and c.id = i.customer_id
    group by months.mm, c.id 
    ) 
select customer_id from mtrx where mm = date '2015-01-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-02-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-03-01' and bought_something = 'Y'
intersect
select customer_id from mtrx where mm = date '2015-04-01' and bought_something = 'N'
;

This may not be the "analytics" solution your professor was expecting, but it does produce the right result. Find the obligatory SQL Fiddle here.

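Adjusting the final result to get the customer names is left as an exercise for the reader. Likewise refining the result set to the same item in each month :)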
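One possible take on that exercise, offered as a sketch rather than as this answer's own solution: group the four months of invoices by customer and product, then use conditional aggregation to require a purchase in each of January, February and March and none in April. The half-open date bounds are an assumption, chosen so that invoices with a time of day on the last day of a month are still counted.

```sql
SELECT c.id, c.name
FROM customer c
WHERE c.id IN
 ( SELECT i.customer_id
   FROM invoice i
   WHERE i.invoice_date >= DATE '2015-01-01'
     AND i.invoice_date <  DATE '2015-05-01'
   GROUP BY i.customer_id, i.product_id
          -- this product was bought in all three of Jan, Feb, Mar
   HAVING COUNT(DISTINCT CASE WHEN i.invoice_date < DATE '2015-04-01'
                              THEN TRUNC(i.invoice_date, 'MM') END) = 3
          -- and not at all in April
      AND COUNT(CASE WHEN i.invoice_date >= DATE '2015-04-01'
                     THEN 1 END) = 0
 );
```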