These two queries should give me the same result - SQL

Time: 2019-03-07 23:14:27

Tags: sql oracle-sqldeveloper

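**Hello, I am trying to find the number of returning and new customers in a given time period. I have two queries: one that only finds new and returning customers, and another that is the same query but splits the data by age group and gender. Technically, both queries should give me the same totals, but their totals differ. The queries are below; can someone explain what the actual problem is?

The totals in the sample database I created also do not match those in my actual database.**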

I also have a sample database that you can refer to if you need sample data.

https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=be7e1aec30f03edeb0cce246ff05721f

Here is the query that only gets new and returning customers:

New customers

SELECT
    DECODE(is_new, 1, 'New Customers', 'Returning Customers') type_of_customer,
    COUNT(distinct individual_id) count_of_customers,
    SUM(count_of_transactions) count_of_transactions,
    SUM(sum_of_quantity) sum_of_quantity
FROM (
    SELECT
    individual_id,
    SUM(dollar_value_us),
    sum(quantity) sum_of_quantity,
    count(distinct transaction_number) count_of_transactions,
    CASE WHEN MIN(txn_date) = min_txn_date THEN 1 ELSE 0 END is_new
FROM (
    SELECT 
        individual_id, 
        dollar_value_us,
        txn_date,
        quantity,
        transaction_number,
        MIN(txn_date) OVER(PARTITION BY individual_id) min_txn_date          
    FROM transaction_detail_mv   
    WHERE 
        brand_org_code = 'BRAND'
        AND is_merch = 1
        AND currency_code = 'USD'
        AND line_item_amt_type_cd = 'S'
)
WHERE 
    txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY') 
    AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
GROUP BY
    individual_id,
    min_txn_date
    )
x GROUP BY is_new

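Below are the queries that split the same data by age group and gender. Here, however, I have two separate queries for new and returning customers.

For new customers, the query is: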

select gender, 
case when age < 18 then '<18'
when age between 18 and 24 then '18-24'
when age between 25 and 32 then '25-32'
when age between 33 and 39 then '35-39'
when age between 40 and 46 then '40-46'
when age between 47 and 53 then '46-52'
when age between 54 and 60 then '53-58'
when age > 60 then '61+' end as AgeGroup
, count(distinct individual_id) indiv
, count (distinct transaction_number) txn_count
, sum(dollar_value_us) as Spend
, sum(quantity) Qty

from (SELECT 
        a.individual_id, 
        a.dollar_value_us,
        a.txn_date,
        a.quantity,
        a.transaction_number,
        b.gender,
        b.age,
        MIN(txn_date) OVER(PARTITION BY a.individual_id) min_txn_date          
    FROM transaction_detail_mv   a
    join gender_details b on a.individual_id = b.individual_id
    WHERE 
        a.brand_org_code = 'BRAND'
        AND a.is_merch = 1
        AND a.currency_code = 'USD'
        AND a.line_item_amt_type_cd = 'S')

where txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY') 
    AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
    AND min_txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
    AND min_txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')


group by gender, 
case when age < 18 then '<18'
when age between 18 and 24 then '18-24'
when age between 25 and 32 then '25-32'
when age between 33 and 39 then '35-39'
when age between 40 and 46 then '40-46'
when age between 47 and 53 then '46-52'
when age between 54 and 60 then '53-58'
when age > 60 then '61+' end

Returning customers:

select gender, 
case when age < 18 then '<18'
when age between 18 and 24 then '18-24'
when age between 25 and 32 then '25-32'
when age between 33 and 39 then '35-39'
when age between 40 and 46 then '40-46'
when age between 47 and 53 then '46-52'
when age between 54 and 60 then '53-58'
when age > 60 then '61+' end as AgeGroup
, count(distinct individual_id) indiv
, count (distinct transaction_number) txn_count
, sum(dollar_value_us) as Spend
, sum(quantity) Qty

from (SELECT 
        a.individual_id, 
        a.dollar_value_us,
        a.txn_date,
        a.quantity,
        a.transaction_number,
        b.gender,
        b.age,
        MIN(txn_date) OVER(PARTITION BY a.individual_id) min_txn_date          
    FROM transaction_detail_mv   a
    join gender_details b on a.individual_id = b.individual_id
    WHERE 
        a.brand_org_code = 'BRAND'
        AND a.is_merch = 1
        AND a.currency_code = 'USD'
        AND a.line_item_amt_type_cd = 'S')

where txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY') 
    AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
    AND min_txn_date <TO_DATE('10-02-2019', 'DD-MM-YYYY')


group by gender, 
case when age < 18 then '<18'
when age between 18 and 24 then '18-24'
when age between 25 and 32 then '25-32'
when age between 33 and 39 then '35-39'
when age between 40 and 46 then '40-46'
when age between 47 and 53 then '46-52'
when age between 54 and 60 then '53-58'
when age > 60 then '61+' end

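1 answer:

Answer 0 (score: 2)

Make sure you compare whole dates in the WHERE clause. If time-of-day values have crept into the data, the results can be inconsistent.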
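
One way to check whether time-of-day values are actually present is to count the rows where `txn_date` differs from its truncated value. This is only a sketch: it assumes `txn_date` is an Oracle `DATE` (or `TIMESTAMP`) column and reuses the filters from the question; the alias `rows_with_time` is made up for illustration.

```sql
-- Count filtered rows whose txn_date carries a time-of-day part.
-- A result of 0 suggests every stored value is already at midnight.
SELECT COUNT(*) AS rows_with_time
FROM transaction_detail_mv
WHERE brand_org_code = 'BRAND'
  AND is_merch = 1
  AND currency_code = 'USD'
  AND line_item_amt_type_cd = 'S'
  AND txn_date <> TRUNC(txn_date);
```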

Update your WHERE clause so that the new-customer query uses:

where TRUNC(txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
    AND TRUNC(txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
    AND TRUNC(min_txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
    AND TRUNC(min_txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')

And the returning-customer query uses:

where TRUNC(txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
    AND TRUNC(txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
    AND TRUNC(min_txn_date) <TO_DATE('10-02-2019', 'DD-MM-YYYY')
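
Separately from the date comparison, the breakdown queries add an inner join to `gender_details` that the first query does not have, so it may be worth checking whether that join changes the row set. The sketch below lists customers that have no matching `gender_details` row or more than one. Whether this explains the difference in your data is an assumption to verify; the alias `match_count` is hypothetical.

```sql
-- For each customer with qualifying transactions, count matching gender_details rows.
-- match_count = 0: the inner join drops this customer from the breakdown queries.
-- match_count > 1: the inner join repeats each transaction row, inflating the SUMs.
SELECT t.individual_id,
       COUNT(g.individual_id) AS match_count
FROM (
        SELECT DISTINCT individual_id
        FROM transaction_detail_mv
        WHERE brand_org_code = 'BRAND'
          AND is_merch = 1
          AND currency_code = 'USD'
          AND line_item_amt_type_cd = 'S'
     ) t
LEFT JOIN gender_details g ON g.individual_id = t.individual_id
GROUP BY t.individual_id
HAVING COUNT(g.individual_id) <> 1;
```

If this returns no rows, every customer matches exactly one `gender_details` row and the join is unlikely to be the source of the mismatch.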

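I would also suggest restructuring the queries so that row-based logic is not mixed with set-based logic. In other words, do the row-by-row processing in a subquery / WITH clause, and then aggregate. This makes the SQL easier to understand and maintain.

Example 1 - new customers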

SELECT
t.gender,
t.AgeGroup,
count(distinct t.individual_id) as indiv,
count(distinct t.transaction_number) as txn_count,
sum(t.dollar_value_us) as Spend,
sum(t.quantity) as Qty
from (
        SELECT
        a.individual_id,
        a.dollar_value_us,
        a.txn_date,
        a.quantity,
        a.transaction_number,
        b.gender,
        b.age,
        case
            when b.age < 18 then '<18'
            when b.age between 18 and 24 then '18-24'
            when b.age between 25 and 32 then '25-32'
            when b.age between 33 and 39 then '33-39'
            when b.age between 40 and 46 then '40-46'
            when b.age between 47 and 53 then '47-53'
            when b.age between 54 and 60 then '54-60'
            when b.age > 60 then '61+'
        end as AgeGroup,
        MIN(a.txn_date) OVER (PARTITION BY a.individual_id) as min_txn_date
        FROM transaction_detail_mv a
        inner join gender_details b on b.individual_id = a.individual_id
        WHERE a.brand_org_code = 'BRAND'
        AND a.is_merch = 1
        AND a.currency_code = 'USD'
        AND a.line_item_amt_type_cd = 'S'
    ) t
where TRUNC(t.txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND TRUNC(t.txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
AND TRUNC(t.min_txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND TRUNC(t.min_txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
group by t.gender,
         t.AgeGroup;

Example 2 - returning customers

select
t.gender,
t.AgeGroup,
count(distinct individual_id) as indiv,
count (distinct transaction_number) as txn_count,
sum(dollar_value_us) as Spend,
sum(quantity) as Qty
from (
        SELECT
        a.individual_id,
        a.dollar_value_us,
        a.txn_date,
        a.quantity,
        a.transaction_number,
        b.gender,
        b.age,
        case
            when b.age < 18 then '<18'
            when b.age between 18 and 24 then '18-24'
            when b.age between 25 and 32 then '25-32'
            when b.age between 33 and 39 then '33-39'
            when b.age between 40 and 46 then '40-46'
            when b.age between 47 and 53 then '47-53'
            when b.age between 54 and 60 then '54-60'
            when b.age > 60 then '61+'
        end as AgeGroup,
        MIN(a.txn_date) OVER (PARTITION BY a.individual_id) as min_txn_date
        FROM transaction_detail_mv a
        inner join gender_details b on b.individual_id = a.individual_id
        WHERE a.brand_org_code = 'BRAND'
        AND a.is_merch = 1
        AND a.currency_code = 'USD'
        AND a.line_item_amt_type_cd = 'S'
    ) t
where TRUNC(t.txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND TRUNC(t.txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
AND TRUNC(t.min_txn_date) <TO_DATE('10-02-2019', 'DD-MM-YYYY')
group by t.gender,
         t.AgeGroup;
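
The same idea can also be written with a WITH clause, and both customer types can be produced by one query so that the totals are easier to compare against the first query in the question. This is a rough sketch rather than a tested solution: the names `txn`, `in_window` and `type_of_customer` are chosen for illustration, and the age banding is left out for brevity (the AgeGroup expression from the examples above could be added in the first step).

```sql
WITH txn AS (
    -- Row-level step: filter, join, and compute each customer's first transaction date.
    SELECT
        a.individual_id,
        a.dollar_value_us,
        a.txn_date,
        a.quantity,
        a.transaction_number,
        b.gender,
        MIN(a.txn_date) OVER (PARTITION BY a.individual_id) AS min_txn_date
    FROM transaction_detail_mv a
    INNER JOIN gender_details b ON b.individual_id = a.individual_id
    WHERE a.brand_org_code = 'BRAND'
      AND a.is_merch = 1
      AND a.currency_code = 'USD'
      AND a.line_item_amt_type_cd = 'S'
),
in_window AS (
    -- Row-level step: keep rows inside the reporting week and label the customer.
    -- Every kept row has min_txn_date before 17-02-2019, so one comparison is enough.
    SELECT
        txn.*,
        CASE
            WHEN TRUNC(min_txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
            THEN 'New Customers'
            ELSE 'Returning Customers'
        END AS type_of_customer
    FROM txn
    WHERE TRUNC(txn_date) >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
      AND TRUNC(txn_date) < TO_DATE('17-02-2019', 'DD-MM-YYYY')
)
-- Set-level step: aggregate only after the row-level work is done.
SELECT
    type_of_customer,
    gender,
    COUNT(DISTINCT individual_id) AS indiv,
    COUNT(DISTINCT transaction_number) AS txn_count,
    SUM(dollar_value_us) AS Spend,
    SUM(quantity) AS Qty
FROM in_window
GROUP BY type_of_customer,
         gender;
```

Summing `indiv` per `type_of_customer` gives figures that can be set side by side with `count_of_customers` from the first query.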