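Select merchants with a monthly count of at least 4 from daily data

Asked: 2018-09-23 05:55:01

Tags: sql snowflake-datawarehouse

I have a table of merchants' daily transactions. From it, I am trying to compute the monthly number of merchants who made at least 4 transactions per month over the past two years, along with the total transaction amount they processed.

My query is as follows: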

SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
FROM
(
  SELECT
    DATE_TRUNC('month', transactions.payment_date) AS month,
    merchants,
    COUNT(DISTINCT payment_id) AS volume,
    SUM(transactions.payment_amount) AS amount
  FROM transactions
  WHERE transactions.date >= NOW() - INTERVAL '2 years'
  GROUP BY 1, 2
) AS trx
WHERE trx.volume >= 4

My question: will this query pull the correct data? If so, is this the most efficient way to write it, or can its performance be improved?

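2 answers:

Answer 0: (score: 1)

First, we have to think about the time range. You say you want at least four transactions per month over the past 24 months. But when the query runs on, say, October 10, 2018, you certainly don't want to apply that rule to October 2018, which is still incomplete. Nor do you want to look at only the last twenty-odd days of October 2016. We want to look at the whole of October 2016 through the whole of September 2018.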
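To make that window concrete, here is a small sketch that evaluates the boundary expressions for a fixed run date. The literal '2018-10-10' is a stand-in for current_date, chosen only to match the example in the text.

```sql
-- Sketch: evaluate the reporting window for a fixed, hypothetical run date.
-- '2018-10-10' stands in for current_date so the result is reproducible.
select
  date_trunc('month', '2018-10-10'::date) - interval '24 months' as range_start, -- 2016-10-01
  date_trunc('month', '2018-10-10'::date) - interval '1 day'     as range_end;   -- 2018-09-30
```

The window therefore covers exactly 24 complete calendar months, whatever day of the month the query runs on.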

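Next, we want to make sure that a merchant had at least four transactions in every month. In other words: they have transactions in each of the 24 months, and their lowest monthly transaction count is four or more. We can check this by running window functions over the monthly aggregates.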

select merchants, month, volume, amount
from
(
  select
    merchants,
    date_trunc('month', payment_date) as month, 
    count(distinct payment_id) as volume,
    sum(payment_amount) as amount,
    count(*) over (partition by merchants) number_of_months,
    min(count(distinct payment_id)) over (partition by merchants) min_volume
  from transactions
  where date between date_trunc('month', current_date) - interval '24 months'
                 and date_trunc('month', current_date) - interval '1 days'
  group by merchants, date_trunc('month', payment_date)
) monthly
where number_of_months = 24
  and min_volume >= 4
order by merchants, month;

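This gives you the list of merchants that meet the requirement, together with their monthly data. If you want the number of merchants instead, aggregate. For example: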

select count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4;

select month, count(distinct merchants), sum(amount) as total
from (...) monthly
where number_of_months = 24 and min_volume >= 4
group by month
order by month;
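One caveat about the date filter: if the filtered column is a timestamp rather than a date, a BETWEEN whose upper bound is the last day of the month stops at midnight and drops the rest of that day. A half-open range avoids this. The following is only a sketch of that variant. It assumes that payment_date is the column to filter on (the queries in this thread filter on a column named date, so adjust the name to your schema).

```sql
-- Sketch: same monthly check, but with a half-open date range.
-- Assumption: payment_date is the column to filter on; it may be a DATE or a TIMESTAMP.
select merchants, month, volume, amount
from
(
  select
    merchants,
    date_trunc('month', payment_date) as month,
    count(distinct payment_id) as volume,
    sum(payment_amount) as amount,
    -- one row per merchant and month after grouping, so this counts active months
    count(*) over (partition by merchants) as number_of_months,
    -- lowest monthly transaction count for the merchant
    min(count(distinct payment_id)) over (partition by merchants) as min_volume
  from transactions
  where payment_date >= dateadd(month, -24, date_trunc('month', current_date))
    and payment_date <  date_trunc('month', current_date)
  group by merchants, date_trunc('month', payment_date)
) monthly
where number_of_months = 24
  and min_volume >= 4
order by merchants, month;
```

The lower bound is inclusive and the upper bound is exclusive, so every moment of the last complete month is included and nothing from the current month leaks in.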

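Answer 1: (score: 0)

To get just the list of merchants, you can filter on the aggregated results: the number of distinct payment_id values and the number of distinct months.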

SELECT merchants
FROM transactions
WHERE transactions.date >= NOW() - INTERVAL '2 years'
GROUP BY merchants
having count(distinct DATE_TRUNC('month', transactions.payment_date))  =24
  and COUNT(DISTINCT payment_id) >= 4

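For your updated question, just a suggestion:

You could join your query with the query above, which returns the merchants that have transactions in every month of the two years, and use the filtered result to aggregate directly in the subquery.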
    SELECT trx.month, COUNT(trx.merchants), SUM(trx.amount)
    FROM (
        SELECT DATE_TRUNC('month', transactions.payment_date) AS month
            , transactions.merchants
            , COUNT(DISTINCT payment_id) AS volume
            , SUM(transactions.payment_amount) AS amount
        FROM transactions
        INNER JOIN (
            -- merchants active in all 24 months
            SELECT merchants
            FROM transactions
            WHERE transactions.date >= NOW() - INTERVAL '2 years'
            GROUP BY merchants
            HAVING COUNT(DISTINCT DATE_TRUNC('month', transactions.payment_date)) = 24
               AND COUNT(DISTINCT payment_id) >= 4
        ) A ON A.merchants = transactions.merchants
        WHERE transactions.date >= NOW() - INTERVAL '2 years'
        GROUP BY 1, 2
        -- keep only merchant-months with at least 4 transactions
        HAVING COUNT(DISTINCT payment_id) >= 4
    ) AS trx
    -- the outer aggregates need a grouping column
    GROUP BY trx.month