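I am using PostgreSQL 9.3.9 and have a procedure called list_all_upsells that takes the beginning of a month and the end of a month. (See sqlfiddle.com/#!15/abd02 for sample data.) For example, the following code lists the count of upsell accounts for October: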
select COUNT(up.*) as "Total Upsell Accounts in October" from
list_all_upsells('2015-10-01 00:00:00'::timestamp, '2015-10-31 23:59:59'::timestamp) as up
where up.user_id not in
(select distinct user_id from paid_users_no_more
where concat(extract(month from payment_stop_date),'-',extract(year from payment_stop_date))<>
concat(extract(month from payment_start_date),'-',extract(year from payment_start_date)));
The list_all_upsells procedure looks like this:
DECLARE
payor_email_2 text;
BEGIN
FOR payor_email_2 in select distinct payor_email from paid_users LOOP
return query
execute
'select paid_users.* from paid_users,
(
select payment_start_date as first_time from paid_users
where payor_email = $3
order by payment_start_date limit 1
) as dummy
where payor_email = $3
and payment_start_date > first_time
and payment_start_date between $1 and $2
and first_time < $1'
using a, b, payor_email_2;
END LOOP;
return;
END
I would like to be able to run this for every month we have on record and get the data in a single table, like this:
Month | Total Upselled Accounts
---------------------------------
08/2014 | 23
09/2014 | 35
ETC...
10/2015 | 56
I have queries that grab the first day and the last day of every month we have been in business. The first of the month:
select distinct date_trunc('month', payment_start_date)::date as startmonth
from paid_users ORDER BY startmonth;
And the last of the month:
SELECT distinct (date_trunc('MONTH', payment_start_date) +
INTERVAL '1 MONTH - 1 day')::date as endmonth from paid_users
ORDER BY endmonth;
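Now, how would I create a function that loops through list_all_upsells and gets the count for each of those months? That is, the first query (startmonth) gives me 2014-03-01, 2014-04-01, ... up to 2015-10-01, and the second query (endmonth) gives me 2014-03-31, 2014-04-30, ... up to 2015-10-31. I want to run list_all_upsells for each of those months, so that I get a total count per month of how many accounts we have.
My paid_users table looks like this: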
CREATE TABLE paid_users
(
user_id integer,
user_email character varying(255),
payor_id integer,
payor_email character varying(255),
payment_start_date timestamp without time zone DEFAULT now()
)
And paid_users_no_more:
CREATE TABLE paid_users_no_more
(
user_id integer,
payment_stop_date timestamp without time zone DEFAULT now()
)
Answer 0 (score: 3)
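Your function has several issues, so let's start there. In short: (1) you need only a single parameter to indicate the month, and working with both the start and the end of the month is asking for problems; (2) you do not need a dynamic query, because you are not changing identifiers (table or column names); (3) you do not need a loop; (4) your logic is off. I could also mention that PostgreSQL uses functions, and that they all start with a line like CREATE FUNCTION list_all_upsells(...), but that would be nitpicking.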
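Starting with the logic: apparently a user, identified by an email address, takes out a subscription from some payment_start_date until some payment_stop_date and can do so multiple times. You are looking for users who bought their first subscription before the month in question and who start a new subscription, not their first, in the month in question. In that case the filter payment_start_date > first_time is useless, because you already require the first subscription to be before the month in question (first_time < $1) and the new subscription to fall inside it (payment_start_date BETWEEN $1 AND $2).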
Points (1), (2) and (3) become really obvious when you rewrite the query inside the function:
CREATE FUNCTION list_all_upsells(timestamp) RETURNS SETOF paid_users AS $$
SELECT paid_users.*
FROM paid_users
JOIN ( -- This JOIN keeps only those rows where the payor_email has a prior subscription
SELECT DISTINCT payor_email,
first_value(payment_start_date) OVER (PARTITION BY payor_email ORDER BY payment_start_date) AS dummy
FROM paid_users
WHERE payment_start_date < date_trunc('month', $1)
) dummy USING (payor_email)
-- This filter keeps only those rows with new subscriptions in the month
WHERE date_trunc('month', payment_start_date) = date_trunc('month', $1)
$$ LANGUAGE sql STRICT;
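Since the function body has been reduced to a single SQL statement, the function is now a sql language function, which is more efficient than plpgsql. You now supply just one parameter, which can be any instant in the month you want data for, so list_all_upsells(LOCALTIMESTAMP) gives you the results for the current month. In terms of the query you posted, it would be: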
SELECT count(up.*) AS "Total Upsell Accounts in October"
FROM list_all_upsells(LOCALTIMESTAMP) up
WHERE up.user_id NOT IN
(SELECT DISTINCT user_id FROM paid_users_no_more
WHERE date_trunc('month', payment_stop_date) <>
date_trunc('month', up.payment_start_date)
);
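For a past month, any timestamp inside that month works as the argument. The sketch below counts October 2015 explicitly. It also swaps NOT IN for NOT EXISTS, which is my own substitution rather than part of the query above: it assumes paid_users_no_more.user_id might contain NULLs, in which case a NOT IN predicate can never be true and the count silently drops to zero. The alias n is illustrative only.

```sql
-- Count upsells for October 2015; any timestamp inside the month selects it.
SELECT count(*) AS "Total Upsell Accounts in October"
FROM list_all_upsells('2015-10-15'::timestamp) AS up
WHERE NOT EXISTS (
    -- Assumption: same exclusion rule as above, expressed as an anti-join
    -- so that NULL user_id values in paid_users_no_more cannot hide rows.
    SELECT 1
    FROM paid_users_no_more AS n
    WHERE n.user_id = up.user_id
      AND date_trunc('month', n.payment_stop_date) <>
          date_trunc('month', up.payment_start_date)
);
```

The two forms differ only in how NULL user_id values are handled, so on data without NULLs they should return the same count.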
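By the way, this does raise the question of why you have the table paid_users_no_more at all. Why not simply add a column payment_stop_date to the table paid_users? Users for whom that column is NULL are still subscribed. But the whole query is rather odd, because list_all_upsells() returns new subscriptions within the month, so why bother with subscriptions that were cancelled at some other time?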
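A minimal sketch of that single-table layout, under a loud assumption: it treats user_id as identifying the subscription row, which the schema shown does not guarantee. If one user_id has several rows in paid_users, every one of them would receive the same stop date, so the backfill step would need a finer key.

```sql
-- Add the stop date directly to paid_users; NULL means "still subscribed".
ALTER TABLE paid_users
    ADD COLUMN payment_stop_date timestamp without time zone;

-- Assumption: backfill by user_id from the separate table.
UPDATE paid_users AS pu
SET payment_stop_date = n.payment_stop_date
FROM paid_users_no_more AS n
WHERE n.user_id = pu.user_id;

-- Active subscriptions are then simply the rows without a stop date.
SELECT *
FROM paid_users
WHERE payment_stop_date IS NULL;
```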
Now on to your real question:
SELECT months.m "Month", coalesce(count(up.*), 0) "Total Upselled Accounts"
FROM generate_series('2014-08-01'::timestamp,
date_trunc('month', LOCALTIMESTAMP),
'1 month') AS months(m)
LEFT JOIN list_all_upsells(months.m) AS up ON date_trunc('month', payment_start_date) = m
GROUP BY 1
ORDER BY 1;
This generates a series of months from some starting month up to the current month, then counts the new subscriptions for each month, which may be 0.
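To get the MM/YYYY labels from the layout in the question, the month can be formatted with to_char. This is a sketch rather than a tested drop-in: it groups and sorts on the underlying timestamp (sorting on the formatted text would put 01/2015 before 08/2014), and it joins ON true on the assumption that list_all_upsells already returns only rows for the month it is given.

```sql
SELECT to_char(months.m, 'MM/YYYY') AS "Month",
       count(up.*)                  AS "Total Upselled Accounts"
FROM generate_series('2014-08-01'::timestamp,
                     date_trunc('month', LOCALTIMESTAMP),
                     '1 month') AS months(m)
-- The function is called once per generated month (implicit LATERAL);
-- months with no upsells survive the LEFT JOIN and count as 0.
LEFT JOIN list_all_upsells(months.m) AS up ON true
GROUP BY months.m
ORDER BY months.m;
```

count never returns NULL, so no coalesce is needed around it here.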