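I am running two different queries on PostgreSQL 9.3.9. They return different results, but both are grouped by "month - year". How can I write a single query that gives me the same data in one table?
Query 1: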
SELECT CONCAT(EXTRACT(MONTH FROM startedPayingDate), '-',
EXTRACT(YEAR FROM startedPayingDate)) AS "Month",
COUNT(*) AS "Total AB Paying Customers"
FROM (
SELECT cm.customer_id, MIN(cm.created_at) AS startedPayingDate
FROM customerusermap AS cm, users as u
WHERE cm.customer_id = u.customer_id AND cm.user_id<>u.id
GROUP BY cm.customer_id ) AS t
GROUP BY 1, EXTRACT(MONTH FROM startedPayingDate), EXTRACT(YEAR FROM startedPayingDate)
ORDER BY EXTRACT(YEAR FROM startedPayingDate), EXTRACT(MONTH FROM startedPayingDate);
The result is:
Month | Total AB Paying Customers
---------------------------------
3-2014 | 2
4-2014 | 4
Query 2:
SELECT concat(extract(MONTH from u.created_at),'-',extract(year from u.created_at)) as "Month",
count(u.email) as "Total SMB Paying Customers"
FROM customerusermap AS cm, users AS u
WHERE cm.customer_id = u.customer_id AND cm.user_id = u.id AND u.paid_status = 'paying'
GROUP by 1,extract(month from u.created_at),extract(year from u.created_at)
order by extract(year from u.created_at),extract(month from u.created_at);
The result is:
Month | Total SMB Paying Customers
-----------------------------------
2-2014 | 3
3-2014 | 8
4-2014 | 5
I would like to combine the two queries into a result like the one shown below, sorted by year and month (oldest to newest):
Month | Total AB Paying Customers | Total SMB Paying Customers | Total | Cumulative
-------------------------------------------------------------------------------------
2-2014 | 0 | 3 | 3 | 3
3-2014 | 2 | 8 | 10 | 13
4-2014 | 4 | 5 | 9 | 22
The users table looks like this:
CREATE TABLE users (
id serial NOT NULL,
firstname character varying(255) NOT NULL,
lastname character varying(255) NOT NULL,
email character varying(255) NOT NULL,
created_at timestamp without time zone NOT NULL DEFAULT now(),
customer_id character varying(255) DEFAULT NULL::character varying,
companyname character varying(255),
primary_user_id integer,
paid_status character varying(255), -- updated from comment
CONSTRAINT users_pkey PRIMARY KEY (id),
CONSTRAINT primary_user_id_fk FOREIGN KEY (primary_user_id) REFERENCES users (id),
CONSTRAINT users_uuid_key UNIQUE (uuid)
);
And the customerusermap table looks like this:
CREATE TABLE customerusermap (
id serial NOT NULL,
user_id integer NOT NULL,
customer_id character varying(255) NOT NULL,
created_at timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT customerusermap_pkey PRIMARY KEY (id),
CONSTRAINT customerusermap_user_id_fkey FOREIGN KEY (user_id) REFERENCES users (id),
CONSTRAINT customerusermap_user_id_key UNIQUE (user_id)
);
Answer 0 (score: 1)
The key feature is a FULL OUTER JOIN, but take care to handle NULL values properly:
SELECT *
, "Total AB Paying Customers" + "Total SMB Paying Customers" AS "Total"
, sum("Total AB Paying Customers" + "Total SMB Paying Customers")
OVER (ORDER BY "Month") AS "Cumulative"
FROM (
SELECT "Month"
, COALESCE(q1."Total AB Paying Customers", 0) AS "Total AB Paying Customers"
, COALESCE(q2."Total SMB Paying Customers", 0) AS "Total SMB Paying Customers"
FROM (<query1>) q1
FULL JOIN (<query2>) q2 USING ("Month")
) sub;
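Use sum() as a window function to compute the cumulative total.
The additional subquery layer is only for convenience, so we don't have to repeat COALESCE() in every expression.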
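To make the pattern concrete, here is a minimal, self-contained sketch that replaces the two subqueries with VALUES lists holding the sample counts from the question. The column names mon, ct, ab and smb are placeholders chosen for this illustration, and a date column stands in for the formatted "Month" string so that the window sorts chronologically.

```sql
-- Sample counts copied from the question; all column names are placeholders.
SELECT mon
     , ab
     , smb
     , ab + smb                            AS total
     , sum(ab + smb) OVER (ORDER BY mon)   AS cumulative  -- running total by month
FROM  (
   SELECT mon
        , COALESCE(q1.ct, 0) AS ab    -- February has no AB row: NULL becomes 0
        , COALESCE(q2.ct, 0) AS smb
   FROM  (VALUES (date '2014-03-01', 2)
               , (date '2014-04-01', 4)) q1(mon, ct)
   FULL  JOIN (VALUES (date '2014-02-01', 3)
                    , (date '2014-03-01', 8)
                    , (date '2014-04-01', 5)) q2(mon, ct) USING (mon)
   ) sub
ORDER  BY mon;
```

Without COALESCE(), the February row would have NULL for the AB count, and NULL + 3 evaluates to NULL, so the total for that month would be lost. With it, the three rows come out as (0, 3, 3, 3), (2, 8, 10, 13) and (4, 5, 9, 22), matching the desired result in the question.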
The query can be simplified further, for example by formatting the month only in the outer SELECT.
Based on the setup you added:
SELECT to_char(mon, 'FMMM-YYYY') AS "Month"
, ct_ab AS "Total AB Paying Customers"
, ct_smb AS "Total SMB Paying Customers"
, ct_ab + ct_smb AS "Total"
, sum(ct_ab + ct_smb) OVER (ORDER BY mon)::int AS "Cumulative"
FROM (
SELECT mon, COALESCE(q1.ct_ab, 0) AS ct_ab, COALESCE(q2.ct_smb, 0) AS ct_smb
FROM (
SELECT date_trunc('month', start_date) AS mon, count(*)::int AS ct_ab
FROM (
SELECT cm.customer_id, min(cm.created_at) AS start_date
FROM customerusermap cm
JOIN users u USING (customer_id)
WHERE cm.user_id <> u.id
GROUP BY 1
) t
GROUP BY 1
) q1
FULL JOIN (
SELECT date_trunc('month', u.created_at) AS mon, count(*)::int AS ct_smb
FROM customerusermap cm
JOIN users u USING (customer_id)
WHERE cm.user_id = u.id AND u.paid_status = 'paying'
GROUP BY 1
) q2 USING (mon)
   ) sub
ORDER BY mon;
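Use to_char() to format the month any way you like, and only once at the very end. The template pattern FMMM produces the month number without a leading zero, just like in your original.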
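A quick way to see the effect of the FM prefix is to format the same value with and without it. The literal timestamp is just an example value:

```sql
-- FM ("fill mode") suppresses the leading zero of the pattern that follows it.
SELECT to_char(timestamp '2014-03-15', 'FMMM-YYYY') AS without_padding  -- 3-2014
     , to_char(timestamp '2014-03-15', 'MM-YYYY')   AS with_padding;    -- 03-2014
```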
Use date_trunc() to reduce your timestamp without time zone to month resolution (it returns the first timestamp of the month, but that makes no difference here).
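For illustration, truncating an arbitrary example timestamp to the month looks like this:

```sql
-- Every timestamp within March 2014 maps to the same value, which makes it a usable grouping key.
SELECT date_trunc('month', timestamp '2014-03-15 13:45:00') AS mon;
-- 2014-03-01 00:00:00
```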
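I added ORDER BY mon to get the sort order you asked for in your comment. This works as expected because the column mon is still a timestamp at that point (not yet converted to a string (text)).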
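The reason this matters can be shown with a small sketch using made-up month labels. Sorting the formatted strings compares them character by character, so a two-digit month would typically land before a one-digit month:

```sql
-- Text sort: under common collations '10-2014' sorts before '2-2014', because '1' < '2'.
SELECT m
FROM  (VALUES ('2-2014'::text), ('3-2014'), ('10-2014')) t(m)
ORDER  BY m;
```

The same caveat applies to any ORDER BY or window frame that sorts on the formatted "Month" string rather than on the underlying timestamp.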
Since u.email is defined NOT NULL, count(*) returns the same result as count(u.email) in this context, and it is slightly cheaper.
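The two forms only differ when the counted column can be NULL, as this small sketch with made-up values shows:

```sql
-- count(*) counts rows; count(col) counts rows where col IS NOT NULL.
SELECT count(*)     AS all_rows          -- 3
     , count(email) AS non_null_emails   -- 2
FROM  (VALUES ('a@example.com'), (NULL), ('b@example.com')) t(email);
```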
Use explicit JOIN syntax. The performance is the same, but the query is much clearer.
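I cast the aggregated counts to integer. This is completely optional (assuming you don't run into integer overflow). That way all columns in the result are integer, instead of a mix of bigint and numeric.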
Compare this to your original queries and you will find it is shorter and faster.
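If performance matters, make sure you have indexes on the relevant columns. If there are many entries in users for one entry in customerusermap, there are more sophisticated options using JOIN LATERAL that can make the query faster.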
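As a rough sketch of the indexing advice, the statements below cover the join and aggregation columns used in the queries above. The index names are hypothetical, and whether each index actually helps depends on table size and data distribution, so it is worth checking with EXPLAIN ANALYZE before and after.

```sql
-- Hypothetical index names; columns are taken from the queries above.
CREATE INDEX users_customer_id_idx
    ON users (customer_id);

CREATE INDEX customerusermap_customer_id_created_at_idx
    ON customerusermap (customer_id, created_at);
```

customerusermap.user_id needs no extra index, because the UNIQUE constraint on it already creates one.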
If you want to include months without any activity, LEFT JOIN to a complete list of months.
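One possible way to build such a month list is generate_series(). The sketch below does this for the SMB count only, to keep it short. The start and end dates are assumptions picked for illustration, and the aliases m and c are placeholders.

```sql
-- Assumed reporting range: January to June 2014. Adjust the bounds as needed.
SELECT to_char(m.mon, 'FMMM-YYYY') AS "Month"
     , COALESCE(c.ct, 0)           AS "Total SMB Paying Customers"  -- 0 for empty months
FROM   generate_series(timestamp '2014-01-01'
                     , timestamp '2014-06-01'
                     , interval  '1 month') AS m(mon)
LEFT   JOIN (
   SELECT date_trunc('month', u.created_at) AS mon, count(*)::int AS ct
   FROM   customerusermap cm
   JOIN   users u USING (customer_id)
   WHERE  cm.user_id = u.id
   AND    u.paid_status = 'paying'
   GROUP  BY 1
   ) c USING (mon)
ORDER  BY m.mon;
```

Because the series is on the left side of the join, every month in the range yields one row, and COALESCE() turns the missing counts into 0.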