Postgres data aggregation with variable columns

Date: 2014-03-02 18:30:29

Tags: sql database postgresql pivot crosstab

I have a table containing time-log records.

create table "time_records" (
    "id" serial NOT NULL PRIMARY KEY,
    "start" timestamp not null,
    "end" timestamp not null,
    "duration" double precision not null,
    "project" varchar(255) not null,
    "case" integer not null,
    "title" text not null,
    "user" varchar(255) not null
);

Here are a few rows of data:

"id","start","end","duration","project","case","title","user"
"1","2014-02-01 11:54:00","2014-02-01 12:20:00","26.18","Project A","933","Something done here","John Smith"
"2","2014-02-02 12:34:00","2014-02-02 15:00:00","146","Project B","990","Something else done","Joshua Kehn"
"3","2014-02-02 17:57:00","2014-02-02 18:39:00","41.38","Project A","933","Another thing done","Bob Frank"
"4","2014-02-03 09:30:00","2014-02-03 11:41:00","131","Project A","983","iOS work","Joshua Kehn"
"5","2014-02-03 10:22:00","2014-02-03 13:29:00","187.7","Project C","966","Created views for things","Alice Swiss"

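I can extract some information from this. For example, a list of every project that had time logged between two dates, or a list of every person who worked between two dates.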
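
For instance, a minimal sketch of the first of those lists, assuming an illustrative date range from 2014-02-01 up to (but not including) 2014-02-04:

```sql
-- Projects with time logged in the assumed range [2014-02-01, 2014-02-04).
-- "start" is double-quoted to match the table definition above.
SELECT DISTINCT project
FROM time_records
WHERE "start" >= '2014-02-01'
  AND "start" <  '2014-02-04'
ORDER BY project;
```

Selecting "user" instead of project gives the list of people for the same range. The double quotes are required there, because an unquoted user is a reserved word in PostgreSQL and returns the current role name.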

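What I would like is a report with one row per date and one column per project across the top, each holding the total time logged for that project on that date.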

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
WHERE
    project = 'Project A'
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

But I want multiple project columns in the output, so I somehow need to combine this with a select distinct project.

The final output would be:

date, project a total, project b total, project c total,
2014-02-01,0.5, 0.3, 10,
2014-02-02,1.3, 20, 3,
2014-02-03,20, 10, 10
...

I can get the total per project per date with:

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

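But then each date appears on multiple rows, one per project. I need a single row per date, with each project's total in its own column.

Does this make sense, and is it possible with SQL alone, without writing code to post-process the query results?

2 Answers:

Answer 0 (score: 1)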

A simple approach is to do a "manual" pivot using CASE:

SELECT DATE("start"),
 SUM(CASE WHEN "project"='Project A' THEN duration/60 ELSE 0 END) "Project A",
 SUM(CASE WHEN "project"='Project B' THEN duration/60 ELSE 0 END) "Project B",
 SUM(CASE WHEN "project"='Project C' THEN duration/60 ELSE 0 END) "Project C"
FROM time_records
GROUP BY DATE("start"); 

An SQLfiddle to test with
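
On PostgreSQL 9.4 or later, the same manual pivot can also be written with the aggregate FILTER clause instead of CASE. This is only a sketch of that alternative, not part of the tested query above. COALESCE is added here because a filtered SUM returns NULL, not 0, when no rows match:

```sql
-- Manual pivot with FILTER (assumes PostgreSQL 9.4+).
-- One output row per day, one column per hard-coded project name.
SELECT "start"::date AS day,
       COALESCE(SUM(duration / 60) FILTER (WHERE project = 'Project A'), 0) AS "Project A",
       COALESCE(SUM(duration / 60) FILTER (WHERE project = 'Project B'), 0) AS "Project B",
       COALESCE(SUM(duration / 60) FILTER (WHERE project = 'Project C'), 0) AS "Project C"
FROM time_records
GROUP BY "start"::date
ORDER BY day;
```

As with the CASE version, the project names are fixed in the statement, so a new project means editing the query.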

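You should be able to do something similar with CROSSTAB(), but I don't have access to a PostgreSQL instance where I can load the module and test a query :-/

Answer 1 (score: 1)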

For a "pivot" table or cross tabulation, use the crosstab() function of the additional module tablefunc.
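
crosstab() is not available until the module has been installed in the current database. On PostgreSQL 9.1 or later this is typically a one-time step, assuming the contrib packages are present on the server and your role is allowed to create extensions:

```sql
-- One-time setup per database.
-- IF NOT EXISTS makes this a no-op (with a notice) if tablefunc is already installed.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```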

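Table definition

Given this cleaned-up table definition, which avoids reserved SQL key words as identifiers (a big no-no, even if you can force them with double quotes):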

CREATE TEMP TABLE time_records (
    id serial PRIMARY KEY,
    t_start timestamp not null,
    t_end timestamp not null,
    duration double precision not null,
    project text not null,
    t_case integer not null,
    title text not null,
    t_user text not null
);

Query

Note how I use the variant of crosstab() with two parameters to deal properly with projects missing from the result.

SELECT *
FROM  crosstab (
   $$
   SELECT t_start::date
         , project
         , round(sum(duration / 60)::numeric, 2) AS time_logged
   FROM    time_records
   GROUP   BY 1,2
   ORDER   BY 1,2
   $$
  ,$$VALUES ('Project A'), ('Project B'),('Project C')$$
  ) AS t (
      t_start   date
    , project_a text
    , project_b text
    , project_c text
  );

Result:

t_start    | project_a | project_b | project_c
-----------|-----------|-----------|----------
2014-02-01 | 0.44      |           |
2014-02-02 | 0.69      | 2.43      |
2014-02-03 | 2.18      |           | 3.13

Tested with Postgres 9.3.
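
If the set of projects changes over time, the hard-coded VALUES list can be replaced with a query, as in the sketch below. It rests on one assumption that is easy to miss: the column definition list after AS still has to be written by hand and must contain exactly one column per project returned by the second query. With the sample data that is three projects. With any other number the call fails until the list is adjusted, so a truly variable number of columns would require building the statement dynamically.

```sql
SELECT *
FROM  crosstab (
   $$
   SELECT t_start::date
        , project
        , round(sum(duration / 60)::numeric, 2) AS time_logged
   FROM   time_records
   GROUP  BY 1,2
   ORDER  BY 1,2
   $$
   -- Categories come from the data instead of a literal VALUES list.
   -- ORDER BY keeps the column order deterministic.
  ,$$SELECT DISTINCT project FROM time_records ORDER BY 1$$
  ) AS t (
      t_start   date
    , project_a text   -- assumes exactly three distinct projects exist
    , project_b text
    , project_c text
  );
```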

Explanation, details and links in this related answer:
PostgreSQL Crosstab Query