I have the following records:
Employee_Number  Employee_role     Group_Name
---------------------------------------------
EMP101           C# Developer      Group_1
EMP102           ASP Developer     Group_1
EMP103           SQL Developer     Group_2
EMP104           PLSQL Developer   Group_2
EMP101           Java Developer
EMP102           Web Developer
EMP101           DBA
EMP105           DBA
EMP106           SQL Developer     Group_3
EMP107           Oracle Developer  Group_3
EMP101           Oracle Developer  Group_3
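For anyone who wants to reproduce this, the records above could be loaded with a setup script along these lines. This is only a sketch: the column types are assumptions, and rows without a group are stored as an empty string because the queries in this post test `group_name <> ''`.

```sql
-- Hypothetical schema: column types are assumed, not given in the question.
CREATE TABLE employee (
   employee_number text NOT NULL
 , employee_role   text NOT NULL
 , group_name      text NOT NULL DEFAULT ''  -- '' stands for "no group" (assumption)
);

INSERT INTO employee (employee_number, employee_role, group_name) VALUES
   ('EMP101', 'C# Developer'    , 'Group_1')
 , ('EMP102', 'ASP Developer'   , 'Group_1')
 , ('EMP103', 'SQL Developer'   , 'Group_2')
 , ('EMP104', 'PLSQL Developer' , 'Group_2')
 , ('EMP101', 'Java Developer'  , '')
 , ('EMP102', 'Web Developer'   , '')
 , ('EMP101', 'DBA'             , '')
 , ('EMP105', 'DBA'             , '')
 , ('EMP106', 'SQL Developer'   , 'Group_3')
 , ('EMP107', 'Oracle Developer', 'Group_3')
 , ('EMP101', 'Oracle Developer', 'Group_3');
```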
I want to display a pivot table of the above records in the following format:
Employee_Number  TotalRoles  TotalGroups  Available  Others  Group_1  Group_2  Group_3
--------------------------------------------------------------------------------------
EMP101           4           3            2          2       1                 1
EMP102           2           3            1          1       1
EMP103           1           3            1          0                1
EMP104           1           3            1          0                1
EMP105           1           3            0          1
EMP106           1           3            1          0                         1
EMP107           1           3            1          0                         1
To get the above result I am using the following script:
SELECT * FROM crosstab(
$$SELECT grp.*, e.group_name
, CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
FROM (
SELECT employee_number
, count(employee_role)::int AS total_roles
, (SELECT count(DISTINCT group_name)::int
FROM employee
WHERE group_name <> '') AS total_groups
, count(group_name <> '' OR NULL)::INT AS available
, count(group_name = '' OR NULL)::int AS others
FROM employee
GROUP BY employee_number
) grp
LEFT JOIN employee e ON e.employee_number = grp.employee_number
AND e.group_name <> ''
ORDER BY grp.employee_number, e.group_name$$
,$$VALUES ('Group_1'),('Group_2'),('Group_3')$$
) AS ct (employee_number text
, total_roles int
, total_groups int
, available int
, others int
, "Group_1" int
, "Group_2" int
, "Group_3" int);
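But now I want to display the pivot table of the above records filtered by Group_Name.
That is, if I ask for the pivot table of Group_Name = Group_3, it must show only the employees who belong to Group_3 exclusively, and none of the others.
For the employees who belong only to Group_3, it should show: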
Employee_Number  total_roles  total_groups  available  others  Group_3
----------------------------------------------------------------------
EMP106           1            3             1          0       1
EMP107           1            3             1          0       1
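Note: as you can see in the first table, employees EMP106 and EMP107 belong only to Group_Name = Group_3. Employee EMP101 also belongs to it, but he belongs to other groups as well, so he should not appear in this table.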
Answer 0: (score: 1)
To exclude the offending rows, here is a working crosstab() version:
SELECT * FROM crosstab(
$$SELECT grp.*, e.group_name
, CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
FROM (
SELECT employee_number
, count(employee_role)::int AS total_roles
, (SELECT count(DISTINCT group_name)::int
FROM employee
WHERE group_name <> '') AS total_groups
, count(group_name <> '' OR NULL)::int AS available
, count(group_name = '' OR NULL)::int AS others
FROM employee
GROUP BY employee_number
) grp
JOIN employee e USING (employee_number)
WHERE e.group_name = 'Group_3'
AND NOT EXISTS (
SELECT 1 FROM employee
WHERE employee_number = e.employee_number
AND group_name <> e.group_name
)
ORDER BY employee_number$$
,$$VALUES ('Group_3')$$
) AS ct (employee_number text
, total_roles int
, total_groups int
, available int
, others int
, "Group_3" int);
But as you can see, we don't need crosstab() here at all. Simplify to:
SELECT grp.*, 1 AS "Group_3"
FROM (
SELECT employee_number
, count(employee_role)::int AS total_roles
, (SELECT count(DISTINCT group_name)::int
FROM employee
WHERE group_name <> '') AS total_groups
, count(group_name <> '' OR NULL)::int AS available
, count(group_name = '' OR NULL)::int AS others
FROM employee
GROUP BY employee_number
) grp
JOIN employee e USING (employee_number)
WHERE e.group_name = 'Group_3'
AND NOT EXISTS (
SELECT 1 FROM employee
WHERE employee_number = e.employee_number
AND group_name <> e.group_name
)
ORDER BY employee_number;
The column "Group_3" is really just noise, since it is always 1 by definition.
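Since that column is constant, it can simply be dropped. As a sketch of a different formulation (not the one used in this answer), the "belongs to Group_3 and nothing else" test can also be folded into the aggregate itself with `HAVING bool_and(...)`, which avoids both the join and the `NOT EXISTS` subquery. Like the queries above, it assumes that rows without a group store an empty string in `group_name`.

```sql
SELECT employee_number
     , count(employee_role)::int             AS total_roles
     , (SELECT count(DISTINCT group_name)::int
        FROM   employee
        WHERE  group_name <> '')             AS total_groups
     , count(group_name <> '' OR NULL)::int  AS available
     , count(group_name = '' OR NULL)::int   AS others
FROM   employee
GROUP  BY employee_number
-- true only if every row of this employee is in Group_3;
-- a row with an empty group_name makes it false
HAVING bool_and(group_name = 'Group_3')
ORDER  BY employee_number;
```

With this filter, `others` is 0 for every returned row and `available` equals `total_roles`, so those columns carry little information here as well.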
If only a small fraction of the rows is selected this way, this version with a LATERAL join should be considerably faster:
SELECT e.employee_number
, grp.total_roles
, total.total_groups
, grp.available
, grp.others
, 1 AS "Group_3"
FROM (
SELECT employee_number
FROM employee e
WHERE group_name = 'Group_3'
AND NOT EXISTS (
SELECT 1 FROM employee
WHERE employee_number = e.employee_number
AND group_name <> e.group_name
)
) e
, LATERAL (
SELECT count(employee_role)::int AS total_roles
, count(group_name <> '' OR NULL)::int AS available
, count(group_name = '' OR NULL)::int AS others
FROM employee
WHERE employee_number = e.employee_number
GROUP BY employee_number
) grp
, (
SELECT count(DISTINCT group_name)::int AS total_groups
FROM employee
WHERE group_name <> ''
) total
ORDER BY employee_number;
Details on the LATERAL solution and its performance:
Not optimized for performance, but easy to adapt:
<original crosstab query from your question>
WHERE "Group_3" = 1
AND "Group_1" IS NULL
AND "Group_2" IS NULL
AND "Group_4" IS NULL
AND others = 0 -- to rule out membership in the "empty" group
-- possibly more ...
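Spelled out for the three groups defined in the question, the placeholder above might expand to something like the following. This is a sketch: it assumes the `tablefunc` module can be installed in the database, and it leaves out the "Group_4" condition because the column definition list in the question has no such column.

```sql
-- crosstab() is provided by the additional module tablefunc
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT employee_number, total_roles, total_groups, available, others, "Group_3"
FROM   crosstab(
   $$SELECT grp.*, e.group_name
          , CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
     FROM  (
        SELECT employee_number
             , count(employee_role)::int            AS total_roles
             , (SELECT count(DISTINCT group_name)::int
                FROM   employee
                WHERE  group_name <> '')            AS total_groups
             , count(group_name <> '' OR NULL)::int AS available
             , count(group_name = '' OR NULL)::int  AS others
        FROM   employee
        GROUP  BY employee_number
        ) grp
     LEFT  JOIN employee e ON e.employee_number = grp.employee_number
                          AND e.group_name <> ''
     ORDER  BY grp.employee_number, e.group_name$$
 , $$VALUES ('Group_1'), ('Group_2'), ('Group_3')$$
   ) AS ct (employee_number text
          , total_roles  int
          , total_groups int
          , available    int
          , others       int
          , "Group_1"    int
          , "Group_2"    int
          , "Group_3"    int)
WHERE  "Group_3" = 1
AND    "Group_1" IS NULL  -- crosstab() leaves groups without a matching row NULL
AND    "Group_2" IS NULL
AND    others = 0         -- rule out membership in the "empty" group
ORDER  BY employee_number;
```

To filter on a different group, swap which column is tested with `= 1` and which are tested with `IS NULL`, and adjust the outer SELECT list accordingly.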