Using CASE statements to slice a table

Asked: 2015-01-27 16:45:57

Tags: sql postgresql case

I have a table stored in Postgres that looks like this:

Column    RowIdx    Value
1         0         Dr A
1         1         Mr B
1         2         Mrs C
2         0         101
2         1         105
2         2         127
3         0         Red
3         1         Green
3         2         Blue
4         0         Miss D
4         1         Mr E
4         2         Ms F
5         0         23
5         1         197
5         2         256
6         0         Black
6         1         Brown
6         2         Yellow

When I want to extract a table that looks like this:

Name    HumanID    FavouriteColor
Dr A    101        Red
Mr B    105        Green
Mrs C   127        Blue

I use the following SQL statement:

SELECT 
    max(CASE WHEN Column=1 THEN Value ELSE null END) AS Name,
    max(CASE WHEN Column=2 THEN Value ELSE null END) AS HumanID,
    max(CASE WHEN Column=3 THEN Value ELSE null END) AS FavouriteColor

FROM sample  -- "sample" is an assumed table name
WHERE (Column=1 OR Column=2 OR Column=3) 

GROUP BY RowIdx 
ORDER BY RowIdx

This works fine. I now want to export a table that looks like this:

Name    HumanID    FavouriteColor
Dr A    101        Red
Mr B    105        Green
Mrs C   127        Blue
Miss D  23         Black
Mr E    197        Brown
Ms F    256        Yellow

I changed my SQL statement to:

SELECT 
    max(CASE WHEN (Column=1 OR Column=4) THEN Value ELSE null END) AS Name,
    max(CASE WHEN (Column=2 OR Column=5) THEN Value ELSE null END) AS HumanID,
    max(CASE WHEN (Column=3 OR Column=6) THEN Value ELSE null END) AS FavouriteColor

FROM sample  -- "sample" is an assumed table name
WHERE (Column=1 OR Column=2 OR Column=3 OR Column=4 OR Column=5 OR Column=6) 

GROUP BY RowIdx 
ORDER BY RowIdx

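However, this does not seem to work: I just get the first table again. I know enough SQL to get by, but I don't really understand the order in which the parts of the statement are evaluated, which makes it hard to work out why I get the result I do.

Can anyone shed some light on this?

Edit: just to give some more background. My system holds hundreds of thousands of csv files. It works by "stacking" the columns of a csv file into a single column, which is then inserted into the table. I can guarantee the ordering of the columns/cells that come from the same csv file, but I cannot guarantee it between csv files. The example I gave above represents two csv files that have been imported. Each file contains information about some people. The operation I am trying to implement would let me merge the two csv files.

I can get the result by exporting one table, then the other, and then copying the right bits together. I would like to do it in a single SQL statement because I think that would be more efficient (due to other constraints in the system).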

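4 answers:

Answer 0: (score: 1)

You need a way to define each group/set.

In the query below, the 3 has to be changed to the number of columns in each set of data.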

SELECT 
    max(CASE WHEN col%3=1 THEN Value END) AS Name,
    max(CASE WHEN col%3=2 THEN Value END) AS HumanID,
    max(CASE WHEN col%3=0 THEN Value END) AS FavouriteColor,
    round((col-1)/3,0) as set
FROM FOO
GROUP BY round((col-1)/3,0), rowidx 

http://sqlfiddle.com/#!15/20e22/13/0
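
To see why the question's modified query still returns only three rows, it can help to count how many input rows fall into each group. The following is a minimal sketch, not the asker's real schema: the temporary table `stacked_demo` and the aliases `file_no` and `rows_in_group` are made-up names for illustration, and the data is the sample from the question.

```sql
-- Hypothetical demo table; all names here are illustrative only.
CREATE TEMP TABLE stacked_demo (col int, rowidx int, value text);

INSERT INTO stacked_demo (col, rowidx, value) VALUES
    (1, 0, 'Dr A'),   (1, 1, 'Mr B'),  (1, 2, 'Mrs C'),
    (2, 0, '101'),    (2, 1, '105'),   (2, 2, '127'),
    (3, 0, 'Red'),    (3, 1, 'Green'), (3, 2, 'Blue'),
    (4, 0, 'Miss D'), (4, 1, 'Mr E'),  (4, 2, 'Ms F'),
    (5, 0, '23'),     (5, 1, '197'),   (5, 2, '256'),
    (6, 0, 'Black'),  (6, 1, 'Brown'), (6, 2, 'Yellow');

-- Grouping by rowidx alone: rows are filtered by WHERE first (if present),
-- then GROUP BY builds one group per rowidx. Both files therefore share
-- the same three groups (6 rows each), and max() has to pick a single
-- value per output column within each group.
SELECT rowidx, count(*) AS rows_in_group
FROM stacked_demo
GROUP BY rowidx
ORDER BY rowidx;

-- Adding a per-file key: integer division maps cols 1-3 to 0 and
-- cols 4-6 to 1, which gives six groups of three rows, one per person.
SELECT (col - 1) / 3 AS file_no, rowidx, count(*) AS rows_in_group
FROM stacked_demo
GROUP BY (col - 1) / 3, rowidx
ORDER BY file_no, rowidx;
```

The first query returns three groups and the second returns six, which is the difference between the two result tables in the question.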

Answer 1: (score: 0)

I think this is the query you want:

SELECT max(CASE WHEN RowIdx = 1 THEN Value END) AS Name,
       max(CASE WHEN RowIdx = 2 THEN Value END) AS HumanID,
       max(CASE WHEN RowIdx = 3 THEN Value END) AS FavouriteColor
WHERE RowIdx in (1, 2, 3)
GROUP BY column 
ORDER BY column;

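Answer 2: (score: 0)

You could try generating an artificial unique identifier for each "person" on the fly, partitioning columns 1-3 from columns 4-6 (modified from the original version to eliminate a subquery):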

-- Edited so that "like" column numbers are now enumerated instead of modulo-computed
SELECT
  MAX(CASE WHEN col IN (1, 4) THEN Value ELSE NULL END) AS Name,
  MAX(CASE WHEN col IN (2, 5) THEN Value ELSE NULL END) AS HumanID,
  MAX(CASE WHEN col IN (3, 6) THEN Value ELSE NULL END) AS FavouriteColor
FROM foo  -- Just my name for your table
GROUP BY (ceil(col::FLOAT / 3), rowidx)
ORDER BY (ceil(col::FLOAT / 3), rowidx);

Using your data:

postgres=# select * from foo;
 col | rowidx | value
-----+--------+--------
   1 |      0 | Dr A
   1 |      1 | Mr B
   1 |      2 | Mrs C
   2 |      0 | 101
   2 |      1 | 105
   2 |      2 | 127
   3 |      0 | Red
   3 |      1 | Green
   3 |      2 | Blue
   4 |      0 | Miss D
   4 |      1 | Mr E
   4 |      2 | Ms F
   5 |      0 | 23
   5 |      1 | 197
   5 |      2 | 256
   6 |      0 | Black
   6 |      1 | Brown
   6 |      2 | Yellow
(18 rows)

postgres=# SELECT
postgres-#   MAX(CASE WHEN col IN (1, 4) THEN Value ELSE NULL END) AS Name,
postgres-#   MAX(CASE WHEN col IN (2, 5) THEN Value ELSE NULL END) AS HumanID,
postgres-#   MAX(CASE WHEN col IN (3, 6) THEN Value ELSE NULL END) AS FavouriteColor
postgres-# FROM foo  -- Just my name for your table
postgres-# GROUP BY (ceil(col::FLOAT / 3), rowidx)
postgres-# ORDER BY (ceil(col::FLOAT / 3), rowidx);
  name  | humanid | favouritecolor
--------+---------+----------------
 Dr A   | 101     | Red
 Mr B   | 105     | Green
 Mrs C  | 127     | Blue
 Miss D | 23      | Black
 Mr E   | 197     | Brown
 Ms F   | 256     | Yellow
(6 rows)
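
The grouping key in the query above may be easier to follow when it is evaluated on its own. This is only an illustration: it uses `generate_series` in place of the real table, and the aliases `person_group` and `field_no` are made up here.

```sql
-- Evaluate the grouping expression for column numbers 1..6.
-- person_group: ceil(col / 3.0) puts cols 1-3 in group 1 and cols 4-6 in group 2.
-- field_no:     position of the column within its group (1 = Name,
--               2 = HumanID, 3 = FavouriteColor in the sample data).
SELECT col,
       ceil(col::float / 3) AS person_group,
       (col - 1) % 3 + 1    AS field_no
FROM generate_series(1, 6) AS t(col)
ORDER BY col;
```

Columns 1, 2 and 3 get `person_group` 1 and columns 4, 5 and 6 get `person_group` 2, so grouping by that value together with `rowidx` keeps the two imported files apart.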

Answer 3: (score: 0)

So I came up with the following solution:

DROP TABLE IF EXISTS "tempTable";
CREATE TEMP TABLE "tempTable" (name text, humanid text, color text);

-- Insert first csv table
INSERT INTO "tempTable" (SELECT 
    max(CASE WHEN columnid=1 THEN val ELSE null END) AS Name,
    max(CASE WHEN columnid=2 THEN val ELSE null END) AS HumanID,
    max(CASE WHEN columnid=3 THEN val ELSE null END) AS FavouriteColor

FROM sample
WHERE (columnid=1 OR columnid=2 OR columnid=3) 

GROUP BY rowidx 
ORDER BY rowidx);

-- Insert second csv table
INSERT INTO "tempTable" (SELECT 
    max(CASE WHEN columnid=4 THEN val ELSE null END) AS Name,
    max(CASE WHEN columnid=5 THEN val ELSE null END) AS HumanID,
    max(CASE WHEN columnid=6 THEN val ELSE null END) AS FavouriteColor

FROM sample
WHERE (columnid=4 OR columnid=5 OR columnid=6) 

GROUP BY rowidx 
ORDER BY rowidx);


SELECT * FROM "tempTable";

This behaves as expected, but I don't think it is the best solution.
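
One possible way to avoid the temporary table is to combine the two per-file pivots with `UNION ALL` in a single statement. This is a sketch of a different technique, not the approach above. So that it runs on its own, it builds a hypothetical table `sample_demo` with the same column names as `sample` (`columnid`, `rowidx`, `val`), and the alias `file_no` is introduced here purely for ordering.

```sql
-- Hypothetical stand-in for the asker's "sample" table.
CREATE TEMP TABLE sample_demo (columnid int, rowidx int, val text);

INSERT INTO sample_demo (columnid, rowidx, val) VALUES
    (1, 0, 'Dr A'),   (1, 1, 'Mr B'),  (1, 2, 'Mrs C'),
    (2, 0, '101'),    (2, 1, '105'),   (2, 2, '127'),
    (3, 0, 'Red'),    (3, 1, 'Green'), (3, 2, 'Blue'),
    (4, 0, 'Miss D'), (4, 1, 'Mr E'),  (4, 2, 'Ms F'),
    (5, 0, '23'),     (5, 1, '197'),   (5, 2, '256'),
    (6, 0, 'Black'),  (6, 1, 'Brown'), (6, 2, 'Yellow');

SELECT name, humanid, favouritecolor
FROM (
    -- First csv file: columns 1-3
    SELECT 1 AS file_no, rowidx,
           max(CASE WHEN columnid = 1 THEN val END) AS name,
           max(CASE WHEN columnid = 2 THEN val END) AS humanid,
           max(CASE WHEN columnid = 3 THEN val END) AS favouritecolor
    FROM sample_demo
    WHERE columnid IN (1, 2, 3)
    GROUP BY rowidx

    UNION ALL

    -- Second csv file: columns 4-6
    SELECT 2 AS file_no, rowidx,
           max(CASE WHEN columnid = 4 THEN val END),
           max(CASE WHEN columnid = 5 THEN val END),
           max(CASE WHEN columnid = 6 THEN val END)
    FROM sample_demo
    WHERE columnid IN (4, 5, 6)
    GROUP BY rowidx
) AS merged
ORDER BY file_no, rowidx;
```

Because each branch lists its column ids explicitly, this form does not rely on the two files storing their fields in the same order, at the cost of one branch per file.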