MySQL pivot data with a fixed number of columns

Time: 2015-02-10 10:29:43

Tags: mysql pivot

Below is my SELECT statement, which pivots my data nicely.

My data looks like this:

col_a | col_b | col_c | col_d   | Score
-------------------------------------
stuff | stuff | stuff | null    |  5
stuff | stuff | stuff | title_a |  3
stuff | stuff | stuff | title_x |  4

My current pivot statement is as follows:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score'
    .....

This gives me the following result:

col_a | col_b | col_c | Title   | Score | Title   | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x |   4

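What I want to do is check for more titles while keeping only four columns in the pivot table. There will never be more than 2 rows that need to be pivoted into a record like the one above, but col_d can contain any title.

For example, I tried the following:

My data now looks like this: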

col_a | col_b | col_c | col_d    | Score
-------------------------------------
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_x  |  4
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_bx |  4

My pivot statement now looks like this:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score',
    MAX(CASE `col_d` WHEN 'title_bx' THEN `col_d` end) AS 'Second Title',
    MAX(CASE `col_d` WHEN 'title_bx' THEN `score` end) AS 'Score'
    .....

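As you can see, I tried checking for another title, but this just gives me six columns, two of which are null: in this case the two rows contain title_a and title_bx, so the middle two columns are filled with null.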
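The effect described here can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, and it assumes the two records differ in col_a (the values 'g1' and 'g2' are hypothetical; the question only shows the placeholder stuff).

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(col_a TEXT, col_b TEXT, col_c TEXT, col_d TEXT, score INTEGER)"
)
# 'g1' / 'g2' are assumed group keys; the question shows only 'stuff'.
rows = [
    ("g1", "stuff", "stuff", None, 5),
    ("g1", "stuff", "stuff", "title_a", 3),
    ("g1", "stuff", "stuff", "title_x", 4),
    ("g2", "stuff", "stuff", None, 5),
    ("g2", "stuff", "stuff", "title_a", 3),
    ("g2", "stuff", "stuff", "title_bx", 4),
]
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)", rows)

# One Title/Score pair per hard-coded title: three titles mean six columns.
six_column_pivot = """
SELECT col_a,
    MAX(CASE col_d WHEN 'title_a'  THEN col_d END) AS title_1,
    MAX(CASE col_d WHEN 'title_a'  THEN score END) AS score_1,
    MAX(CASE col_d WHEN 'title_x'  THEN col_d END) AS title_2,
    MAX(CASE col_d WHEN 'title_x'  THEN score END) AS score_2,
    MAX(CASE col_d WHEN 'title_bx' THEN col_d END) AS title_3,
    MAX(CASE col_d WHEN 'title_bx' THEN score END) AS score_3
FROM yourtable
GROUP BY col_a, col_b, col_c
ORDER BY col_a
"""
result = conn.execute(six_column_pivot).fetchall()
for row in result:
    print(row)
```

Each group fills only the pairs whose title it actually contains, so the pair for any title missing from a group comes back as NULL (None in Python).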

The output I want from the data above:

col_a | col_b | col_c | Title   | Score | Title    | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x  |   4
stuff | stuff | stuff | title_a |   3   | title_bx |   4

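So my question is: how can I check for multiple possible titles in col_d and still end up with only 4 columns?

2 answers:

Answer 0 (score: 8)

This is a bit messy because MySQL (before version 8.0) has no window functions and you want very specific values in the first set of Title / Score columns. You can get the final result by using some user variables to create a row number for the rows where col_d is not equal to title_a, and then joining that back to your table.

The syntax would be similar to the following: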

select a.col_a, a.col_b, a.col_c,
  max(case when a.col_d = 'title_a' then a.col_d end) title1,
  max(case when a.col_d = 'title_a' then a.score end) score1,
  max(case when na.col_d <> 'title_a' then na.col_d end) title2,
  max(case when na.col_d <> 'title_a' then na.score end) score2
from yourtable a
left join
(
  -- need to generate a row number value for the col_d rows
  -- that aren't equal to title_a
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score,
    @num:=@num+1 rownum
  from yourtable n
  cross join
  (
    select @num:=0
  ) d
  where n.col_d <> 'title_a'
  order by  n.col_a, n.col_b, n.col_c, n.col_d
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
  -- in the event you have more than 2 row only return 2
  and na.rownum <= 2  
where a.col_d = 'title_a'  
group by a.col_a, a.col_b, a.col_c, na.rownum;

See SQL Fiddle with Demo. This gives the result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |

It was pointed out to me that if you only have 2 other values, you can simply join the data without using user variables:

select distinct a.col_a, a.col_b, a.col_c,
  a.col_d title1,
  a.score score1,
  na.col_d title2,
  na.score score2
from yourtable a
left join
(
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score
  from yourtable n
  where n.col_d <> 'title_a'
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
where a.col_d = 'title_a';

See SQL Fiddle with Demo. This gives the same result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |

Depending on what you actually have in col_a, col_b, and col_c, you might need to alter this, but it should give you the result you want.
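As a quick check of the join-only version, here is a sketch that runs it against the question's six sample rows. It uses Python's built-in sqlite3 module rather than MySQL (an assumption made so the example is self-contained), and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(col_a TEXT, col_b TEXT, col_c TEXT, col_d TEXT, score INTEGER)"
)
# The six sample rows from the question, with 'stuff' kept as-is.
rows = [
    ("stuff", "stuff", "stuff", None, 5),
    ("stuff", "stuff", "stuff", "title_a", 3),
    ("stuff", "stuff", "stuff", "title_x", 4),
    ("stuff", "stuff", "stuff", None, 5),
    ("stuff", "stuff", "stuff", "title_a", 3),
    ("stuff", "stuff", "stuff", "title_bx", 4),
]
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)", rows)

# The title_a rows are joined to every row with a different, non-NULL title.
# Rows with a NULL col_d drop out of both sides because a comparison
# with NULL is never true.
join_pivot = """
SELECT DISTINCT a.col_a, a.col_b, a.col_c,
    a.col_d AS title1, a.score AS score1,
    na.col_d AS title2, na.score AS score2
FROM yourtable a
LEFT JOIN (
    SELECT n.col_a, n.col_b, n.col_c, n.col_d, n.score
    FROM yourtable n
    WHERE n.col_d <> 'title_a'
) na
  ON a.col_a = na.col_a
 AND a.col_b = na.col_b
 AND a.col_c = na.col_c
WHERE a.col_d = 'title_a'
ORDER BY title2
"""
result = conn.execute(join_pivot).fetchall()
for row in result:
    print(row)
```

The two identical title_a rows produce duplicate join results, which is why the DISTINCT is needed to get back to two output rows.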

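Update: based on your comment that you won't know the values in the col_d column and only need the data split into two pivoted columns, the process gets more complicated because MySQL (before version 8.0) has no windowing functions. If an NTILE function were available, this would be very easy. The NTILE function distributes the rows into a specific number of groups. In this case, your data would be distributed into two groups.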
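To make the grouping concrete, here is a minimal Python sketch of what NTILE does to an ordered set of rows. The helper name ntile and the sample titles are hypothetical; the sketch follows the standard SQL behaviour in which group sizes differ by at most one and the larger groups come first.

```python
def ntile(row_count, groups):
    """Return the 1-based group number for each of row_count ordered rows.

    Mirrors SQL NTILE semantics: group sizes differ by at most one,
    and any extra rows go to the lowest-numbered groups.
    """
    base, extra = divmod(row_count, groups)
    assignments = []
    for group in range(1, groups + 1):
        size = base + (1 if group <= extra else 0)
        assignments.extend([group] * size)
    return assignments


# Hypothetical ordered titles split into two groups (two pivoted pairs).
titles = ["title_a", "title_bx", "title_x", "title_yy", "title_a"]
for title, group in zip(titles, ntile(len(titles), 2)):
    print(group, title)
```

With five rows and two groups, the first three rows land in group 1 and the last two in group 2. Each group then feeds one Title / Score pair of the pivot.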

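I've modified the code from this blog by SO user Quassnoi to replicate the NTILE function using user variables. The variables are used to create two things: a row number (used during the pivot) and the ntile value.

The code would be modified to be: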

select 
  x.col_a,
  x.col_b,
  x.col_c,
  max(case when x.splitgroup = 1 then x.col_d end) as Title1,
  max(case when x.splitgroup = 1 then x.Score end) as Score1,
  max(case when x.splitgroup = 2 then x.col_d end) as Title2,
  max(case when x.splitgroup = 2 then x.Score end) as Score2
from
(
  select src.col_a, src.col_b, src.col_c, src.col_d, src.score,
    src.splitGroup,
    @row:=case when @prev=src.splitGroup then @row else 0 end +1 rownum,
    @prev:=src.splitGroup
  from
  (
    -- mimic NTILE function by splitting the total count of rows
    -- over the number of columns we want (2)
    select d.col_a, d.col_b, d.col_c, d.col_d, d.score, 
      FLOOR((@r * @n) / cnt) + 1 AS splitGroup
    from
    (
      select a.col_a, a.col_b, a.col_c, a.col_d, a.score, grp.cnt
      from yourtable a
      inner join 
      (
        select col_a, col_b, col_c, count(*) as cnt
        from yourtable
        where col_d is not null
        group by col_a, col_b, col_c
      ) grp
        on a.col_a = grp.col_a
        and a.col_b = grp.col_b
        and a.col_c = grp.col_c
      where a.col_d is not null
      order by a.col_a, a.col_b, a.col_c
    ) d
    cross join
    (
      -- @n is equal to the number of new pivoted columns we want
      select @n:=2, @group1:='N', @group2:='N', @group3:='N'
    ) v
    WHERE 
      CASE 
        WHEN @group1 <> col_a AND @group2<> col_b AND @group3 <> col_c 
          THEN @r := -1 
          ELSE 0 END IS NOT NULL
      AND (@r := @r + 1) IS NOT NULL
  ) src
  cross join
  (
    -- these vars are used to get the row number once the data is split
    -- this will be needed for the aggregate/group by on the final select
    select @row:=0, @prev:=1
  ) v2
  order by src.splitGroup
) x
group by x.col_a, x.col_b, x.col_c, x.rowNum;

See SQL Fiddle with Demo. This gives the result:

| COL_A | COL_B | COL_C |   TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|----------|--------|----------|--------|
| stuff | stuff | stuff |  title_a |      3 | title_tt |      1 |
| stuff | stuff | stuff | title_bx |      0 | title_qq |      1 |
| stuff | stuff | stuff |  title_x |      4 |  title_a |      8 |
| stuff | stuff | stuff | title_yy |      3 |  title_h |      4 |
| stuff | stuff | stuff |  title_a |      2 |  title_o |      6 |

Answer 1 (score: 0)

If I understand you correctly, you can do something like this:

SELECT `col_a`, `col_b`, `col_c`,
MAX(CASE WHEN `col_d` IN('title_a','title_x','title_bx') THEN `col_d` end) AS 'Title',
MAX(CASE WHEN `col_d` IN('title_a','title_x','title_bx') THEN `score` end) AS 'Score'
...
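A quick way to see what this query returns is to run it on the question's sample rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, and the FROM and GROUP BY clauses are assumptions, since the answer elides them.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(col_a TEXT, col_b TEXT, col_c TEXT, col_d TEXT, score INTEGER)"
)
rows = [
    ("stuff", "stuff", "stuff", None, 5),
    ("stuff", "stuff", "stuff", "title_a", 3),
    ("stuff", "stuff", "stuff", "title_x", 4),
    ("stuff", "stuff", "stuff", "title_bx", 4),
]
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?, ?, ?)", rows)

# FROM / GROUP BY are assumed; the answer leaves them out.
in_pivot = """
SELECT col_a, col_b, col_c,
    MAX(CASE WHEN col_d IN ('title_a', 'title_x', 'title_bx')
             THEN col_d END) AS title,
    MAX(CASE WHEN col_d IN ('title_a', 'title_x', 'title_bx')
             THEN score END) AS score
FROM yourtable
GROUP BY col_a, col_b, col_c
"""
result = conn.execute(in_pivot).fetchall()
for row in result:
    print(row)
```

Under these assumptions each group collapses to a single Title / Score pair, because each MAX returns one value per group. The two MAX calls are also evaluated independently, so the title and the score are not guaranteed to come from the same source row.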