Dynamically create columns from rows in MySQL

Time: 2014-01-20 19:00:56

Tags: mysql sql pivot crosstab

I have the following tables:

"crawlresults"
id  |   url                 | fk_crawljobs_id
---------------------------------------------
1   |   shop*com/notebooks  |   1
2   |   shop*com/fridges    |   1
3   |   website*com/lists   |   2


"extractions"
id  | fk_extractors_id  | data          |   fk_crawlresults_id
---------------------------------------------------------------
1   |   1               | 123.45        |   1
2   |   2               | notebook      |   1
3   |   3               | ibm.jpg       |   1
4   |   1               | 44.5          |   2
5   |   2               | fridge        |   2
6   |   3               | picture.jpg   |   3
7   |   4               | hello         |   3
8   |   4               | world         |   3
9   |   5               | hi            |   3
10  |   5               | my            |   3
11  |   5               | friend        |   3


"extractors"
id  |   extractorname
----------------------
1   |   price
2   |   article
3   |   imageurl
4   |   list_1
5   |   list_2

I need to construct a select statement that returns one column for each extractor in the extractors table.

Example:

url                 | price     | article   | imageurl
--------------------------------------------------------
shop*com/notebooks  | 123.45    | notebook  | ibm.jpg
shop*com/fridges    | 44.5      | fridge    | NULL

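I don't know how many extractors exist at the time the select statement is executed, so it has to be built dynamically.

Edit: I forgot to mention that my extractions can contain multiple "lists". In that case, I need the following result set.

Example 2: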

url                 | list_1    | imageurl      | list_2
--------------------------------------------------------
website*com/lists   | hello     | picture.jpg   | NULL
website*com/lists   | world     | picture.jpg   | NULL
website*com/lists   | NULL      | picture.jpg   | hi
website*com/lists   | NULL      | picture.jpg   | my
website*com/lists   | NULL      | picture.jpg   | friend

Thanks!

1 answer:

Answer 0 (score: 3)

You are looking for dynamic pivot tables.

Code:

SET @sql = NULL;
SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      'MAX(IF(pa.extractorname = ''',
      extractorname,
      ''', p.data, NULL)) AS ',
      extractorname
    )
  ) INTO @sql
FROM extractors;

SET @sql = CONCAT('SELECT c.url, ', 
  @sql, 
  ' FROM crawlresults c', 
  ' INNER JOIN extractions p on (c.id = p.fk_crawlresults_id)', 
  ' INNER JOIN extractors pa on (p.fk_extractors_id = pa.id)',
  ' WHERE c.fk_crawljobs_id = 1',
  ' GROUP BY c.id');

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
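
One practical caveat with the code above: GROUP_CONCAT truncates its result at group_concat_max_len, which defaults to 1024 bytes, so a long list of extractors can produce a cut-off and therefore invalid statement. A minimal sketch of how to guard against that and inspect the generated text follows; the value 1000000 and the alias generated_statement are illustrative assumptions, not recommendations.

```sql
-- Run once per session, before the GROUP_CONCAT query.
-- 1000000 is an arbitrary illustrative value.
SET SESSION group_concat_max_len = 1000000;

-- Optional: after the second SET @sql and before PREPARE,
-- print the generated statement to check it by eye.
SELECT @sql AS generated_statement;
```

If the printed statement ends in the middle of a MAX(IF(...)) expression, the limit is probably still too low.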


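Basically, your original query generated a bogus @sql variable that did not actually extract the data for each extractorname. You also don't need all of those joins to build @sql. You only need each attribute name (from the extractors table) and a reference to the column that holds the expected value (data).

If you are unsure about the structure, write out a simple pivot query for a fixed set of attributes first. That makes it easy to see the pattern the dynamic query has to generate.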
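
Because the extractor names are pasted straight into the statement text, a name that contains a space, a quote, or a reserved word would break the generated SQL. The following is a sketch of a more defensive way to build the column list, assuming the extractors table from the question and assuming extractor names are unique (so DISTINCT is dropped in favour of a stable ORDER BY). The variable name @cols is only illustrative.

```sql
SET @cols = NULL;

SELECT GROUP_CONCAT(
         CONCAT(
           -- QUOTE() returns the name as an escaped, single-quoted string literal
           'MAX(IF(pa.extractorname = ', QUOTE(extractorname),
           -- backticks make the alias a quoted identifier; inner backticks are doubled
           ', p.data, NULL)) AS `', REPLACE(extractorname, '`', '``'), '`'
         )
         ORDER BY id
       ) INTO @cols
FROM extractors;

-- Each item has the form:
--   MAX(IF(pa.extractorname = 'price', p.data, NULL)) AS `price`
SELECT @cols;
```

The resulting @cols can take the place of the column list in the CONCAT that assembles the final statement.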

SELECT c.url, 
  MAX(IF(pa.extractorname = 'price', p.data, NULL)) AS price,
  MAX(IF(pa.extractorname = 'article', p.data, NULL)) AS article,
  MAX(IF(pa.extractorname = 'imageurl', p.data, NULL)) AS imageurl 
FROM crawlresults c 
  LEFT JOIN extractions p on (c.id = p.fk_crawlresults_id) 
  LEFT JOIN extractors pa on (p.fk_extractors_id = pa.id) 
WHERE c.fk_crawljobs_id = 1
GROUP BY c.id

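As for the rest of your query, it was fine. Just remember that LEFT JOINs can be useful if some crawlresults have no extractions. Also, grouping by url is not a good idea if your table can contain the same crawlresult url under more than one fk_crawljobs_id (the MAX might mix up extractions from different rows).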
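
To illustrate the LEFT JOIN remark, here is a small sketch that assumes the tables and sample data from the question. The extra row (id 4, url shop*com/empty) is hypothetical and is added only to show a crawlresult without extractions.

```sql
-- Hypothetical crawlresult with no matching rows in extractions.
INSERT INTO crawlresults (id, url, fk_crawljobs_id)
VALUES (4, 'shop*com/empty', 1);

-- With LEFT JOINs the new row is kept and its pivoted column is NULL.
-- With INNER JOINs it would be missing from the result.
SELECT c.url,
  MAX(IF(pa.extractorname = 'price', p.data, NULL)) AS price
FROM crawlresults c
  LEFT JOIN extractions p ON (c.id = p.fk_crawlresults_id)
  LEFT JOIN extractors pa ON (p.fk_extractors_id = pa.id)
WHERE c.fk_crawljobs_id = 1
GROUP BY c.id
ORDER BY c.id;

-- Remove the hypothetical row again.
DELETE FROM crawlresults WHERE id = 4;
```

With the sample data this should list shop*com/notebooks with 123.45, shop*com/fridges with 44.5, and shop*com/empty with NULL.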