Question

I have a table like the following:

    Owner   | Attribute | Value
    ---------------------------
    10      | COLOR     | BLUE
    10      | COLOR     | RED
    10      | COLOR     | GREEN
    10      | SIZE      | BIG
    20      | COLOR     | GREEN
    20      | SIZE      | MEDIUM
    20      | MEMORY    | 16G
    20      | MEMORY    | 32G
    30      | COLOR     | RED
    30      | COLOR     | BLUE
    30      | MEMORY    | 64G

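Is there a SQL query that computes every combination of the attributes and labels each combination with a single index (the last column in the result)?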

Owner   | Attribute | Value | Rule_No
10      | COLOR     | BLUE  | 1
10      | SIZE      | BIG   | 1
10      | COLOR     | RED   | 2
10      | SIZE      | BIG   | 2
10      | COLOR     | GREEN | 3
10      | SIZE      | BIG   | 3
20      | COLOR     | GREEN | 1
20      | SIZE      | MEDIUM| 1
20      | MEMORY    | 16G   | 1
20      | COLOR     | GREEN | 2
20      | SIZE      | MEDIUM| 2
20      | MEMORY    | 32G   | 2
30      | COLOR     | BLUE  | 1
30      | MEMORY    | 64G   | 1
30      | COLOR     | RED   | 2
30      | MEMORY    | 64G   | 2

The rule number is unique per owner (rule '1' for owner '10' is unrelated to rule '1' for owner '20').

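I tried SQL cross joins, but the number of attributes is not fixed, so I cannot use that approach (it needs one cross join per attribute). I also want the combinations as new rows, not new columns.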

I am trying to do this with Talend Open Studio - Data Integration, but a SQL-only solution would suit me better.

Answer 1

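Do you really want the data in the form given in the question (which would most likely need further aggregation on Rule_No before it is useful), or are you ultimately trying to pivot it? That is, with the rules joined together so that each attribute becomes its own column, like this: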

+---------+-------+-------+--------+--------+
| Rule_No | Owner | COLOR | SIZE   | MEMORY |
+---------+-------+-------+--------+--------+
|       1 |    10 | BLUE  | BIG    | NULL   |
|       2 |    10 | RED   | BIG    | NULL   |
|       3 |    10 | GREEN | BIG    | NULL   |
|       1 |    20 | GREEN | MEDIUM | 16G    |
|       2 |    20 | GREEN | MEDIUM | 32G    |
|       1 |    30 | RED   | NULL   | 64G    |
|       2 |    30 | BLUE  | NULL   | 64G    |
+---------+-------+-------+--------+--------+

Data can be pivoted this way with the following query:

SELECT   @t:=IF(Owner=@o,@t,0)+1 AS Rule_No,
         @o:=Owner AS Owner,
         `COLOR`,`SIZE`,`MEMORY`
FROM     (SELECT DISTINCT Owner, @t:=0 FROM my_table) t0

  LEFT JOIN (
    SELECT Owner, value AS `COLOR`
    FROM   my_table
    WHERE  Attribute='COLOR'
  ) AS `t_COLOR` USING (Owner)

  LEFT JOIN (
    SELECT Owner, value AS `SIZE`
    FROM   my_table
    WHERE  Attribute='SIZE'
  ) AS `t_SIZE` USING (Owner)

  LEFT JOIN (
    SELECT Owner, value AS `MEMORY`
    FROM   my_table
    WHERE  Attribute='MEMORY'
  ) AS `t_MEMORY` USING (Owner)

ORDER BY Owner, Rule_No

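Because the list of attributes is dynamic, a query can be used to construct the SQL above, from which a statement is then prepared and executed: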

SELECT CONCAT('
         SELECT   @t:=IF(Owner=@o,@t,0)+1 AS Rule_No,
                  @o:=Owner AS Owner,
                  ', GROUP_CONCAT(DISTINCT CONCAT(
                    '`',REPLACE(Attribute,'`','``'),'`'
                  )), '
         FROM     (SELECT DISTINCT Owner, @t:=0 FROM my_table) t0
       ', GROUP_CONCAT(DISTINCT CONCAT('
           LEFT JOIN (
             SELECT Owner, value AS `',REPLACE(Attribute,'`','``'),'`
             FROM   my_table
             WHERE  Attribute=',QUOTE(Attribute),'
           ) AS `t_',REPLACE(Attribute,'`','``'),'` USING (Owner)
         ') SEPARATOR ''), '
         ORDER BY Owner, Rule_No
       ') INTO @sql
FROM   my_table;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

See it on sqlfiddle.

Answer 2

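OK, first, before I write anything else: this can be done in SQL with a single select, but I would not recommend it. It may work for this small sample table, but it is not a realistic solution for large tables, and the problem can be solved in a better (faster, cleaner) way with a stored procedure.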

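Also, I have not fully finished it, because it is now 2:10 AM and I have already spent a few hours on it. I did not think it would be much of a challenge, but the parts that remain are just copy-paste SQL rewrites based on the existing queries.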

I posted my thought process, including sample data, on pastebin.

The basic process is:

Calculate the possible permutations for the owner (N)
Construct a SQL query that spans 1 .. (N * number_of_attributes)
1. Based on N
2. Based on N

This algorithm is a general solution for any number of attributes or values.
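The pastebin contents are not reproduced here, so what follows is only one possible reading of the process above, not the answerer's actual query. It is a minimal sketch that assumes SQLite (run through Python's standard sqlite3 module so that it is self-contained), borrows the table name my_table from Answer 1, and assumes each (owner, attribute, value) row is unique. The CTE names attrs, strides, rules and vals are made up for illustration. For each owner it computes N as the product of the per-attribute value counts, generates rule numbers 1..N, and decodes each rule number as a mixed-radix number with one digit per attribute.

```python
import sqlite3

# Sample data from the question.
ROWS = [
    (10, "COLOR", "BLUE"), (10, "COLOR", "RED"), (10, "COLOR", "GREEN"),
    (10, "SIZE", "BIG"),
    (20, "COLOR", "GREEN"), (20, "SIZE", "MEDIUM"),
    (20, "MEMORY", "16G"), (20, "MEMORY", "32G"),
    (30, "COLOR", "RED"), (30, "COLOR", "BLUE"), (30, "MEMORY", "64G"),
]

QUERY = """
WITH RECURSIVE
-- Per owner: rank the attributes alphabetically (1-based), count their values.
attrs AS (
    SELECT a.owner, a.attribute, COUNT(*) AS cnt,
           (SELECT COUNT(DISTINCT b.attribute) FROM my_table b
             WHERE b.owner = a.owner AND b.attribute <= a.attribute) AS pos
    FROM my_table a
    GROUP BY a.owner, a.attribute
),
-- stride = product of the value counts of all lower-ranked attributes.
strides AS (
    SELECT owner, attribute, cnt, pos, 1 AS stride FROM attrs WHERE pos = 1
    UNION ALL
    SELECT nxt.owner, nxt.attribute, nxt.cnt, nxt.pos, s.stride * s.cnt
    FROM strides s
    JOIN attrs nxt ON nxt.owner = s.owner AND nxt.pos = s.pos + 1
),
-- N per owner is the largest stride * cnt; emit rule numbers 1..N.
rules AS (
    SELECT owner, 1 AS rule_no, MAX(stride * cnt) AS total
    FROM strides GROUP BY owner
    UNION ALL
    SELECT owner, rule_no + 1, total FROM rules WHERE rule_no < total
),
-- Rank each value alphabetically (0-based) within its owner and attribute.
vals AS (
    SELECT a.owner, a.attribute, a.value,
           (SELECT COUNT(*) FROM my_table b
             WHERE b.owner = a.owner AND b.attribute = a.attribute
               AND b.value < a.value) AS idx
    FROM my_table a
)
-- Decode (rule_no - 1): the digit for an attribute selects one of its values.
SELECT r.owner, v.attribute, v.value, r.rule_no
FROM rules r
JOIN strides s ON s.owner = r.owner
JOIN vals v ON v.owner = s.owner AND v.attribute = s.attribute
           AND v.idx = ((r.rule_no - 1) / s.stride) % s.cnt
ORDER BY r.owner, r.rule_no, s.pos
"""


def combinations(rows):
    """Load rows into an in-memory SQLite table and return the rule rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (owner INTEGER, attribute TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in combinations(ROWS):
        print(row)
```

Because values are ranked alphabetically, the rule numbers can differ from those in the question (for owner 10, rule 2 is GREEN rather than RED), but each combination still gets exactly one rule number per owner. The query relies on SQLite's integer division; other dialects would need adjusting (MySQL, for example, needs DIV instead of / for integer division).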

Answer 3

Here is fthiella's answer adapted for SQL Server (NOT FINAL):

If  Object_ID('tempdb..#test') Is Not Null Drop Table #test;

Select '10' As Owner,'COLOR' Attribute,'BLUE' Value Into #test
Union
Select '10','COLOR','RED'
Union
Select '10','COLOR','GREEN'
Union
Select '10','SIZE','BIG'
Union
Select '20','a','1'
Union
Select '20','a','2'
Union
Select '20','b','111'
Union
Select '20','b','222'
Union
Select '20','COLOR','GREEN'
Union
Select '20','SIZE','MEDIUM'
Union
Select '20','MEMORY','16G'
Union
Select '20','MEMORY','32G'
Union
Select '30','COLOR','RED'
Union
Select '30','COLOR','BLUE'
Union
Select '30','MEMORY','64G';



Select 
    Owner, Attribute, Value,
    RuleNo = Row_Number() Over (Partition By Owner, Attribute Order By Owner, Attribute)
From
    (Select Base.Owner, Base.Attribute, Base.Value
    From
        #Test As Base
        Inner Join
            (Select Owner, Attribute
             From #Test
             Group By Owner, Attribute
             Having Count(*) > 1) As MultipleValue
        On Base.Owner = MultipleValue.Owner
        And Base.Attribute = MultipleValue.Attribute
        Union All
        Select Sing.Owner, Sing.Attribute, Sing.Value
        From
            (Select Owner, Attribute, Value = Min(Value)
            From #Test
            Group by Owner, Attribute
            Having Count(*) = 1) As Sing
        Inner Join
            (Select Owner, Attribute
            From #Test
            Group by Owner, Attribute
            Having Count(*) > 1) As Mult
            On Sing.Owner = Mult.Owner
        Inner Join #Test As Comp
        On Mult.Owner = Comp.Owner And Mult.Attribute = Comp.Attribute) As Vals
Order By 
    Owner, RuleNo, Attribute, Value

Answer 4

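I gave it a shot (and spent far too much time on it). I thought I had a solution: it produces the expected result for the given data (not exactly, but acceptably, I think). Unfortunately, it does not hold up when more data is added.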

Maybe someone else can find a working solution based on this.

SELECT DISTINCT a.`owner`, a.`attribute`, a.`value`, a.`index` * b.`index` AS `Rule_No`
FROM (
  SELECT `owner`, `attribute`, `value`,  
    IF(
      `owner` = @_owner AND `attribute` = @_attribute,
      @_row := @_row + 1,
      @_row := 1 AND (@_owner := `owner`) AND (@_attribute := `attribute`)
    ) + 1 AS `index`
  FROM `attributes`, (SELECT @_owner := '', @_attribute := '', @_row := 0) x
  ORDER BY `owner`, `attribute`
  ) a
INNER JOIN (
  SELECT `owner`, `attribute`, `value`,  
    IF(
      `owner` = @_owner AND `attribute` = @_attribute,
      @_row := @_row + 1,
      @_row := 1 AND (@_owner := `owner`) AND (@_attribute := `attribute`)
    ) + 1 AS `index`
  FROM `attributes`, (SELECT @_owner := '', @_attribute := '', @_row := 0) x
  ORDER BY `owner`, `attribute`
  ) b
ON a.`owner` = b.`owner` AND a.`attribute` <> b.`attribute`
ORDER BY `owner`, `Rule_No`, `attribute`, `value`

SQLFiddle - Working

SQLFiddle - Broken (More Data Added)

Answer 5

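Although this is far from complete, it is the best I could do. Maybe it will give someone else an idea? For this particular data set it returns the correct number of rows, but in the wrong order.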

select a.owner, a.attribute, a.value
from test1 a
    join (
        select owner, attribute, count(distinct attribute, value) - 1 as total
        from test1
        group by owner, attribute
    ) b
        on a.owner = b.owner
            and a.attribute = b.attribute
    join (
        select owner, max(total) as total from (
            select owner, attribute, count(distinct attribute, value) as total
            from test1
            group by owner, attribute
        ) t group by owner
    ) c
        on a.owner = c.owner
    join (
        select @rownum:=@rownum+1 as num
        from test1,
            (select @rownum:=0 from dual) r
    ) temp
        on num <= c.total - b.total
order by a.owner asc
;

SQL combinations from a single table into a single table
