I have a table like this:
Owner | Attribute | value
----------------------------------------------------
10 | COLOR | BLUE
10 | COLOR | RED
10 | COLOR | GREEN
10 | SIZE | BIG
20 | COLOR | GREEN
20 | SIZE | MEDIUM
20 | MEMORY | 16G
20 | MEMORY | 32G
30 | COLOR | RED
30 | COLOR | BLUE
30 | MEMORY | 64G
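Is there an SQL query that computes every combination of attribute values per owner and tags each combination with a single index (the last column in the result below)?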
Owner | Attribute | Value | Rule_No
10 | COLOR | BLUE | 1
10 | SIZE | BIG | 1
10 | COLOR | RED | 2
10 | SIZE | BIG | 2
10 | COLOR | GREEN | 3
10 | SIZE | BIG | 3
20 | COLOR | GREEN | 1
20 | SIZE | MEDIUM | 1
20 | MEMORY | 16G | 1
20 | COLOR | GREEN | 2
20 | SIZE | MEDIUM | 2
20 | MEMORY | 32G | 2
30 | COLOR | BLUE | 1
30 | MEMORY | 64G | 1
30 | COLOR | RED | 2
30 | MEMORY | 64G | 2
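The rule number is unique per owner (rule '1' of owner '10' has nothing to do with rule '1' of owner '20').
I tried SQL cross joins, but the number of attributes is not fixed, so that doesn't work (it would take one cross join per attribute), and I want the combinations as new rows, not new columns.
I'm trying to do this with Talend Open Studio - Data Integration, but an SQL-only solution would suit me better.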
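To make the target concrete: for each owner, the result is the Cartesian product of that owner's values per attribute, unrolled into one row per attribute. Below is a minimal Python sketch of that logic (it is not the SQL being asked for). `ROWS` and `expand_rules` are names made up for illustration, and rule numbers simply follow input order, so they are unique per owner but need not match the numbering in the table above.

```python
from itertools import product

# Sample rows from the table above: (Owner, Attribute, Value).
ROWS = [
    (10, "COLOR", "BLUE"), (10, "COLOR", "RED"), (10, "COLOR", "GREEN"),
    (10, "SIZE", "BIG"),
    (20, "COLOR", "GREEN"), (20, "SIZE", "MEDIUM"),
    (20, "MEMORY", "16G"), (20, "MEMORY", "32G"),
    (30, "COLOR", "RED"), (30, "COLOR", "BLUE"), (30, "MEMORY", "64G"),
]


def expand_rules(rows):
    """Return (owner, attribute, value, rule_no) for every combination per owner."""
    # Group values as owner -> attribute -> [values], keeping input order.
    by_owner = {}
    for owner, attribute, value in rows:
        by_owner.setdefault(owner, {}).setdefault(attribute, []).append(value)

    result = []
    for owner, attrs in by_owner.items():
        names = list(attrs)
        # One combination (rule) per element of the Cartesian product.
        combos = product(*(attrs[name] for name in names))
        for rule_no, combo in enumerate(combos, start=1):
            # Unroll the combination into one row per attribute.
            for name, value in zip(names, combo):
                result.append((owner, name, value, rule_no))
    return result


RESULT = expand_rules(ROWS)
```

With the sample data this gives 16 rows: three rules for owner 10 and two each for owners 20 and 30.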
Answer 0 (score: 6)
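Do you really want the data in the form given in your question (which would most likely need further aggregation over Rule_No before it is usable), or are you ultimately trying to pivot it? That is, with each rule joined into a single row and each attribute becoming its own column, like this: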
+---------+-------+-------+--------+--------+
| Rule_No | Owner | COLOR | SIZE   | MEMORY |
+---------+-------+-------+--------+--------+
| 1       | 10    | BLUE  | BIG    | NULL   |
| 2       | 10    | RED   | BIG    | NULL   |
| 3       | 10    | GREEN | BIG    | NULL   |
| 1       | 20    | GREEN | MEDIUM | 16G    |
| 2       | 20    | GREEN | MEDIUM | 32G    |
| 1       | 30    | RED   | NULL   | 64G    |
| 2       | 30    | BLUE  | NULL   | 64G    |
+---------+-------+-------+--------+--------+
Data can be pivoted into that shape with a query like the following:
-- @t restarts at 1 whenever Owner changes; @o remembers the previous row's Owner.
SELECT   @t:=IF(Owner=@o,@t,0)+1 AS Rule_No,
         @o:=Owner AS Owner,
         `COLOR`,`SIZE`,`MEMORY`
FROM     (SELECT DISTINCT Owner, @t:=0 FROM my_table) t0
  LEFT JOIN (
    SELECT Owner, value AS `COLOR`
    FROM   my_table
    WHERE  Attribute='COLOR'
  ) AS `t_COLOR` USING (Owner)
  LEFT JOIN (
    SELECT Owner, value AS `SIZE`
    FROM   my_table
    WHERE  Attribute='SIZE'
  ) AS `t_SIZE` USING (Owner)
  LEFT JOIN (
    SELECT Owner, value AS `MEMORY`
    FROM   my_table
    WHERE  Attribute='MEMORY'
  ) AS `t_MEMORY` USING (Owner)
ORDER BY Owner, Rule_No
Since the list of attributes is dynamic, you can use a query to construct the above SQL, then prepare and execute a statement from it:
SELECT CONCAT('
   SELECT   @t:=IF(Owner=@o,@t,0)+1 AS Rule_No,
            @o:=Owner AS Owner,
            ', GROUP_CONCAT(DISTINCT CONCAT(
              '`',REPLACE(Attribute,'`','``'),'`'
            )), '
   FROM     (SELECT DISTINCT Owner, @t:=0 FROM my_table) t0
 ', GROUP_CONCAT(DISTINCT CONCAT('
     LEFT JOIN (
       SELECT Owner, value AS `',REPLACE(Attribute,'`','``'),'`
       FROM   my_table
       WHERE  Attribute=',QUOTE(Attribute),'
     ) AS `t_',REPLACE(Attribute,'`','``'),'` USING (Owner)
   ') SEPARATOR ''), '
   ORDER BY Owner, Rule_No
 ') INTO @sql
FROM   my_table;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
See it on sqlfiddle.
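As a rough cross-check of the dynamic approach outside MySQL, here is a sketch in Python using the standard-library `sqlite3` module. It is not the query above: it assumes SQLite 3.25 or later, swaps the user-variable counter for `ROW_NUMBER()`, and binds attribute names as parameters instead of using `QUOTE()`. `ROWS`, `quote_ident` and `pivot_rules` are hypothetical names, and the sketch assumes the table contains at least one attribute.

```python
import sqlite3

# Sample rows from the question: (Owner, Attribute, value).
ROWS = [
    (10, "COLOR", "BLUE"), (10, "COLOR", "RED"), (10, "COLOR", "GREEN"),
    (10, "SIZE", "BIG"),
    (20, "COLOR", "GREEN"), (20, "SIZE", "MEDIUM"),
    (20, "MEMORY", "16G"), (20, "MEMORY", "32G"),
    (30, "COLOR", "RED"), (30, "COLOR", "BLUE"), (30, "MEMORY", "64G"),
]


def quote_ident(name):
    """Quote an SQL identifier for SQLite, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def pivot_rules(rows):
    """Build one LEFT JOIN per distinct attribute and return (attributes, result rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (Owner INTEGER, Attribute TEXT, value TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

    # The attribute list drives both the column list and the joins.
    attrs = [r[0] for r in conn.execute(
        "SELECT DISTINCT Attribute FROM my_table ORDER BY Attribute")]
    cols = ", ".join(quote_ident(a) for a in attrs)
    joins = "\n".join(
        "LEFT JOIN (SELECT Owner, value AS {col} FROM my_table WHERE Attribute = ?) "
        "AS {alias} USING (Owner)".format(
            col=quote_ident(a), alias=quote_ident("t_" + a))
        for a in attrs)
    sql = (
        "SELECT ROW_NUMBER() OVER (PARTITION BY t0.Owner ORDER BY {cols}) AS Rule_No,\n"
        "       t0.Owner AS Owner, {cols}\n"
        "FROM (SELECT DISTINCT Owner FROM my_table) AS t0\n"
        "{joins}\n"
        "ORDER BY Owner, Rule_No"
    ).format(cols=cols, joins=joins)

    # One bound parameter per join, in the same order as the joins.
    result = conn.execute(sql, attrs).fetchall()
    conn.close()
    return attrs, result


ATTRS, PIVOT = pivot_rules(ROWS)
```

Here the rule numbers follow the sort order of the attribute values rather than insertion order, so they are unique per owner but may be assigned to different combinations than in the table above.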
Answer 1 (score: 2)
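OK, first, before I write anything else: yes, this can be done in a single SQL select, but I wouldn't recommend it. It may work for this small sample table, but it is not a realistic solution for large tables, where it can be solved in a much better (faster, cleaner) way with a stored procedure.
Also, I didn't finish it completely, because it is now 2:10 AM and I've already been at it for a couple of hours. It was too much of a challenge not to take on, but the remaining parts are just copy-paste SQL rewriting based on the existing query.
I've posted the thought process, with sample data, on pastebin. The basic process is:
This algorithm is a general solution for any number of attributes or values.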
Answer 2 (score: 0)
Here is fthiella's answer adapted for SQL Server (NOT FINAL):
If Object_ID('tempdb..#test') Is Not Null Drop Table #test;
Select '10' As Owner,'COLOR' Attribute,'BLUE' Value Into #test
Union
Select '10','COLOR','RED'
Union
Select '10','COLOR','GREEN'
Union
Select '10','SIZE','BIG'
Union
Select '20','a','1'
Union
Select '20','a','2'
Union
Select '20','b','111'
Union
Select '20','b','222'
Union
Select '20','COLOR','GREEN'
Union
Select '20','SIZE','MEDIUM'
Union
Select '20','MEMORY','16G'
Union
Select '20','MEMORY','32G'
Union
Select '30','COLOR','RED'
Union
Select '30','COLOR','BLUE'
Union
Select '30','MEMORY','64G';
Select
    Owner, Attribute, Value,
    RuleNo = Row_Number() Over (Partition By Owner, Attribute Order By Owner, Attribute)
From
    (Select Base.Owner, Base.Attribute, Base.Value
     From
        #Test As Base
        Inner Join
            (Select Owner, Attribute
             From #Test
             Group By Owner, Attribute
             Having Count(*) > 1) As MultipleValue
        On Base.Owner = MultipleValue.Owner
        And Base.Attribute = MultipleValue.Attribute
     Union All
     Select Sing.Owner, Sing.Attribute, Sing.Value
     From
        (Select Owner, Attribute, Value = Min(Value)
         From #Test
         Group by Owner, Attribute
         Having Count(*) = 1) As Sing
        Inner Join
            (Select Owner, Attribute
             From #Test
             Group by Owner, Attribute
             Having Count(*) > 1) As Mult
        On Sing.Owner = Mult.Owner
        Inner Join #Test As Comp
        On Mult.Owner = Comp.Owner And Mult.Attribute = Comp.Attribute) As Vals
Order By
    Owner, RuleNo, Attribute, Value
Answer 3 (score: 0)
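I gave this a shot (and spent far too much time on it). I thought I had a solution: it produces the expected result for the given data (not exactly, but acceptably, I think). Unfortunately, it does not hold up when more data is added.
Maybe someone else can find a working solution based on this.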
SELECT DISTINCT a.`owner`, a.`attribute`, a.`value`, a.`index` * b.`index` AS `Rule_No`
FROM (
    SELECT `owner`, `attribute`, `value`,
        IF(
            `owner` = @_owner AND `attribute` = @_attribute,
            @_row := @_row + 1,
            @_row := 1 AND (@_owner := `owner`) AND (@_attribute := `attribute`)
        ) + 1 AS `index`
    FROM `attributes`, (SELECT @_owner := '', @_attribute := '', @_row := 0) x
    ORDER BY `owner`, `attribute`
) a
INNER JOIN (
    SELECT `owner`, `attribute`, `value`,
        IF(
            `owner` = @_owner AND `attribute` = @_attribute,
            @_row := @_row + 1,
            @_row := 1 AND (@_owner := `owner`) AND (@_attribute := `attribute`)
        ) + 1 AS `index`
    FROM `attributes`, (SELECT @_owner := '', @_attribute := '', @_row := 0) x
    ORDER BY `owner`, `attribute`
) b
ON a.`owner` = b.`owner` AND a.`attribute` <> b.`attribute`
ORDER BY `owner`, `Rule_No`, `attribute`, `value`
Answer 4 (score: 0)
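This is far from complete, but it is the best I could do. Maybe it will give someone else an idea? For this particular dataset it returns the correct number of rows, but in the wrong order.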
select a.owner, a.attribute, a.value
from test1 a
join (
    select owner, attribute, count(distinct attribute, value) - 1 as total
    from test1
    group by owner, attribute
) b
    on a.owner = b.owner
    and a.attribute = b.attribute
join (
    select owner, max(total) as total from (
        select owner, attribute, count(distinct attribute, value) as total
        from test1
        group by owner, attribute
    ) t group by owner
) c
    on a.owner = c.owner
join (
    select @rownum:=@rownum+1 as num
    from test1,
        (select @rownum:=0 from dual) r
) temp
    on num <= c.total - b.total
order by a.owner asc
;