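I want to return all possible combinations of the values in a column, using two new columns. For example, my column consists of the values (A, B, C, D). The possible combinations of these values are (A,B), (A,C), (A,D), (B,C), (B,D), (C,D), (A,B,C), (B,D,C), (D,C,A), (C,A,B). [Note: I do not want to consider (1) the combinations with only one value, (2) the combination with all values, and (3) the combination with no values. So for n distinct values I have 2^n - n - 1 - 1 combinations.] I want to list all of these combinations in two columns, as shown below.
Suppose I start with this column: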
Col0
----
A
B
C
D
From Col0 I want to produce the 10 combinations, using two columns:
Col1 Col2
---- ----
1 A
1 B
2 A
2 C
3 A
3 D
4 B
4 C
5 B
5 D
6 C
6 D
7 A
7 B
7 C
8 B
8 C
8 D
9 C
9 D
9 A
10 D
10 A
10 B
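As a sanity check on the count above, the expected groups can also be enumerated outside SQL. This is a minimal Python sketch, not part of the SQLite setup in the question, and the function name `wanted_combinations` is made up for illustration. It keeps every subset with at least two and fewer than n values, which for n = 4 gives 2^4 - 4 - 1 - 1 = 10 groups.

```python
from itertools import combinations

def wanted_combinations(values):
    """Return all subsets with 2 .. n-1 elements (hypothetical helper)."""
    n = len(values)
    result = []
    for size in range(2, n):  # skip sizes 0, 1 and n
        result.extend(combinations(values, size))
    return result

combos = wanted_combinations(["A", "B", "C", "D"])
# Emit (group number, value) rows, like the Col1/Col2 layout above
rows = [(group, value)
        for group, combo in enumerate(combos, start=1)
        for value in combo]
print(len(combos), len(rows))  # 10 24
```

The order of the values inside each group may differ from the listing above, but the groups themselves are the same.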
How can I do this in SQL? I am using SQLite.
Thank you very much!
Answer 0 (score: 2)
I have a solution, but it requires two changes...
The input needs an id column (starting from 1):
id | datum
----+-------
1 | A
2 | B
3 | C
4 | D
The output id that I calculate is a valid identifier for each permutation, but I don't output the permutations that you're not interested in:
group_id | datum
----------+-------
6 | A
6 | B
7 | A
7 | C
8 | A
8 | D
12 | B
12 | C
13 | B
13 | D
18 | C
18 | D
32 | A
32 | B
32 | C
33 | A
33 | B
33 | D
38 | A
38 | C
38 | D
63 | B
63 | C
63 | D
http://dbfiddle.uk/?rdbms=sqlite_3.8&fiddle=87d670ecaba8b735cb3f95fa66cea96b
http://dbfiddle.uk/?rdbms=sqlite_3.8&fiddle=26e4f59874009ef95367d85565563c3c
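WITH
  cascade AS
  (
    -- Anchor: every single item is a root node (depth 1)
    SELECT
      1      AS depth,
      NULL   AS parent_id,
      id,
      datum,
      id     AS datum_id
    FROM
      sample

    UNION ALL

    -- Recursive step: extend a node only with items of a higher id,
    -- so the same set is never generated in two different orders
    SELECT
      parent.depth + 1,
      parent.id,
      parent.id * (SELECT MAX(id)+1 FROM sample) + child.id - 1,
      child.datum,
      child.id
    FROM
      cascade   AS parent
    INNER JOIN
      sample    AS child
        ON child.id > parent.datum_id
  ),
  travelled AS
  (
    -- Anchor: start a walk at every node, except the single-item
    -- and all-items combinations
    SELECT
      depth     AS depth,
      parent_id AS parent_id,
      id        AS group_id,
      datum     AS datum,
      datum_id  AS datum_id
    FROM
      cascade
    WHERE
      depth NOT IN (1, (SELECT COUNT(*) FROM sample))

    UNION ALL

    -- Recursive step: walk up to the parent, keeping the group_id
    -- of the node where the walk started
    SELECT
      parent.depth,
      parent.parent_id,
      child.group_id,
      parent.datum,
      parent.datum_id
    FROM
      travelled AS child
    INNER JOIN
      cascade   AS parent
        ON parent.id = child.parent_id
  )
SELECT
  group_id,
  datum
FROM
  travelled
ORDER BY
  group_id,
  datum_id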
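The first CTE walks all the available combinations (recursively), creating a directed graph. At this stage I don't exclude the combinations of one item or of all items, but I do exclude equivalent permutations.
Each node also has a unique identifier calculated for it. There are gaps in these ids, because the calculation would also work for all permutations, even though not all of them are included.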
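The identifier arithmetic can be checked by hand. The sketch below is plain Python rather than SQL, and the names `ids`, `BASE` and `node_id` are illustrative only. It applies the same formula as the recursive step, parent id * (MAX(id) + 1) + child id - 1, to the four sample rows.

```python
from itertools import combinations

ids = {"A": 1, "B": 2, "C": 3, "D": 4}   # the id column of the sample table
BASE = max(ids.values()) + 1             # MAX(id) + 1, here 5

def node_id(combo):
    """Fold the formula over a combination given in ascending id order."""
    result = ids[combo[0]]               # depth-1 nodes keep their own id
    for item in combo[1:]:
        result = result * BASE + ids[item] - 1
    return result

group_ids = [node_id(c) for size in (2, 3) for c in combinations("ABCD", size)]
print(group_ids)  # [6, 7, 8, 12, 13, 18, 32, 33, 38, 63]
```

These values match the group_id column in the output listed earlier, and the jumps between them (for example from 18 to 32) are the gaps mentioned above.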
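Taking any node in that graph and walking up to its final parent node (again recursively) will always give a different combination than starting from any other node in the graph.
So the second CTE does all of those walks, excluding the "only one item" and "all items" combinations.
The final select just outputs the results in order.
The gaps in the ids are probably avoidable, but the maths is too hard for my head at the end of a working day.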
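Answer 1 (score: 1)
If window functions and CTEs are available, you can use the following approach (as written, it produces the two-element combinations):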
with data_rn as
(
    select d1.col0 col1,
           d2.col0 col2,
           row_number() over (order by d1.col0) rn
    from data d1
    inner join data d2 on d1.col0 > d2.col0
)
select rn, col1 from data_rn
union all
select rn, col2 from data_rn
order by rn
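Answer 2 (score: 1)
The idea is to enumerate the power set: assign each value a power of 2, then iterate from 1 to 2^n - 1 and keep, for each iteration, the elements whose corresponding bit is set.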
-- map each value to a power of 2: 1, 2, 4, 8, 16
with recursive ELEMENTS(IDX, POW, VAL) as (
    -- init with dummy values
    values(-1, 0.5, null)
    union all
    select IDX + 1,
           POW * 2,
           -- index the ordered values from 0 to N - 1
           ( select COL0
             from DATA d1
             where (select count(*) from DATA d2 where d2.COL0 < d1.COL0) = IDX + 1 )
    from ELEMENTS
    where IDX + 1 < (select count(*) from DATA)
), POWER_SETS(ITER, VAL, POW) as (
    select 1, VAL, POW from ELEMENTS where VAL is not null
    union all
    select ITER + 1, VAL, POW
    from POWER_SETS
    where ITER < (select SUM(POW) from ELEMENTS)
)
select ITER, VAL from POWER_SETS
-- only if the value's bit is set
where ITER & POW != 0
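The bit-mask idea behind this query can be sketched outside SQL as well. The following Python is only an illustration under the same assumptions (the values are sorted, and value number i gets the power 2^i); the function name `power_set_rows` is hypothetical. Like the query above, it does not filter out the singletons or the full set.

```python
def power_set_rows(values):
    """Return (mask, value) rows for every non-empty subset, using bit masks."""
    ordered = sorted(values)
    rows = []
    for mask in range(1, 2 ** len(ordered)):      # 1 .. 2^n - 1
        for idx, value in enumerate(ordered):
            if mask & (1 << idx):                 # the value's bit is set
                rows.append((mask, value))
    return rows

rows = power_set_rows(["A", "B", "C", "D"])
print(rows[:4])  # [(1, 'A'), (2, 'B'), (3, 'A'), (3, 'B')]
```

Each mask plays the role of ITER in the query: mask 3 is binary 0011, so it selects the first two values.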
Edit: second version, with help from MatBailie. Only one CTE is recursive, and the singleton subsets are excluded.
WITH RECURSIVE
    -- number the values
    elements(val, idx) AS (
        SELECT d1.col0, (SELECT COUNT(*) FROM DATA d2 WHERE d2.COL0 < d1.COL0)
        FROM DATA d1
    ),
    -- iterate from 3 (1 and 2 are singletons)
    -- to 2^n - 2 (2^n - 1 is the subset containing all the elements)
    subsets(iter) AS (
        VALUES(3)
        UNION ALL
        SELECT iter + 1
        FROM subsets
        WHERE iter < (1 << (SELECT COUNT(*) FROM elements)) - 2
    )
SELECT iter AS Col1, val AS Col2
FROM elements
CROSS JOIN subsets
-- the element is present in this subset (the bit is set)
WHERE iter & (1 << idx) != 0
-- exclude singletons (another idea from MatBailie)
  AND iter != (iter & -iter)
ORDER BY iter, val
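The singleton filter relies on a two's-complement trick: `iter & -iter` isolates the lowest set bit, so it equals `iter` exactly when only one bit is set. Here is a small Python check of that identity (Python integers behave like two's complement under `&`; the helper name `is_singleton` is made up for this sketch):

```python
def is_singleton(mask):
    """True when mask has exactly one bit set, i.e. it encodes a one-element subset."""
    return mask == (mask & -mask)

singletons = [m for m in range(1, 16) if is_singleton(m)]
print(singletons)  # [1, 2, 4, 8]
```

For four values, these are the masks that the `iter != (iter & -iter)` condition removes.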