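SQL - Return all possible combinations of the values in a column, using two new columns

Asked: 2018-02-14 14:09:57

Tags: sql sqlite combinations

I want to return all possible combinations of the values in a column, using two new columns. For example, my column consists of the values (A, B, C, D). The possible combinations of these values are (A,B), (A,C), (A,D), (B,C), (B,D), (C,D), (A,B,C), (B,D,C), (D,C,A), (D,A,B). [Note: I do not want to consider (1) combinations with only one value, (2) the combination with all values, or (3) the combination with no values. So for n distinct values I have 2^n - n - 1 - 1 combinations.] I want to list all of these combinations in two columns, as shown below.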

Suppose I start with this column:

Col0
----
A
B
C
D

From Col0 I want to produce the 10 combinations, using two columns:

Col1 Col2
---- ----
1    A
1    B
2    A
2    C
3    A
3    D
4    B
4    C
5    B
5    D
6    C
6    D
7    A
7    B
7    C
8    B
8    C
8    D
9    C
9    D
9    A
10   D
10   A
10   B

How can I do this in SQL? I am using SQLite.

Thank you very much!

3 answers:

Answer 0 (score: 2)

I have a solution, but it requires two changes...

  1. Every item must be given an id (starting from 1)
  2. The output ids may not be sequential

     id | datum
    ----+-------
      1 |   A
      2 |   B
      3 |   C
      4 |   D
    

(The output ids I calculate are valid identifiers for every permutation, but I do not output the permutations you are not interested in.)

     group_id | datum
    ----------+-------
        6     |   A
        6     |   B
    
        7     |   A
        7     |   C
    
        8     |   A
        8     |   D
    
        12    |   B
        12    |   C
    
        13    |   B
        13    |   D
    
        18    |   C
        18    |   D
    
        32    |   A
        32    |   B
        32    |   C
    
        33    |   A
        33    |   B
        33    |   D
    
        38    |   A
        38    |   C
        38    |   D
    
        63    |   B
        63    |   C
        63    |   D
    


    http://dbfiddle.uk/?rdbms=sqlite_3.8&fiddle=87d670ecaba8b735cb3f95fa66cea96b

    http://dbfiddle.uk/?rdbms=sqlite_3.8&fiddle=26e4f59874009ef95367d85565563c3c

    WITH
      cascade AS
    (
      SELECT
        1          AS depth,
        NULL       AS parent_id,
        id,
        datum,
        id         AS datum_id
      FROM
        sample
    
      UNION ALL
    
      SELECT
        parent.depth + 1,
        parent.id,
        parent.id * (SELECT MAX(id)+1 FROM sample) + child.id - 1,
        child.datum,
        child.id
      FROM
        cascade  AS parent
      INNER JOIN
        sample   AS child
          ON child.id > parent.datum_id
    ),
      travelled AS
    (
        SELECT
          depth       AS depth,
          parent_id   AS parent_id,
          id          AS group_id,
          datum       AS datum,
          datum_id    AS datum_id
        FROM
          cascade
        WHERE
           depth NOT IN (1, (SELECT COUNT(*) FROM sample))
    
        UNION ALL
    
        SELECT
          parent.depth,
          parent.parent_id,
          child.group_id,
          parent.datum,
          parent.datum_id
        FROM
          travelled   AS child
        INNER JOIN
          cascade     AS parent
            ON parent.id = child.parent_id
    )
    SELECT
      group_id,
      datum
    FROM
      travelled
    ORDER BY
      group_id,
      datum_id
    

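The first CTE walks through all available combinations (recursively), building a directed graph. At this stage I do not exclude the combinations of one item or of all items, but I do exclude equivalent permutations (the same items in a different order).

Each node also has a unique identifier calculated for it. There are gaps in these ids because the calculation would also work for all permutations, even though not all of them are included.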
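
The id arithmetic from the first CTE, `parent.id * (MAX(id)+1) + child.id - 1`, can be reproduced outside SQL to see where the gaps come from. This is only a sketch, assuming the items carry ids 1..n as in the sample table; the function name `group_ids` is hypothetical.

```python
def group_ids(n):
    """Map each node id to the increasing chain of item ids it stands for."""
    base = n + 1  # plays the role of (SELECT MAX(id)+1 FROM sample)
    nodes = {i: (i,) for i in range(1, n + 1)}  # depth 1: the items themselves
    frontier = dict(nodes)
    while frontier:
        next_level = {}
        for parent_id, items in frontier.items():
            # like the join condition child.id > parent.datum_id
            for child in range(items[-1] + 1, n + 1):
                next_level[parent_id * base + child - 1] = items + (child,)
        nodes.update(next_level)
        frontier = next_level
    return nodes

nodes = group_ids(4)
# Keep only the depths the query keeps: not 1 and not n.
kept = sorted(i for i, items in nodes.items() if 1 < len(items) < 4)
print(kept)  # [6, 7, 8, 12, 13, 18, 32, 33, 38, 63]
```

With four items the kept ids match the `group_id` values listed in the output above, for example 38 stands for items (1, 3, 4), that is A, C, D.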

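Taking any node in that graph and walking up to its final parent (recursively again) will always give a different combination than starting from any other node in the graph.

So the second CTE does all of those walks, excluding the "only one item" and "all items" combinations.

The final select just outputs the results in order.

The gaps in the ids are probably avoidable, but the maths is too hard for my head at the end of a working day.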

Answer 1 (score: 1)

If window functions and CTEs are available, you can use the following approach:

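-- pair every value with each smaller value, so each 2-element combination appears once
with data_rn as
(
    select d1.col0 col1, 
           d2.col0 col2, 
           -- order by both columns so every pair gets a well-defined number
           row_number() over (order by d2.col0, d1.col0) rn
    from data d1
    inner join data d2 on d1.col0 > d2.col0
)
-- emit each pair as two rows that share the same rn
select rn, col1 from data_rn
union all
select rn, col2 from data_rn
order by rn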

dbfiddle demo
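
Note that the self-join in this answer only produces the two-element combinations. A minimal check with Python's `sqlite3` module, assuming a table `data` with a text column `col0` as in the query (the helper `pairs` is hypothetical and skips the window function so it runs on older SQLite builds too):

```python
import sqlite3

def pairs(values):
    """Return the (smaller, larger) pairs produced by the self-join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (col0 TEXT)")
    conn.executemany("INSERT INTO data VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        "SELECT d2.col0, d1.col0 "
        "FROM data d1 "
        "INNER JOIN data d2 ON d1.col0 > d2.col0 "
        "ORDER BY d2.col0, d1.col0"
    ).fetchall()
    conn.close()
    return rows

print(pairs(["A", "B", "C", "D"]))
```

For n values the join yields n * (n - 1) / 2 pairs, so the three-element groups from the question would need a different technique.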

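Answer 2 (score: 1)

The idea is to enumerate the power set: assign each value a power of 2, iterate from 1 to 2^n - 1, and keep the elements whose corresponding bit is set.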

-- map each value with a power of 2 : 1, 2, 4, 8, 16
with recursive ELEMENTS(IDX, POW, VAL) as (
  -- init with dummy values 
  values(-1, 0.5, null)
  union all
  select IDX + 1,
    POW * 2,
    -- index the ordered values from 0 to N - 1
    ( select COL0 
      from DATA d1 
      where (select count(*) from DATA d2 where d2.COL0 < d1.COL0) = IDX + 1)
  from ELEMENTS 
  where IDX + 1 < (select count(*) from data)
), POWER_SETS(ITER, VAL, POW) as (
  select 1, VAL, POW from ELEMENTS where VAL is not null
  union all
  select ITER + 1, VAL, POW
  from POWER_SETS
  where ITER < (select SUM(POW) from elements) )
select ITER, VAL from POWER_SETS
-- only if the value's bit is set
where ITER & POW != 0
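
The bit-mask idea behind the query above can be sketched outside SQL as well. This is an illustration only, assuming distinct values; `power_set_by_bits` is a made-up name.

```python
def power_set_by_bits(values):
    """Map each mask 1 .. 2^n - 1 to the values whose bit is set in it."""
    ordered = sorted(values)  # the i-th smallest value is assigned 2**i
    n = len(ordered)
    subsets = {}
    for mask in range(1, 2 ** n):
        subsets[mask] = [v for i, v in enumerate(ordered) if mask & (1 << i)]
    return subsets

subsets = power_set_by_bits(["A", "B", "C", "D"])
print(subsets[5])  # ['A', 'C'] because 5 = 1 + 4
```

Like the first version of the query, this keeps the singletons and the full set; filtering them out is a separate step.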

Edit: a second version, with help from MatBailie. Only one CTE is recursive, and singleton subsets are excluded.

WITH RECURSIVE
  -- number the values
  elements(val, idx) AS (
    SELECT d1.col0, (select count(*) from DATA d2 where d2.COL0 < d1.COL0)
    FROM DATA d1
  ), 
  -- iterate from 3 (1 and 2 are singletons) 
  -- to 2^n - 1 (subset containing all the elements)
  subsets(iter) AS (
    VALUES(3)
    UNION ALL
    SELECT iter + 1
    from subsets
    WHERE iter < (1 << (SELECT COUNT(*) FROM elements)) - 1
  )
SELECT iter AS Col1, val AS Col2
FROM elements
CROSS JOIN subsets
-- the element is present in this subset (the bit is set)
WHERE iter & (1 << idx) != 0
-- exclude singletons (another idea from MatBailie)
AND iter != (iter & -iter)
ORDER BY iter, val
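
To try the second version end to end, here is a hedged sketch that runs it through Python's `sqlite3` module on an in-memory table. As written above, the recursion runs up to 2^n - 1, so the subset containing every element is still returned; the extra `iter != (1 << n) - 1` condition below is an added assumption for dropping it, as the question asks. The helper name `combinations_by_mask` is hypothetical.

```python
import sqlite3

QUERY = """
WITH RECURSIVE
  elements(val, idx) AS (
    SELECT d1.col0, (SELECT COUNT(*) FROM data d2 WHERE d2.col0 < d1.col0)
    FROM data d1
  ),
  subsets(iter) AS (
    VALUES(3)
    UNION ALL
    SELECT iter + 1
    FROM subsets
    WHERE iter < (1 << (SELECT COUNT(*) FROM elements)) - 1
  )
SELECT iter, val
FROM elements
CROSS JOIN subsets
WHERE iter & (1 << idx) != 0
  AND iter != (iter & -iter)
  -- assumption: also drop the mask with every bit set (the full set)
  AND iter != (1 << (SELECT COUNT(*) FROM elements)) - 1
ORDER BY iter, val
"""

def combinations_by_mask(values):
    """Run the query and group the returned values by their mask."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (col0 TEXT)")
    conn.executemany("INSERT INTO data VALUES (?)", [(v,) for v in values])
    groups = {}
    for mask, val in conn.execute(QUERY):
        groups.setdefault(mask, []).append(val)
    conn.close()
    return groups

groups = combinations_by_mask(["A", "B", "C", "D"])
print(len(groups))  # 10
```

The singleton test works because `iter & -iter` isolates the lowest set bit: a mask equals its own lowest set bit only when exactly one bit is set.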