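I want to build a very large list from combinations of two words. Every word in the dictionary should be combined with every other word; in other words, I will have Total^2 combinations. Each word has a unique ID, and the combinations table stores only the pair of IDs.
I want to check periodically whether any combinations are missing so that I can generate them and add them to the database. I found this Q/A, which generates all possible combinations, but I don't know how to write a SQL query that finds the combinations that do not exist yet, for example: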
select * from words a ... join words b
where (a.id, b.id) not in (select * from combinations)
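If SQL has no direct solution for this, please suggest an algorithm for doing it programmatically. Note that some IDs may be missing because I have deleted some words, so I cannot use a linear loop over integers.
The combinations table has two columns (first ID, second ID), and both IDs come from the words table.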
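Answer 0: (score: 4)
You can use a cross join to produce all possible combinations, and then add a condition that filters out the combinations that already exist.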
Select * from words a cross join words b
where not exists (select * from combinations c where c.first_id = a.id and c.second_id = b.id)
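The query above only lists the missing pairs. Here is a minimal sketch of the "generate and add" step from the question, assuming SQLite driven through Python's sqlite3 module and a layout of words(id, word) and combinations(first_id, second_id). The function name fill_missing_combinations and the sample rows are illustrative and not part of the original answer.

```python
import sqlite3


def fill_missing_combinations(conn):
    """Insert every (first_id, second_id) pair not stored yet; return how many were added."""
    count_sql = "SELECT COUNT(*) FROM combinations"
    before = conn.execute(count_sql).fetchone()[0]
    conn.execute(
        """
        INSERT INTO combinations (first_id, second_id)
        SELECT a.id, b.id
        FROM words a CROSS JOIN words b
        WHERE NOT EXISTS (
            SELECT 1 FROM combinations c
            WHERE c.first_id = a.id AND c.second_id = b.id
        )
        """
    )
    conn.commit()
    after = conn.execute(count_sql).fetchone()[0]
    return after - before


# Assumed sample data: ID 3 is absent, as if that word had been deleted.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT);
    CREATE TABLE combinations (first_id INTEGER, second_id INTEGER);
    INSERT INTO words VALUES (1, 'apple'), (2, 'pear'), (4, 'orange');
    INSERT INTO combinations VALUES (1, 2), (2, 4);
    """
)
added = fill_missing_combinations(conn)
total = conn.execute("SELECT COUNT(*) FROM combinations").fetchone()[0]
print(added, total)
```

Because the cross join works on whatever IDs are actually present in words, gaps left by deleted words need no special handling. Running the function again should add nothing, which makes it suitable for a periodic job.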
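Answer 1: (score: 1)
As @VahiD pointed out, CROSS JOIN is the central piece of the puzzle. Instead of using a subquery, you can also LEFT JOIN your existing combinations table to the CROSS JOIN of the words and check for NULL (which means that the given pair from the Cartesian product does not exist in your existing combinations table).
For example: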
WITH
-- sample data (notice that there's no word with ID of 3)
words(word_id, word) AS
(
SELECT 1, 'apple' UNION ALL
SELECT 2, 'pear' UNION ALL
SELECT 4, 'orange' UNION ALL
SELECT 5, 'banana'
)
-- existing combinations
,combinations(first_id, second_id) AS
(
SELECT 1, 2 UNION ALL
SELECT 1, 5 UNION ALL
SELECT 2, 4 UNION ALL
SELECT 2, 5 UNION ALL
SELECT 4, 5
)
-- this is the CTE you'll use to create the cartesian product
-- of all words in your words table. You can also put this as a
-- sub-query, but I'd argue that a CTE makes it clearer.
,cartesian(w1_id, w1_word, w2_id, w2_word) AS
(
SELECT *
FROM words w1, words w2
)
-- the actual query
SELECT *
FROM cartesian
LEFT JOIN combinations ON
combinations.first_id = cartesian.w1_id
AND combinations.second_id = cartesian.w2_id
WHERE combinations.first_id IS NULL
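Now, one important caveat is that this query does not treat two combinations as the same when word1 and word2 are swapped. That is, (1,2) is different from (2,1). However, fixing this is as simple as adjusting the join: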
SELECT *
FROM cartesian
LEFT JOIN combinations ON
(combinations.first_id = cartesian.w1_id AND combinations.second_id = cartesian.w2_id)
OR
(combinations.first_id = cartesian.w2_id AND combinations.second_id = cartesian.w1_id)
WHERE combinations.first_id IS NULL
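If order really does not matter, it may also be worth reporting each unordered pair only once rather than as both (a,b) and (b,a). Here is a minimal sketch of that idea, assuming SQLite through Python's sqlite3 module and a words(id, word) / combinations(first_id, second_id) layout. The helper name missing_unordered_pairs is hypothetical, and the sample rows reuse the pairs above with two of them stored in reverse order.

```python
import sqlite3

UNORDERED_MISSING_SQL = """
SELECT a.id, b.id
FROM words a
JOIN words b ON a.id <= b.id   -- visit each unordered pair once (self-pairs included)
WHERE NOT EXISTS (
    SELECT 1 FROM combinations c
    WHERE (c.first_id = a.id AND c.second_id = b.id)
       OR (c.first_id = b.id AND c.second_id = a.id)
)
ORDER BY a.id, b.id
"""


def missing_unordered_pairs(conn):
    """Return pairs (a, b) with a <= b that are stored in neither order."""
    return conn.execute(UNORDERED_MISSING_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT);
    CREATE TABLE combinations (first_id INTEGER, second_id INTEGER);
    INSERT INTO words VALUES (1, 'apple'), (2, 'pear'), (4, 'orange'), (5, 'banana');
    INSERT INTO combinations VALUES (2, 1), (1, 5), (2, 4), (5, 2), (4, 5);
    """
)
missing = missing_unordered_pairs(conn)
print(missing)
```

Change `a.id <= b.id` to `a.id < b.id` if a word should never be paired with itself.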
Answer 2: (score: 1)
Here is another option. Build the full list in a subquery, then left outer join it to the combinations table to find what is missing.
DECLARE @Words TABLE
(
[Id] INT
, [Word] NVARCHAR(200)
);
DECLARE @WordCombo TABLE
(
[Id1] INT
, [Id2] INT
);
INSERT INTO @Words (
[Id]
, [Word]
)
VALUES ( 1, N'Cat' )
, ( 2, N'Taco' )
, ( 3, N'Test' )
, ( 4, N'Cake' )
, ( 5, N'Apple' )
, ( 6, N'Pear' );
INSERT INTO @WordCombo (
[Id1]
, [Id2]
)
VALUES ( 1, 2 )
, ( 2, 6 )
, ( 5, 3 )
, ( 5, 1 );
--select from a sub query that builds out all combinations and then left outer to find what's missing in @WordCombo
SELECT [fulllist].[Id1]
, [fulllist].[Id2]
FROM (
--Rebuild full list
SELECT [a].[Id] AS [Id1]
, [b].[Id] AS [Id2]
FROM @Words [a]
INNER JOIN @Words [b]
ON 1 = 1
WHERE [a].[Id] <> [b].[Id] --Would a word be combined with itself?
) AS [fulllist]
LEFT OUTER JOIN @WordCombo [wc]
ON [wc].[Id1] = [fulllist].[Id1]
AND [wc].[Id2] = [fulllist].[Id2]
WHERE [wc].[Id1] IS NULL;
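For the programmatic fallback the question asks about, the same idea can be expressed as a set difference in application code. This is a rough sketch, assuming the word IDs and the existing pairs have already been read into memory; the function name missing_combinations and its include_self flag are illustrative. It iterates over the IDs that actually exist, so gaps from deleted words are not a problem.

```python
from itertools import product


def missing_combinations(word_ids, existing_pairs, include_self=True):
    """Return ordered (id1, id2) pairs over word_ids that are not in existing_pairs."""
    ids = sorted(set(word_ids))      # only IDs that really exist, gaps are fine
    existing = set(existing_pairs)   # set lookup keeps each membership test cheap
    return [
        (a, b)
        for a, b in product(ids, repeat=2)
        if (include_self or a != b) and (a, b) not in existing
    ]


# Same sample values as the T-SQL example above.
word_ids = [1, 2, 3, 4, 5, 6]
existing = [(1, 2), (2, 6), (5, 3), (5, 1)]
missing = missing_combinations(word_ids, existing, include_self=False)
print(len(missing))
```

Note that this walks all Total^2 pairs in application code, so for a large dictionary the set-based SQL queries are likely the better fit.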