Question

We have two tables containing English words, words_1 and words_2, with fields (word as VARCHAR, ref as INT), where word is an English word and ref is a reference to another (third) table (it is not important here).

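Within each table, all words are unique. The first table contains some words that are not in the second table (and, conversely, the second table contains some words of its own).

But most of the words are the same in both tables.

What we need: a result table containing all distinct words with their ref numbers.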

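Initial conditions

The ref for the same word can differ between the tables (the dictionaries were loaded from different sources).
Each table holds 300,000 words, so an inner join is not convenient.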

Example

words_1
________
Health-1
Car-3
Speed-5

words_2
_________
Health-2
Buty-6
Fast-8
Speed-9

Result table
_____________
Health-1
Car-3
Speed-5
Buty-6
Fast-8

Answer 1

select word, min(ref) as ref
from (
    select word,ref
    from words_1
    union all
    select word,ref
    from words_2
    ) t
group by word
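
Note that min(ref) reproduces the expected result for the sample data only because the refs in words_1 happen to be the smaller ones there. If the intent is to always keep the ref from words_1 when a word exists in both tables, one possible variant (a different technique from the query above, an anti-join via NOT EXISTS) is sketched below. The VARCHAR(50) length and the primary keys are assumptions; the question gives only the column names and says the words are unique within each table.

```sql
-- Sample data from the question. VARCHAR(50) and PRIMARY KEY are assumptions.
CREATE TABLE words_1 (word VARCHAR(50) PRIMARY KEY, ref INT);
CREATE TABLE words_2 (word VARCHAR(50) PRIMARY KEY, ref INT);

INSERT INTO words_1 (word, ref) VALUES ('Health', 1), ('Car', 3), ('Speed', 5);
INSERT INTO words_2 (word, ref) VALUES ('Health', 2), ('Buty', 6), ('Fast', 8), ('Speed', 9);

-- Keep every row of words_1, then append only those rows of words_2
-- whose word does not already appear in words_1.
SELECT word, ref
FROM words_1
UNION ALL
SELECT w2.word, w2.ref
FROM words_2 AS w2
WHERE NOT EXISTS (SELECT 1 FROM words_1 AS w1 WHERE w1.word = w2.word);
```

On the sample rows this should return the five rows of the expected result table (Health-1, Car-3, Speed-5, Buty-6, Fast-8). The row order is not guaranteed without an ORDER BY.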

Answer 2

Try using a full outer join:

select coalesce(w1.word, w2.word) as word, coalesce(w1.ref, w2.ref) as ref
from words_1 w1 full outer join
     words_2 w2
     on w1.word = w2.word;

The only time this doesn't work is when ref can be NULL in either table. In that case, change the on to:

on w1.word = w2.word and w1.ref is not null and w2.ref is not null
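
Before adopting the modified on clause, it may be worth checking how it behaves on a word whose ref is NULL in only one table. The sketch below uses two hypothetical single-row inputs (the CTE names w1 and w2 and their contents are made up for illustration) and is written for SQL Server, which the answer refers to.

```sql
-- Hypothetical inputs: the same word in both tables, ref is NULL only in w1.
WITH w1 (word, ref) AS (SELECT 'Health', CAST(NULL AS INT)),
     w2 (word, ref) AS (SELECT 'Health', 2)
SELECT COALESCE(w1.word, w2.word) AS word,
       COALESCE(w1.ref, w2.ref)   AS ref
FROM w1
FULL OUTER JOIN w2
  ON w1.word = w2.word
 AND w1.ref IS NOT NULL   -- false for the w1 row, so the rows do not match
 AND w2.ref IS NOT NULL;
```

Because the two rows no longer match, the full outer join emits each of them separately, so the word appears twice: once with a NULL ref and once with ref 2. With the plain condition on w1.word = w2.word, the same inputs give a single row with ref 2, since coalesce falls back to w2.ref. Which behavior is wanted depends on how NULL refs should be treated, which the question does not specify.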

If you want to improve performance, just create an index on each table:

create index idx_words1_word_ref on words_1(word, ref);
create index idx_words2_word_ref on words_2(word, ref);

The join is perfectly feasible; even without the indexes, SQL Server should be smart enough to come up with a reasonable implementation.

Big inner join