Question

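I have two tables, X and Y, with the same schema but different records. Given a record from X, I need a query that finds the closest matching record in Y, where the non-matching columns in Y contain NULL values. The identity column should be excluded from the comparison. For example, if my record looked like this: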

------------------------
id | col1 | col2 | col3
------------------------
0  |'abc' |'def' | 'ghi'

and table Y looked like this:

------------------------
id | col1 | col2 | col3
------------------------
6  |'abc' |'def' | 'zzz'
8  | NULL |'def' | NULL

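then the closest match would be record 8, because it has NULL values wherever the columns do not match. Record 6 would have been the closest match, but the 'zzz' disqualifies it.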

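What makes this problem unusual is that the schema of the tables is unknown apart from the id column and the data types. There could be 4 columns, or there could be 7. We just don't know; it's dynamic. All we know is that there will be an 'id' column and that the other columns will be strings, either varchar or nvarchar.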

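What is the best query in this case for selecting the record in Y that most closely matches a given record from X? I am actually writing a function. The input is an integer (the id of the record in X) and the output is an integer (the id of the record in Y, or NULL). I am a SQL novice, so a brief explanation of what is going on in your solution would help me a lot.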

Answer 1

"There could be 4 columns, or there could be 7 ... I am actually writing a function."

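As stated, that is not possible. SQL Server does not allow dynamic SQL inside a user-defined function, so a function cannot be written to handle an arbitrary table structure that way. A stored procedure can, but a function cannot.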

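However, the following shows a way to use FOR XML together with some XML shredding to unpivot each row into column names and values, which can then be compared. The technique and the query used here can be incorporated into a stored procedure.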
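To illustrate the unpivoting idea outside of SQL, here is a minimal Python sketch. It assumes a row is represented as a dict mapping column names to values, with None standing in for NULL; the function name `unpivot` is hypothetical. It drops None values because FOR XML PATH, used without ELEMENTS XSINIL, omits NULL columns from its output, and it skips the id column as the XPath filter in the query does.

```python
def unpivot(row, id_column="id"):
    """Turn one row into (column name, value) pairs, skipping id and NULLs."""
    return [
        (name, value)
        for name, value in row.items()
        if name != id_column and value is not None
    ]


# Example rows taken from the question (None stands in for NULL).
x_row = {"id": 0, "col1": "abc", "col2": "def", "col3": "ghi"}
y_row = {"id": 8, "col1": None, "col2": "def", "col3": None}

print(unpivot(x_row))
print(unpivot(y_row))
```

Because the NULL columns of record 8 vanish during unpivoting, only its col2 value is left to be compared with the record from X.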

MS SQL Server 2008 Schema Setup:

-- this is the data table to match against
create table t1 (
    id int,
    col1 varchar(10),
    col2 varchar(20),
    col3 nvarchar(40));
insert t1
select 6, 'abc', 'def', 'zzz' union all
select 8, null , 'def', null;

-- this is the data with the row you want to match
create table t2 (
    id int,
    col1 varchar(10),
    col2 varchar(20),
    col3 nvarchar(40));
insert t2
select 0, 'abc', 'def', 'ghi';
GO

Query 1:

;with unpivoted1 as (
    select n.n.value('local-name(.)','nvarchar(max)') colname,
           n.n.value('.','nvarchar(max)') value
    from (select (select * from t2 where id=0 for xml path(''), type)) x(xml)
    cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
), unpivoted2 as (
    select x.id,
           n.n.value('local-name(.)','nvarchar(max)') colname,
           n.n.value('.','nvarchar(max)') value
    from (select id,(select * from t1 where id=outr.id for xml path(''), type) from t1 outr) x(id,xml)
    cross apply x.xml.nodes('//*[local-name()!="id"]') n(n)
)
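-- keep only the row(s) of t1 with the most matching columns
select TOP(1) WITH TIES
       B.id,
       sum(case when A.value=B.value then 1 else 0 end) matches
from unpivoted1 A
join unpivoted2 B on A.colname = B.colname
group by B.id
-- discard any row of t1 that has a non-NULL value differing from t2
having max(case when A.value <> B.value then 1 end) is null
ORDER BY matches DESC;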

Results:

| ID | MATCHES |
----------------
|  8 |       1 |
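The selection rule that the query applies can also be sketched in plain Python, which may make it easier to follow. This is only an illustration under stated assumptions: rows are dicts with None standing in for NULL, the function name `closest_matches` is hypothetical, and Python's exact string comparison may differ from SQL Server's collation-dependent comparison.

```python
def closest_matches(x_row, y_rows, id_column="id"):
    """Return (ids, match_count) for the rows of Y that best match x_row.

    A row of Y is disqualified if any non-NULL value differs from X.
    Rows with no comparable column are ignored, mirroring the inner join.
    """
    scores = {}
    for y in y_rows:
        matches = 0
        disqualified = False
        for name, y_value in y.items():
            if name == id_column or y_value is None:
                continue  # NULLs in Y neither count nor disqualify
            x_value = x_row.get(name)
            if x_value is None:
                continue  # nothing to compare against on the X side
            if x_value == y_value:
                matches += 1
            else:
                disqualified = True  # same role as the HAVING clause
                break
        if not disqualified and matches > 0:
            scores[y[id_column]] = matches

    if not scores:
        return [], 0
    best = max(scores.values())
    # Keep every id that ties for the best score, like TOP(1) WITH TIES.
    return [row_id for row_id, m in scores.items() if m == best], best


x_row = {"id": 0, "col1": "abc", "col2": "def", "col3": "ghi"}
y_rows = [
    {"id": 6, "col1": "abc", "col2": "def", "col3": "zzz"},
    {"id": 8, "col1": None, "col2": "def", "col3": None},
]

print(closest_matches(x_row, y_rows))
```

With the sample data, record 6 is rejected because of 'zzz', and record 8 is kept with a single matching column.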
