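T-SQL: matching two tables based on match criteria/rules

Date: 2016-10-26 17:04:50

Tags: sql-server tsql

I'm trying to find an efficient way to carry out this matching process. I'm not sure whether a set-based update is feasible or whether I should use a cursor. I'd like to keep the matching rules in a separate table, as shown below, so that they don't have to be hard-coded in the stored procedure. This is on SQL Server 2014.

I have two tables that need to be matched based on a set of matching rules/criteria held in a third table. I'm struggling to find an efficient way to do this.

Here's the situation: we have payments that need to be matched to bookings. To do that, we'll have a set of matching rules that we step through. I need to match the tables according to those rules and update the payments table with the matched booking's transaction and the match_code (the rule that was used for the update).

Payments table: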

╔═══════════╦══════════╦══════════════╦════════════╦═══════╦════════════╗
║ FirstName ║ LastName ║ Confirmation ║    Date    ║ Trans ║ Match_code ║
╠═══════════╬══════════╬══════════════╬════════════╬═══════╬════════════╣
║ Scott     ║ Bloom    ║       123456 ║ 2016-01-15 ║       ║            ║
║ Beverly   ║ Smith    ║        65487 ║ 2016-08-16 ║       ║            ║
║ Cindy     ║ Plum     ║       147852 ║ 2016-07-19 ║       ║            ║
╚═══════════╩══════════╩══════════════╩════════════╩═══════╩════════════╝

Bookings table:

╔═══════════╦══════════╦══════════════╦════════════╦═════════════╗
║ FirstName ║ LastName ║ Confirmation ║    Date    ║ Transaction ║
╠═══════════╬══════════╬══════════════╬════════════╬═════════════╣
║ Alfred    ║ Kim      ║       987456 ║ 2016-11-17 ║       12345 ║
║ Beverly   ║ Smith    ║        65487 ║ 2016-07-14 ║       12346 ║
║ Cindy     ║ Plum     ║        99898 ║ 2016-07-19 ║       12347 ║
╚═══════════╩══════════╩══════════════╩════════════╩═════════════╝

Rules table:

╔════════════╦═══════════╦══════════╦══════════════╦══════╗
║ Match_code ║ FirstName ║ LastName ║ Confirmation ║ Date ║
╠════════════╬═══════════╬══════════╬══════════════╬══════╣
║          1 ║         1 ║        1 ║            1 ║    1 ║
║          2 ║         1 ║        1 ║            1 ║    0 ║
║          3 ║         1 ║        1 ║            0 ║    1 ║
╚════════════╩═══════════╩══════════╩══════════════╩══════╝

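The rules table identifies which fields must match for each Match_code.

The matching routine should run through the rules in order and try to match payments to bookings according to each one.

  • It tries to update based on match code 1 and finds no exact match on all columns.
  • It tries match code 2 and finds one match (Beverly matches on first name, last name, and confirmation, but not date).
  • It tries match code 3 and finds one match (Cindy matches on first name, last name, and date, but not confirmation).

The routine should update the payments.trans and payments.match_code fields based on the match found in the bookings table. The result would be:

Payments table: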

╔═══════════╦══════════╦══════════════╦════════════╦═══════╦════════════╗
║ FirstName ║ LastName ║ Confirmation ║    Date    ║ Trans ║ Match_code ║
╠═══════════╬══════════╬══════════════╬════════════╬═══════╬════════════╣
║ Scott     ║ Bloom    ║       123456 ║ 2016-01-15 ║       ║          0 ║
║ Beverly   ║ Smith    ║        65487 ║ 2016-08-16 ║ 12346 ║          2 ║
║ Cindy     ║ Plum     ║       147852 ║ 2016-07-19 ║ 12347 ║          3 ║
╚═══════════╩══════════╩══════════════╩════════════╩═══════╩════════════╝

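As a result, we now know which booking each payment matched and which match code it matched on.

Any suggestions on the best way to accomplish this kind of task would be appreciated. Thanks in advance!

4 answers:

Answer 0 (score: 1)

Assuming your rules are static and never change (I know that's not the case), I'd do something like this: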

-- Assumes payment has a paymentID key and that confirmation / [Date] are
-- not character columns, hence the explicit conversions before concatenating.
WITH p AS
(SELECT
paymentID, trans, match_code, -- trans and match_code must be exposed to be updatable
firstname + '|' + lastname + '|' + CAST(confirmation AS varchar(20)) + '|' + CONVERT(varchar(10), [Date], 120) AS rule1Key,
firstname + '|' + lastname + '|' + CAST(confirmation AS varchar(20)) AS rule2Key,
firstname + '|' + lastname + '|' + CONVERT(varchar(10), [Date], 120) AS rule3Key
FROM
payment),

b AS
(SELECT
[Transaction],
firstname + '|' + lastname + '|' + CAST(confirmation AS varchar(20)) + '|' + CONVERT(varchar(10), [Date], 120) AS rule1Key,
firstname + '|' + lastname + '|' + CAST(confirmation AS varchar(20)) AS rule2Key,
firstname + '|' + lastname + '|' + CONVERT(varchar(10), [Date], 120) AS rule3Key
FROM
bookings)

UPDATE p
SET trans = COALESCE(b1.[Transaction], b2.[Transaction], b3.[Transaction]),
    match_code =
    CASE
        -- Test a column of the joined row; an alias alone cannot be tested for NULL.
        WHEN b1.[Transaction] IS NOT NULL THEN 1
        WHEN b2.[Transaction] IS NOT NULL THEN 2
        WHEN b3.[Transaction] IS NOT NULL THEN 3
        ELSE 0 -- no rule matched
    END
FROM
p LEFT OUTER JOIN
b b1 ON
p.rule1Key = b1.rule1Key LEFT OUTER JOIN
b b2 ON
p.rule2Key = b2.rule2Key LEFT OUTER JOIN
b b3 ON
p.rule3Key = b3.rule3Key;

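However, knowing your evil users, they'll want to play with the rules. They won't like the current limitations, and they'll ask for <, >, and later OR conditions, mark my words! When they ask for parentheses, resist!

Then you'll need them to structure the table differently for you, so that you can use dynamic SQL to build your query from the contents of the rules table. They could have a two-column table with matchID and fieldName, on the assumption that the fields are always combined with AND, so your rules would look like this: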

matchID   fieldName
1         FirstName 
1         LastName 
1         Confirmation 
1         DATE
2         FirstName 
2         LastName 
2         Confirmation 
3         FirstName 
3         LastName 
3         DATE

Hopefully you can see how to generate the update query from the contents of such a rules table.
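One way the query generation could look is sketched below. It is only an illustration under stated assumptions: the permanent table names Payments and Bookings, the Trans / Match_code / [Transaction] columns from the question, and Payments.Match_code starting out NULL are all assumed, and @RuleFields is a hypothetical stand-in for the two-column rules table. FOR XML PATH is used for the string concatenation because SQL Server 2014 has no STRING_AGG.

```sql
-- Hypothetical two-column rules table, filled with the rules listed above.
DECLARE @RuleFields TABLE (matchID int, fieldName sysname);
INSERT INTO @RuleFields (matchID, fieldName) VALUES
    (1, 'FirstName'), (1, 'LastName'), (1, 'Confirmation'), (1, 'Date'),
    (2, 'FirstName'), (2, 'LastName'), (2, 'Confirmation'),
    (3, 'FirstName'), (3, 'LastName'), (3, 'Date');

DECLARE @matchID int, @on nvarchar(max), @sql nvarchar(max);

SELECT @matchID = MIN(matchID) FROM @RuleFields;

-- Apply the rules in ascending matchID order.
WHILE @matchID IS NOT NULL
BEGIN
    -- Build "P.[col] = B.[col] AND ..." for the current rule.
    -- QUOTENAME guards the column names; STUFF drops the leading ' AND '.
    SELECT @on = STUFF((
        SELECT ' AND P.' + QUOTENAME(fieldName) + ' = B.' + QUOTENAME(fieldName)
        FROM @RuleFields
        WHERE matchID = @matchID
        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 5, '');

    -- Only payments that no earlier rule has matched are touched.
    SET @sql = N'UPDATE P
        SET P.Trans = B.[Transaction], P.Match_code = @matchID
        FROM Payments AS P
        INNER JOIN Bookings AS B ON ' + @on + N'
        WHERE P.Match_code IS NULL;';

    EXEC sys.sp_executesql @sql, N'@matchID int', @matchID = @matchID;

    -- MIN over an empty set yields NULL, which ends the loop.
    SELECT @matchID = MIN(matchID) FROM @RuleFields WHERE matchID > @matchID;
END;

-- Anything still unmatched gets match code 0.
UPDATE Payments SET Match_code = 0 WHERE Match_code IS NULL;
```

The match code is passed as a parameter rather than concatenated, so only the column names, which go through QUOTENAME, end up in the generated text.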

Answer 1 (score: 0)

SQL Fiddle Demo

First, you need a LEFT JOIN using OR, and a CASE for each column to determine where the match occurred.

SELECT P.[FirstName], P.[LastName], P.[Confirmation], P.[Date], B.[Transaction],
       CASE WHEN P.[FirstName]     = B.[FirstName]    THEN 1 ELSE 0 END as match_1,
       CASE WHEN P.[LastName]      = B.[LastName]     THEN 1 ELSE 0 END as match_2,
       CASE WHEN P.[Confirmation]  = B.[Confirmation] THEN 1 ELSE 0 END as match_3,
       CASE WHEN P.[Date]          = B.[Date]         THEN 1 ELSE 0 END as match_4
FROM Payments P
LEFT JOIN Bookings B
  ON P.[FirstName]     = B.[FirstName]
  OR P.[LastName]      = B.[LastName]
  OR P.[Confirmation]  = B.[Confirmation]
  OR P.[Date]          = B.[Date]  

The partial output consists of the P.* columns, B.[Transaction], and the four match_* flags.

Then use that query to check which rule each row satisfies.

WITH cte as (
    SELECT P.[FirstName], P.[LastName], P.[Confirmation], P.[Date], B.[Transaction],
           CASE WHEN P.[FirstName]     = B.[FirstName]    THEN 1 ELSE 0 END as match_1,
           CASE WHEN P.[LastName]      = B.[LastName]     THEN 1 ELSE 0 END as match_2,
           CASE WHEN P.[Confirmation]  = B.[Confirmation] THEN 1 ELSE 0 END as match_3,
           CASE WHEN P.[Date]          = B.[Date]         THEN 1 ELSE 0 END as match_4
    FROM Payments P
    LEFT JOIN Bookings B
      ON P.[FirstName]     = B.[FirstName]
      OR P.[LastName]      = B.[LastName]
      OR P.[Confirmation]  = B.[Confirmation]
      OR P.[Date]          = B.[Date]  
)
SELECT C.[FirstName], C.[LastName], C.[Confirmation], C.[Date], C.[Transaction], 
      COALESCE([Match_code], 0 ) as [Match_code]
FROM cte C
LEFT JOIN Rules R
   ON C.match_1 = R.[FirstName]
  AND C.match_2 = R.[LastName]
  AND C.match_3 = R.[Confirmation]
  AND C.match_4 = R.[Date]

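Final output: for the sample data in the question, this should give one row per payment, with Match_code 0 for Scott (no transaction), 2 for Beverly (transaction 12346), and 3 for Cindy (transaction 12347).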

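Final notes

A payment may match one or more bookings, and may satisfy different rules for each of them. So you may need to use the last result as a subquery and select the lowest [Match_code].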
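A possible way to pick the lowest match code per payment is sketched here with ROW_NUMBER. It assumes the same Payments, Bookings, and Rules tables as the queries above, and it assumes that the four payment columns together identify a payment, since the question shows no key column. Rows that satisfy no rule sort last, and their transaction is blanked out so that a partial match does not report a booking.

```sql
WITH cte AS (
    SELECT P.[FirstName], P.[LastName], P.[Confirmation], P.[Date], B.[Transaction],
           CASE WHEN P.[FirstName]    = B.[FirstName]    THEN 1 ELSE 0 END AS match_1,
           CASE WHEN P.[LastName]     = B.[LastName]     THEN 1 ELSE 0 END AS match_2,
           CASE WHEN P.[Confirmation] = B.[Confirmation] THEN 1 ELSE 0 END AS match_3,
           CASE WHEN P.[Date]         = B.[Date]         THEN 1 ELSE 0 END AS match_4
    FROM Payments P
    LEFT JOIN Bookings B
      ON P.[FirstName]    = B.[FirstName]
      OR P.[LastName]     = B.[LastName]
      OR P.[Confirmation] = B.[Confirmation]
      OR P.[Date]         = B.[Date]
),
ranked AS (
    SELECT C.[FirstName], C.[LastName], C.[Confirmation], C.[Date],
           -- Report a transaction only when some rule actually matched.
           CASE WHEN R.[Match_code] IS NULL THEN NULL ELSE C.[Transaction] END AS [Transaction],
           COALESCE(R.[Match_code], 0) AS [Match_code],
           -- Rank candidate bookings per payment: matched rules first, lowest code first.
           ROW_NUMBER() OVER (
               PARTITION BY C.[FirstName], C.[LastName], C.[Confirmation], C.[Date]
               ORDER BY CASE WHEN R.[Match_code] IS NULL THEN 1 ELSE 0 END, R.[Match_code]
           ) AS rn
    FROM cte C
    LEFT JOIN Rules R
       ON C.match_1 = R.[FirstName]
      AND C.match_2 = R.[LastName]
      AND C.match_3 = R.[Confirmation]
      AND C.match_4 = R.[Date]
)
SELECT [FirstName], [LastName], [Confirmation], [Date], [Transaction], [Match_code]
FROM ranked
WHERE rn = 1;
```

If two bookings tie on the lowest match code, this picks one of them arbitrarily; adding a further column to the ORDER BY would make the choice deterministic.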

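Answer 2 (score: 0)

I agree with Beth that this would be simpler with a two-column table (matchID, fieldname).

There are many different ways to do this. Assuming the payments and bookings tables have some kind of primary key (id), you could do something like the following: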

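-- One row per payment/booking pair and match code; Matches counts how many
-- of that code's rule fields agree between the payment and the booking.
select payments.id, bookings.id as booking_id, rules.match_Code,
SUM(CASE WHEN payments.Firstname = bookings.FirstName
       and rules.fieldname = 'Firstname' THEN 1 ELSE 0 END +
    CASE WHEN payments.lastname = bookings.lastname
       and rules.fieldname = 'Lastname' THEN 1 ELSE 0 END +
    CASE WHEN payments.confirmation = bookings.confirmation
       and rules.fieldname = 'confirmation' THEN 1 ELSE 0 END +
    CASE WHEN payments.[Date] = bookings.[Date]
       and rules.fieldname = 'Date' THEN 1 ELSE 0 END) AS Matches
from bookings cross join payments cross join rules
group by payments.id, bookings.id, rules.match_Code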

With that, you can compare the number of matches with the number of rule fields for any given match code:

SELECT Match_Code, Count(*) AS Rules 
FROM Rules
GROUP BY Match_Code

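If the number of matches equals the number of rule fields, you know that the code matched. Then simply pick the lowest matching match code for each payment id (marked with * below):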

FirstName   LastName    match_Code  Matches
Beverly     Smith       1           3
Beverly     Smith       2           3* 
Beverly     Smith       3           2
Cindy       Plum        1           3
Cindy       Plum        2           2
Cindy       Plum        3           3* 

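I know this isn't truly dynamic, and I can't speak to the performance, but it's simple to add fields to the CASE WHEN expressions as needed, and the rules can still be redefined on demand.

Answer 3 (score: 0)

Updated with a loop-free alternative!

This answers the question, but that doesn't mean it's a good idea.

Note that it should be wrapped in a suitable transaction to ensure that updates to the Payments and Bookings tables don't cause any confusion while this mess is running.

The script below steps through the rules in order, trying to find matches and updating Payments accordingly. It could be modified to perform the update with dynamic SQL so that the query optimizer has a better chance of producing an efficient plan.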
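The pieces above could be combined into a single query along the following lines. This is a sketch using table variables loaded with the question's sample data; the id columns and the two-column @rules layout are assumptions, since neither appears in the question's tables.

```sql
-- Sample data from the question; the id columns are assumed keys.
declare @payments table ( id int, FirstName varchar(10), LastName varchar(10), Confirmation int, [Date] date );
insert into @payments values
  ( 1, 'Scott', 'Bloom', 123456, '20160115' ),
  ( 2, 'Beverly', 'Smith', 65487, '20160816' ),
  ( 3, 'Cindy', 'Plum', 147852, '20160719' );

declare @bookings table ( id int, FirstName varchar(10), LastName varchar(10), Confirmation int, [Date] date, [Transaction] int );
insert into @bookings values
  ( 1, 'Alfred', 'Kim', 987456, '20161117', 12345 ),
  ( 2, 'Beverly', 'Smith', 65487, '20160714', 12346 ),
  ( 3, 'Cindy', 'Plum', 99898, '20160719', 12347 );

-- Assumed two-column rules layout: one row per field that must match.
declare @rules table ( match_Code int, fieldname varchar(20) );
insert into @rules values
  ( 1, 'FirstName' ), ( 1, 'LastName' ), ( 1, 'Confirmation' ), ( 1, 'Date' ),
  ( 2, 'FirstName' ), ( 2, 'LastName' ), ( 2, 'Confirmation' ),
  ( 3, 'FirstName' ), ( 3, 'LastName' ), ( 3, 'Date' );

with matches as (
  -- Count, per payment/booking pair and match code, how many rule fields agree.
  select p.id as payment_id, b.[Transaction], r.match_Code,
    sum( case when ( r.fieldname = 'FirstName' and p.FirstName = b.FirstName )
                or ( r.fieldname = 'LastName' and p.LastName = b.LastName )
                or ( r.fieldname = 'Confirmation' and p.Confirmation = b.Confirmation )
                or ( r.fieldname = 'Date' and p.[Date] = b.[Date] )
              then 1 else 0 end ) as Matches
    from @payments as p cross join @bookings as b cross join @rules as r
    group by p.id, b.id, b.[Transaction], r.match_Code
),
rule_counts as (
  -- Number of fields each match code requires.
  select match_Code, count(*) as Rules from @rules group by match_Code
),
ranked as (
  -- Keep only fully satisfied codes and rank them per payment, lowest code first.
  select m.payment_id, m.[Transaction], m.match_Code,
    row_number() over ( partition by m.payment_id order by m.match_Code, m.[Transaction] ) as rn
    from matches as m inner join rule_counts as rc
      on rc.match_Code = m.match_Code and rc.Rules = m.Matches
)
select p.id, p.FirstName, p.LastName, k.[Transaction], coalesce( k.match_Code, 0 ) as match_Code
  from @payments as p left outer join ranked as k
    on k.payment_id = p.id and k.rn = 1
  order by p.id;
```

For the sample data this should report match code 0 for Scott, 2 for Beverly, and 3 for Cindy. The cross join of all three tables grows quickly, so this is better suited to illustrating the logic than to large tables.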

-- Sample data.
declare @Payments as Table ( FirstName VarChar(10), LastName VarChar(10), PaymentDate Date, Confirmation Int, TransactionId Int, MatchCode Int );
insert into @Payments ( FirstName, LastName, Confirmation, PaymentDate, TransactionId, MatchCode ) values
  ( 'Scott', 'Bloom', 123456, '20160115', NULL, NULL ),
  ( 'Beverly', 'Smith', 65487, '20160816', NULL, NULL ),
  ( 'Cindy', 'Plum', 147852, '20160719', NULL, NULL );
select * from @Payments;

declare @Bookings as Table ( FirstName VarChar(10), LastName VarChar(10), PaymentDate Date, Confirmation Int, TransactionId Int );
insert into @Bookings ( FirstName, LastName, Confirmation, PaymentDate, TransactionId ) values
  ( 'Alfred', 'Kim', 987456, '20161117', 12345 ),
  ( 'Beverly', 'Smith', 65487, '20160714', 12346 ),
  ( 'Cindy', 'Plum', 99898, '20160719', 12347 );
select * from @Bookings;

declare @Rules as Table ( MatchCode Int, FirstName Bit, LastName Bit, Confirmation Bit, PaymentDate Bit );
insert into @Rules ( MatchCode, FirstName, LastName, Confirmation, PaymentDate ) values
  ( 1, 1, 1, 1, 1 ),
  ( 2, 1, 1, 1, 0 ),
  ( 3, 1, 1, 0, 1 );
select * from @Rules;

-- Process the data.
declare @MatchCode Int, @FirstName Bit, @LastName Bit, @Confirmation Bit, @PaymentDate Bit;
declare MatchPattern Cursor for
  select MatchCode, FirstName, LastName, Confirmation, PaymentDate
    from @Rules
    order by MatchCode;
open MatchPattern;
fetch next from MatchPattern into @MatchCode, @FirstName, @LastName, @Confirmation, @PaymentDate;
-- Apply the matching rules in order.
while @@Fetch_Status = 0
  begin
  update @Payments
    set MatchCode = @MatchCode, TransactionId = B.TransactionId
    from @Payments as P inner join
      @Bookings as B on
        ( B.FirstName = P.FirstName or @FirstName = 0 ) and
        ( B.LastName = P.LastName or @LastName = 0 ) and
        ( B.Confirmation = P.Confirmation or @Confirmation = 0 ) and
        ( B.PaymentDate = P.PaymentDate or @PaymentDate = 0 )
    where P.MatchCode is NULL -- Skip any rows which have already been matched.
  fetch next from MatchPattern into @MatchCode, @FirstName, @LastName, @Confirmation, @PaymentDate;
  end;
close MatchPattern;
deallocate MatchPattern;

-- Handle any Payments that were not matched.
update @Payments
  set MatchCode = 0
  where MatchCode is NULL;

-- Display the result.
select * from @Payments;

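Alternatively, if you believe a cross join will improve performance because all cursors are evil, here is the SQL Super Collider approach: smash Payments against Bookings and see whether the result happens to match any of the Rules.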

-- Sample data.
declare @Payments as Table ( FirstName VarChar(10), LastName VarChar(10), PaymentDate Date, Confirmation Int, TransactionId Int, MatchCode Int );
insert into @Payments ( FirstName, LastName, Confirmation, PaymentDate, TransactionId, MatchCode ) values
  ( 'Scott', 'Bloom', 123456, '20160115', NULL, NULL ),
  ( 'Beverly', 'Smith', 65487, '20160816', NULL, NULL ),
  ( 'Cindy', 'Plum', 147852, '20160719', NULL, NULL );
select * from @Payments;

declare @Bookings as Table ( FirstName VarChar(10), LastName VarChar(10), PaymentDate Date, Confirmation Int, TransactionId Int );
insert into @Bookings ( FirstName, LastName, Confirmation, PaymentDate, TransactionId ) values
  ( 'Alfred', 'Kim', 987456, '20161117', 12345 ),
  ( 'Beverly', 'Smith', 65487, '20160714', 12346 ),
  ( 'Cindy', 'Plum', 99898, '20160719', 12347 );
select * from @Bookings;

declare @Rules as Table ( MatchCode Int, FirstName Bit, LastName Bit, Confirmation Bit, PaymentDate Bit );
insert into @Rules ( MatchCode, FirstName, LastName, Confirmation, PaymentDate ) values
  ( 1, 1, 1, 1, 1 ),
  ( 2, 1, 1, 1, 0 ),
  ( 3, 1, 1, 0, 1 );
select * from @Rules;

-- Process the data.
declare @True as Bit = 1, @False as Bit = 0;
-- The following assumes that Confirmation uniquely identifies Payments.
-- For each payment, keep only the booking/rule pair with the lowest MatchCode,
-- so that TransactionId and MatchCode always come from the same booking.
update P
  set TransactionId = C.NewTransactionId, MatchCode = Coalesce( C.NewMatchCode, 0 )
  from @Payments as P left outer join (
    select P.Confirmation, B.TransactionId as NewTransactionId, R.MatchCode as NewMatchCode,
      Row_Number() over ( partition by P.Confirmation order by R.MatchCode, B.TransactionId ) as RN
      from @Payments as P cross join
        @Bookings as B inner join
        @Rules as R on
          R.FirstName = case when P.FirstName = B.FirstName then @True else @False end and
          R.LastName = case when P.LastName = B.LastName then @True else @False end and
          R.Confirmation = case when P.Confirmation = B.Confirmation then @True else @False end and
          R.PaymentDate = case when P.PaymentDate = B.PaymentDate then @True else @False end
      where P.MatchCode is NULL ) as C on C.Confirmation = P.Confirmation and C.RN = 1
  where P.MatchCode is NULL; -- Unmatched payments get MatchCode 0 and no TransactionId.

-- Display the result.
select * from @Payments;
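
The earlier note about wrapping the run in a transaction could be sketched as follows. This assumes permanent tables named dbo.Payments and dbo.Bookings with the column names used in the scripts above (both names are hypothetical), and it uses a single rule-1 update as a stand-in for the full matching logic.

```sql
-- Sketch only: dbo.Payments and dbo.Bookings are assumed permanent tables.
set xact_abort on;  -- a run-time error rolls the whole transaction back

begin try
  begin transaction;

  -- The matching statements go here. As a stand-in, apply rule 1 (all four columns).
  update P
    set MatchCode = 1, TransactionId = B.TransactionId
    from dbo.Payments as P inner join
      dbo.Bookings as B on
        B.FirstName = P.FirstName and
        B.LastName = P.LastName and
        B.Confirmation = P.Confirmation and
        B.PaymentDate = P.PaymentDate
    where P.MatchCode is NULL;

  commit transaction;
end try
begin catch
  -- Undo any partial work, then re-raise the original error for the caller.
  if @@TranCount > 0
    rollback transaction;
  throw;
end catch;
```

Under the default READ COMMITTED isolation level other sessions can still add or change bookings between statements, so a stricter isolation level or table lock hints may be worth considering, depending on how the tables are used.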