Question

I'm trying to create an ANI lookup table from 2 separate tables: one is a table of stores, the other is a list of contacts for those stores.

I'm using MS SQL Server 2005, which unfortunately does not support the MERGE INTO syntax...

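The good stuff: the ANI lookup table has 2 important columns, StoreID and PhoneNumber. The PhoneNumber column is the (unique) primary key, since a given PhoneNumber must return only one StoreID.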

Store_Info important columns:

StoreID  
StorePhone  
AltPhone

There is one record per StoreID, and phone numbers may be duplicated between stores. And yes, AltPhone may be the same as StorePhone...

Store_Contacts important columns:

StoreID  
Phone

There are multiple entries per StoreID, and a phone number may be duplicated within one store or across several stores.

Sample store data:

StoreID   ParentID  StorePhone       AltPhone  
1         0         402-123-2300     402-123-2345  
2         0         202-321-7800     202-321-7890  
3         1         202-302-5600     202-302-5600

Sample contact data:

StoreID   Title    Name    Phone  
1         Mgr      Bob     402-123-2345  
1         IT       Pat     402-123-2346  
1         Reg Mgr  Dave    402-321-3213  
2         Mgr      Ann     202-231-7890  
2         IT       Mary    202-231-7893  
2         A/R      Ann     202-231-7890  
2         Reg Mgr  Dave    402-321-3213  
3         Mgr      Bob     402-123-2345  
3         AsstMgr  Pete    402-123-2356

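I want to insert phone numbers in the following order of priority:

Main/single-store StorePhone
Main/single-store AltPhone
Branch-store StorePhone
Branch-store AltPhone
Main/single-store contact phone
Branch-store contact phone
- If the phone number already exists in the destination table, do not add it...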

So the resulting data set should be:

StoreID  Phone  
1        402-123-2300  (first pass)  
2        202-321-7800  
1        402-123-2345  (2nd pass)  
2        202-321-7890  
3        202-302-5600  (3rd & 4th pass - only add once)  
1        402-123-2346  (5th pass - skip dup)  
1        402-321-3213  
2        202-231-7893  (do not add dups)  
3        402-123-2356  (final pass - skip dup)

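My method for deciding which store wins a duplicated phone number is to run multiple queries based on the other criteria (e.g., main store vs. branch), inserting the first entry found into the ANI lookup table and skipping subsequent duplicates.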

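How can I do this without resorting to RBAR? I've tried the following without success. Actually, it works fine until I get to the Store_Contacts table, where there can be multiple identical phone numbers for a given store: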

INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
    SELECT DISTINCT StoreID, dbo.GetPhoneNumber10(StorePhone)
    FROM dbo.Store_Info
    WHERE dbo.IsAniNumber(dbo.GetPhoneNumber10(StorePhone)) = 1
        AND ParentID = 0
        AND NOT EXISTS (SELECT * FROM dbo.Store_PhoneNumbers WHERE PhoneNumber = dbo.GetPhonenumber10(StorePhone));

...repeat for AltPhone, then for StorePhone where ParentID <> 0, then for AltPhone where ParentID <> 0.

So far so good. Here is where it breaks down:

INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
    SELECT DISTINCT sc.StoreID, dbo.GetPhoneNumber10(sc.Phone)
    FROM Store_Contacts sc
            INNER JOIN
        Store_Info si ON sc.StoreID = si.StoreID
    WHERE (dbo.IsAniNumber(dbo.GetPhoneNumber10(sc.Phone)) = 1)
        AND (si.ParentID = 0)
        AND NOT EXISTS (SELECT * FROM dbo.Store_PhoneNumbers WHERE PhoneNumber = dbo.GetPhonenumber10(sc.Phone));

...and repeat with ParentID <> 0.

This is where I get duplicate entries and the insert fails.

Thanks for any help you can give me. I'm about to give up and use a cursor just to get it done... Dave

Answer 1

SELECT DISTINCT sc.StoreID, dbo.GetPhoneNumber10(sc.Phone)

DISTINCT is wrong here. It will allow 2 stores to share the same number. Use GROUP BY to make sure the second column is unique.

INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
SELECT MIN(StoreID), PhoneNumber
FROM
(
  SELECT sc.StoreID as StoreID, dbo.GetPhoneNumber10(sc.Phone) as PhoneNumber
  FROM Store_Contacts sc
      INNER JOIN
      Store_Info si ON sc.StoreID = si.StoreID
  WHERE (dbo.IsAniNumber(dbo.GetPhoneNumber10(sc.Phone)) = 1)
      AND (si.ParentID = 0)
      AND NOT EXISTS (SELECT * FROM dbo.Store_PhoneNumbers WHERE PhoneNumber = dbo.GetPhonenumber10(sc.Phone))
) sub
GROUP BY PhoneNumber

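The reason you got away with DISTINCT in the other queries is that each phone number there came back with a single StoreID. This query can return multiple StoreIDs for the same number.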
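To make the difference concrete, here is a minimal sketch that uses a hypothetical table variable, @Contacts, loaded with a few rows from the sample contact data. The table variable and its column types are assumptions for illustration only, and the phone-normalizing functions are left out.

```sql
-- Hypothetical table variable holding a few rows of the sample contact data.
DECLARE @Contacts TABLE (StoreID int, Phone char(12));

-- SQL Server 2005 has no multi-row VALUES, so load with SELECT ... UNION ALL.
INSERT INTO @Contacts (StoreID, Phone)
SELECT 1, '402-321-3213' UNION ALL   -- Reg Mgr Dave, store 1
SELECT 2, '402-321-3213' UNION ALL   -- Reg Mgr Dave, store 2
SELECT 2, '202-231-7890' UNION ALL   -- Mgr Ann, store 2
SELECT 2, '202-231-7890';            -- A/R Ann, store 2

-- DISTINCT only removes identical (StoreID, Phone) pairs.
-- Returns three rows: 402-321-3213 still appears for store 1 and store 2.
SELECT DISTINCT StoreID, Phone
FROM @Contacts;

-- GROUP BY Phone collapses each number to a single row.
-- Returns two rows: (1, 402-321-3213) and (2, 202-231-7890).
SELECT MIN(StoreID) AS StoreID, Phone
FROM @Contacts
GROUP BY Phone;
```

The first result would violate a primary key on the phone column, while the second would not.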

Answer 2

Isn't this just a query based on the following?

SELECT StorePhone AS Phone -- , ...other columns...
    FROM Store_Info
UNION
SELECT AltPhone AS Phone   -- , ...other columns...
    FROM Store_Info
UNION
SELECT Phone               -- , ...other columns...
    FROM Store_Contacts

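If AltPhone can be null, you can add a WHERE clause to eliminate the nulls. I'm not clear what ANI or RBAR mean. Obviously, you can add extra columns to the different result sets, as long as every branch of the UNION ends up with the same columns. UNION automatically eliminates duplicate rows.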

If the phone number already exists in the destination table, do not add it...

Ah, then you need the MERGE statement. You can use a small variation of the query above as the data source to merge into the target table.

The BNF for the statement, from the SQL 2003 standard (section 14.9):

<merge statement> ::=
     MERGE INTO <target table> [ [ AS ] <merge correlation name> ]
     USING <table reference> ON <search condition>
     <merge operation specification>

<merge correlation name> ::= <correlation name>

<merge operation specification> ::= <merge when clause> ...

<merge when clause> ::=
    <merge when matched clause> |
    <merge when not matched clause>

<merge when matched clause> ::=
    WHEN MATCHED THEN <merge update specification>

<merge when not matched clause> ::=
    WHEN NOT MATCHED THEN <merge insert specification>

<merge update specification> ::= UPDATE SET <set clause list>

<merge insert specification>  ::=
     INSERT [ <left paren> <insert column list> <right paren> ]
     [ <override clause> ] VALUES <merge insert value list>

<merge insert value list> ::=
     <left paren> <merge insert value element>
     [ { <comma> <merge insert value element> }... ] <right paren>

<merge insert value element> ::=
     <value expression> |
     <contextually typed value specification>

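You can also find a description of this statement in the relevant product manuals, which often offer more options. In your case, you would probably use just the WHEN NOT MATCHED clause and omit the WHEN MATCHED clause.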
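As a rough sketch of that insert-only MERGE, the statement below uses T-SQL syntax and therefore assumes SQL Server 2008 or later. It also assumes the table and column names from the question. Picking MIN(StoreID) per phone number is only a placeholder for the real priority rules, and the phone-normalizing functions are omitted.

```sql
-- Assumes SQL Server 2008 or later; MERGE is not available in SQL Server 2005.
MERGE INTO dbo.Store_PhoneNumbers AS tgt
USING (
    -- The source must hold each phone number only once, otherwise two
    -- unmatched rows with the same number would violate the primary key.
    SELECT MIN(StoreID) AS StoreID, Phone AS PhoneNumber
    FROM (
        SELECT StoreID, StorePhone AS Phone FROM dbo.Store_Info
        UNION
        SELECT StoreID, AltPhone FROM dbo.Store_Info
        UNION
        SELECT StoreID, Phone FROM dbo.Store_Contacts
    ) AS all_phones
    WHERE Phone IS NOT NULL
    GROUP BY Phone
) AS src
ON tgt.PhoneNumber = src.PhoneNumber
-- No WHEN MATCHED clause: existing numbers are left untouched.
WHEN NOT MATCHED THEN
    INSERT (StoreID, PhoneNumber)
    VALUES (src.StoreID, src.PhoneNumber);
```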

Additional observation:

MS SQL Server 2005 does not support MERGE.

It is not the only DBMS I know of with that limitation.

In that case, you are probably faced with creating a temporary table and loading it with the data from the UNION-select statement.

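You can then insert from the temporary table into the main table, based on the non-existence of the corresponding row in the main table. At least, some DBMS allow you to do that. I'm not an MS SQL Server expert, so I don't know whether the fine print prevents you from selecting, in a subquery, from the table that the statement is modifying. If you are limited in that way, it could be a real nuisance.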
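A minimal sketch of that temporary-table route in T-SQL follows. The temporary table name #CandidatePhones is hypothetical, the table and column names are taken from the question, and MIN(StoreID) stands in for the real priority rules. The question's own INSERT statements already use NOT EXISTS against the target table, which suggests SQL Server accepts this pattern.

```sql
-- Stage every candidate (StoreID, Phone) pair in a temporary table.
-- In T-SQL, the INTO clause goes in the first SELECT of the UNION.
SELECT StoreID, StorePhone AS Phone
INTO #CandidatePhones
FROM dbo.Store_Info
WHERE StorePhone IS NOT NULL
UNION
SELECT StoreID, AltPhone
FROM dbo.Store_Info
WHERE AltPhone IS NOT NULL
UNION
SELECT StoreID, Phone
FROM dbo.Store_Contacts
WHERE Phone IS NOT NULL;

-- Insert one row per phone number that is not already in the lookup table.
INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
SELECT MIN(c.StoreID), c.Phone
FROM #CandidatePhones AS c
WHERE NOT EXISTS (SELECT *
                  FROM dbo.Store_PhoneNumbers AS p
                  WHERE p.PhoneNumber = c.Phone)
GROUP BY c.Phone;

DROP TABLE #CandidatePhones;
```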

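Another option is to unload the table to plain text, unload the UNION-select data to plain text as well, and then use file-system (command-line) tools to process it. How feasible this is depends on the data volumes, which I've forgotten. Perl is useful here: read the main table into a hash, selectively update it from the UNION-select data, and finally rewrite the data as a load file. Then you 'just' start a transaction, delete all the old data, load all the new data, cross your fingers and commit. The downside is that any changes made between the unload and the load are lost, so be careful if you decide to use this technique. You might want to do the unload inside the transaction, modify the data, then delete and reload, all in the same transaction. Ideally, a single key press (the Return key) would run the whole job.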

Answer 3

FYI,

ANI = http://en.wikipedia.org/wiki/Automatic_Number_Identification

RBAR = Row By Agonizing Row

Answer 4

I see an answer has already been selected, but I'd be remiss if I didn't point out a simpler, more general solution.

Rather than making the priority implicit in the order of the inserts, make it explicit.

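Your problem is basically: "I have several sources of data, and I know the priority of each. For each key, I want to select the single datum with the highest priority."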

First, select all the possible data (storeid) for your key (phone):

create table prioritized_phone( phone char(12), storeid int, priority int);

insert into prioritized_phone(phone, storeid, priority) 
select storephone, storeid, 1  from store_info
union
select altphone, storeid, 2 from store_info

I don't know how you select the phones for branch stores, but some query can do it by using parentid in store_info, like this:

union
select b.storephone, a.storeid, 3
from store_info a join store_info b on (a.parentid = b.storeid)
union
select b.altphone, a.storeid, 4
from store_info a join store_info b on (a.parentid = b.storeid)

Then the contact phones:

union 
select distinct phone, storeid, 5 from store_contacts;

Once that's done, for each phone, delete every row except those with the lowest (best) priority:

delete a from prioritized_phone a where a.priority > 
(select min(b.priority) from prioritized_phone b where b.phone = a.phone);

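Now, for each phone, we have only the rows with the lowest priority number. That still may not be unique per store, so we arbitrarily pick the lowest store for the phone: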

delete a from prioritized_phone a where a.storeid > 
(select min(b.storeid) from prioritized_phone b where b.phone = a.phone);

We now have one store per phone, but we may still have duplicate rows:

create table phone_lookup( phone char(12), storeid int);

insert into phone_lookup(phone, storeid)
select distinct phone, storeid 
from prioritized_phone;

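Why is this solution easier? Because it turns the priority that was implicit in your solution (implied by the order of operations) into an explicit value that we can select on.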
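The same explicit-priority idea can also be sketched as a single statement, using a common table expression and ROW_NUMBER(), both of which are available in SQL Server 2005. This is an illustrative variant, not the method described above. It assumes the table and column names from the question, maps the six priorities to ParentID = 0 versus ParentID <> 0, breaks ties by lowest StoreID (an assumption), and leaves out the question's GetPhoneNumber10 and IsAniNumber calls.

```sql
-- If this is not the first statement in the batch, the previous
-- statement must end with a semicolon before WITH.
WITH candidates AS (
    -- Priorities 1 and 3: StorePhone of main/single stores, then branches.
    SELECT StoreID, StorePhone AS Phone,
           CASE WHEN ParentID = 0 THEN 1 ELSE 3 END AS Priority
    FROM dbo.Store_Info
    UNION ALL
    -- Priorities 2 and 4: AltPhone of main/single stores, then branches.
    SELECT StoreID, AltPhone,
           CASE WHEN ParentID = 0 THEN 2 ELSE 4 END
    FROM dbo.Store_Info
    UNION ALL
    -- Priorities 5 and 6: contact phones of main/single stores, then branches.
    SELECT sc.StoreID, sc.Phone,
           CASE WHEN si.ParentID = 0 THEN 5 ELSE 6 END
    FROM dbo.Store_Contacts AS sc
    INNER JOIN dbo.Store_Info AS si ON sc.StoreID = si.StoreID
),
ranked AS (
    -- Number the candidates for each phone, best priority first.
    SELECT StoreID, Phone,
           ROW_NUMBER() OVER (PARTITION BY Phone
                              ORDER BY Priority, StoreID) AS rn
    FROM candidates
    WHERE Phone IS NOT NULL
)
INSERT INTO dbo.Store_PhoneNumbers (StoreID, PhoneNumber)
SELECT r.StoreID, r.Phone
FROM ranked AS r
WHERE r.rn = 1   -- keep only the best candidate per phone
  AND NOT EXISTS (SELECT *
                  FROM dbo.Store_PhoneNumbers AS p
                  WHERE p.PhoneNumber = r.Phone);
```

Because each phone number gets exactly one row with rn = 1, the insert cannot produce two rows with the same PhoneNumber, and no intermediate table or delete pass is needed.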

SQL query to extract unique phone numbers from 2 related tables
