Question

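I've put together what I think is overly complicated SQL to get the result I'm after. I'm hoping for some insight into a faster, simpler way to do it.

What I'm after is assigning a shared ID to groups of rows whose PartIDs and RplcIDs overlap.

For example, I have the following subset of data: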

CustID  PartID  RplcID
28       4        4
28       4        16
28       4        17
28       16       4
28       16       16
28       16       17
28       17       4
28       17       16
28       17       17

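I want to create an ID for CustID = 28 wherever the RplcIDs and PartIDs overlap. In this example, PartIDs 4, 16, and 17 all share the same RplcIDs (4, 16, 17), so all of these pairs should get the same ID.

The approach I'm using works (and it is faster with temp tables than with CTEs alone), but on large data sets this thing is S-L-O-W. I'm sure there's a more efficient way out there, and I'm hoping someone can offer their expertise.

I'm outlining what I currently do below, as clearly as I can explain my muddled thinking.

Step 1: Generate a temporary ID using DENSE_RANK(), partitioned by CustID and ordered by RplcID.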

RowID   CustID  PartID  RplcID
1         28    16      16
1         28    17      16
1         28    4       16
2         28    16      17
2         28    17      17
2         28    4       17
3         28    16      4
3         28    17      4
3         28    4       4

Step 2: Take those results and aggregate the PartIDs with FOR XML into a comma-separated string, which is then used for grouping.

RowID   CustID  RplcID  PartIDS
4         28    16       16,17,4
4         28    17       16,17,4
4         28    4        16,17,4

Step 3: Finally, split those groups back into rows by parsing the XML, keeping the assigned ID.

RowID   CustID  PartID  RplcID
4         28    16      16
4         28    16      17
4         28    16      4
4         28    17      16
4         28    17      17
4         28    17      4
4         28    4       16
4         28    4       17
4         28    4       4
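
For comparison, here is a minimal sketch of how Steps 2 and 3 might be collapsed into one query. It is not a drop-in replacement. It assumes SQL Server 2017 or later (for STRING_AGG), and it assumes that two RplcIDs belong together exactly when they are linked to the same set of PartIDs. It skips the self-join expansion from Step 1, so it only matches the approach above when related RplcIDs already list identical PartIDs, as in the CustID 28 sample. The names Signatures, Groups, and GroupID are made up for illustration.

```sql
-- Sketch only: requires SQL Server 2017+ for STRING_AGG.
DECLARE @Parts TABLE
(
    CustID VARCHAR(10),
    PartID VARCHAR(10),
    RplcID VARCHAR(10)
);

INSERT INTO @Parts VALUES
('28','4','4'),  ('28','4','16'),  ('28','4','17'),
('28','16','4'), ('28','16','16'), ('28','16','17'),
('28','17','4'), ('28','17','16'), ('28','17','17');

WITH Signatures AS
(
    -- One row per (CustID, RplcID) with a sorted, comma-separated
    -- list of its PartIDs. DISTINCT guards against duplicate input rows.
    SELECT d.CustID,
           d.RplcID,
           STRING_AGG(d.PartID, ',') WITHIN GROUP (ORDER BY d.PartID) AS PartIDs
    FROM   (SELECT DISTINCT CustID, PartID, RplcID FROM @Parts) d
    GROUP  BY d.CustID, d.RplcID
),
Groups AS
(
    -- RplcIDs with the same PartID list (within a customer) share a GroupID.
    SELECT CustID,
           RplcID,
           DENSE_RANK() OVER (ORDER BY CustID, PartIDs) AS GroupID
    FROM   Signatures
)
-- Join back to the base rows instead of splitting the string again.
SELECT g.GroupID, p.CustID, p.PartID, p.RplcID
FROM   @Parts p
       JOIN Groups g
         ON g.CustID = p.CustID
            AND g.RplcID = p.RplcID
ORDER  BY g.GroupID, p.PartID, p.RplcID;
```

Joining the group IDs back to the base table avoids the XML split entirely, since the original (PartID, RplcID) pairs are still available there. For the nine CustID 28 rows, every RplcID ends up with the same PartID list, so all rows should share a single GroupID. Whether this is actually faster on large data would need to be measured.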

The full SQL:

DECLARE @Parts TABLE
(
 CustID     VARCHAR(10),
 PartID     VARCHAR(10),
 RplcID     VARCHAR(10)

 )

Insert Into @Parts VALUES

('26','19','93'),('26','19','63'),
('26','31','93'),('26','31','63'),('26','32','93'),('26','32','63'),('26','33','93'),('26','33','63'),('26','34','93'),
('26','34','63'),('26','35','93'),('26','35','63'),('26','36','93'),('26','36','63'),('26','37','93'),('26','37','63'),
('26','38','93'),('26','38','63'),('26','39','93'),('26','39','63'),('27','40','95'),('27','41','94'),
('27','41','95'),('27','42','94'),('27','42','95'),('27','43','94'),('27','43','95'),('27','44','94'),('27','44','95'),
('27','45','94'),('27','45','95'),('27','46','94'),('27','46','95'),('27','47','94'),('27','47','95'),('27','48','94'),
('27','48','95'),('27','49','94'),('27','49','95'),('27','50','94'),('27','50','95'),('27','17','94'),('27','17','95'),
('27','51','94'),('27','51','95'),('27','52','94'),('27','52','95'),('27','53','94'),('27','53','95'),('27','54','94'),
('27','54','95'),('27','33','94'),('27','33','95'),('27','55','94'),('27','55','95'),('27','34','94'),('27','34','95'),
('27','56','94'),('27','56','95'),('27','35','94'),('27','35','95'),('27','57','94'),('27','57','95'),('27','58','94'),
('27','58','95'),('27','59','94'),('27','59','95'),('27','37','94'),('27','37','95'),('27','60','94'),('27','60','95'),
('27','61','94'),('27','61','95'),('27','62','94'),('27','62','95'),('27','63','94'),('27','63','95'),('27','64','94'),
('27','64','95'),('27','3','96'),('27','3','97'),('27','3','98'),('27','3','99'),('27','3','100'),('28','4','4'),
('28','4','16'),('28','4','17'),('28','16','4'),('28','16','16'),('28','16','17'),('28','17','4'),('28','17','16'),
('28','17','17')
;


--Step 1: Create the initial ID
--(@Parts has no BuyID column; the column is PartID)

SELECT DISTINCT DENSE_RANK()
                  OVER(
                    partition BY r.CustID
                    ORDER BY r2.RplcID) AS RowID,
                r.CustID,
                r.PartID,
                r2.RplcID
INTO #tmp
FROM   @Parts r
       JOIN @Parts r1
         ON r.CustID = r1.CustID
            AND r.RplcID = r1.RplcID
       JOIN @Parts r2
         ON r.CustID = r2.CustID
            AND r1.PartID = r2.PartID


--Step 2: Group the PartIDs
SELECT DENSE_RANK()
         OVER(
           ORDER BY CustID, PartIDs) AS RowID,
       *
INTO #tmp2
FROM   (SELECT CustID,
               Rtrim(RplcID) RplcID,
               Stuff((SELECT ',' + Rtrim(PartID)
                      FROM   #tmp RSLT2
                      WHERE  RSLT2.ROWID = RSLT.ROWID
                             AND RSLT2.CustID = RSLT.CustID
                      ORDER  BY RSLT2.PartID --fixed order so equal sets give equal strings
                      FOR xml path('')), 1, 1, '') [PartIDs]
        FROM   #tmp RSLT
        GROUP  BY RSLT.CustID,
                  RSLT.ROWID,
                  RSLT.RplcID)A



--Step 3: Using the grouped PartIDs, split the strings using XML and assign RowID
SELECT RowID,
       CustID,
       PartID,
       RplcID

INTO #tmp3
FROM    (SELECT RowID,
               CustID,
               n.r.value('.','varchar(10)') AS PartID,
               RplcID
        FROM   #tmp2
               CROSS APPLY(SELECT Cast('<r>' + Replace(PartIDs, ',', '</r><r>')
                                       + '</r>' AS XML)) AS S(xmlcol)
               CROSS APPLY s.xmlcol.nodes('r') AS n(r))A
Order by RowID

--Check the results for one customer
Select * from #tmp3 where CustID='28'

Select distinct PartID
from #tmp3
where CustID='28'

Select distinct RplcID 
from #tmp3
where CustID='28'

Assigning IDs based on commonality/groups
