Self-join to the lowest occurrence within a group

Time: 2010-10-13 17:13:08

Tags: tsql sql-server-2008 sql-order-by self-join

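I have run into a problem in T-SQL that I am finding hard to solve.

I have a table containing groups of records, grouped by key1 and key2. I order each group chronologically. For each record, I want to check whether there is an earlier record (within the same group, with a lower date) whose "datafield" forms an allowed combination with the current record's "datafield". The allowed combinations are stored in a table called AllowedCombinationsTable.

I wrote the following code to do this: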

WITH Source AS (
    SELECT key1, key2, datafield, date1,
        ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
        FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
    ON D.datafield1 = L.datafield
LEFT JOIN Source AS R
    ON R.Key1 = L.Key1
    AND R.Key2 = L.Key2
    AND R.dateorder < L.dateorder
    AND DC.datafield2 = L.datafield
    -- AND "pick the one record with lowest dateorder"

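Now, for each of these possible combination records, I want to pick only the first one (see the placeholder comment in the code). How can I do this most efficiently?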


Edit: OK, let's say the source looks like this, showing only group (1,1):

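| Key1 | Key2 | Datafield | Date | DateOrder |
|------|------|-----------|------------|-----------|
| 1 | 1 | "Horse" | 1-Jan-2010 | 1 |
| 1 | 1 | "Horse" | 2-Jan-2010 | 2 |
| 1 | 1 | "Sheep" | 3-Jan-2010 | 3 |
| 1 | 1 | "Dog" | 4-Jan-2010 | 4 |
| 1 | 1 | "Cat" | 5-Jan-2010 | 5 |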

AllowedCombinationsTable:

| Datafield1 | Datafield |
|------------|-----------|
| Cat | Sheep (and Sheep Cat) |
| Cat | Horse (and Horse Cat) |
| Dog | Horse (and Horse Dog) |

After my join I now have:

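| Key1 | Key2 | Datafield | Date | DateOrder | JoinedCombination | JoinedCombinationDateOrder |
|------|------|-----------|------------|-----------|-------------------|----------------------------|
| 1 | 1 | "Horse" | 1-Jan-2010 | 1 | NULL | NULL |
| 1 | 1 | "Horse" | 2-Jan-2010 | 2 | NULL | NULL |
| 1 | 1 | "Sheep" | 3-Jan-2010 | 3 | NULL | NULL |
| 1 | 1 | "Dog" | 4-Jan-2010 | 4 | "Horse" | 1 |
| 1 | 1 | "Dog" | 4-Jan-2010 | 4 | "Horse" | 2 |
| 1 | 1 | "Cat" | 5-Jan-2010 | 5 | "Horse" | 1 |
| 1 | 1 | "Cat" | 5-Jan-2010 | 5 | "Horse" | 2 |
| 1 | 1 | "Cat" | 5-Jan-2010 | 5 | "Sheep" | 3 |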

I want to show only the first "Horse" for record 4 "Dog", and likewise only the first "Horse" for record 5 "Cat".

Get it? ;)
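
For anyone who wants to try queries against the sample data above, here is a minimal setup sketch. The table name `dbo.SourceTable` is hypothetical (the query only calls the source `table`), the column types are guesses, and the script assumes the allowed combinations are stored in both directions, as the "(and Sheep Cat)" notes suggest.

```sql
-- Hypothetical source table; the question's query refers to it only as "table".
CREATE TABLE dbo.SourceTable (
    key1      INT         NOT NULL,
    key2      INT         NOT NULL,
    datafield VARCHAR(20) NOT NULL,
    date1     DATE        NOT NULL
);

-- Name taken from the question's query; column types are assumed.
CREATE TABLE dbo.AllowedDataCombinationsTable (
    datafield1 VARCHAR(20) NOT NULL,
    datafield2 VARCHAR(20) NOT NULL
);

-- Sample group (1,1) from the question.
INSERT INTO dbo.SourceTable (key1, key2, datafield, date1) VALUES
    (1, 1, 'Horse', '20100101'),
    (1, 1, 'Horse', '20100102'),
    (1, 1, 'Sheep', '20100103'),
    (1, 1, 'Dog',   '20100104'),
    (1, 1, 'Cat',   '20100105');

-- Each allowed pair is stored in both directions (an assumption).
INSERT INTO dbo.AllowedDataCombinationsTable (datafield1, datafield2) VALUES
    ('Cat',   'Sheep'), ('Sheep', 'Cat'),
    ('Cat',   'Horse'), ('Horse', 'Cat'),
    ('Dog',   'Horse'), ('Horse', 'Dog');
```

The multi-row `VALUES` syntax and the `DATE` type both require SQL Server 2008 or later, which matches the question's tags.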

2 Answers:

Answer 0: (score: 0)

I think this might do it; I have no data set up to test the query against. Check the comments for the reasoning.

WITH Source AS ( 
    SELECT key1, key2, datafield, date1, 
        ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder 
        FROM table 
) 
SELECT L.key1, L.key2, L.datafield, DC.datafield2 
FROM Source AS L 
LEFT JOIN AllowedDataCombinationsTable DC 
    ON DC.datafield1 = L.datafield   --  DC Alias
LEFT JOIN Source AS R 
    ON R.Key1 = L.Key1 
    AND R.Key2 = L.Key2 
    AND DC.datafield2 = R.datafield   --  Changed alias from L to R
    AND R.dateorder = 1               --  Pick out lowest one
    AND R.dateorder < L.dateorder     --  Make sure it's not the same one
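
One caveat about `R.dateorder = 1`: that condition only finds a partner when the very first row of the group is itself an allowed match. A minimal, untested sketch of one alternative is to let `OUTER APPLY` with `TOP (1)` pick the earliest allowed earlier row for each record. Here `dbo.SourceTable` is a hypothetical name standing in for the question's `table`, the output aliases are borrowed from the question's result headers, and the sketch assumes each allowed pair is stored in both directions.

```sql
WITH Source AS (
    SELECT key1, key2, datafield, date1,
           ROW_NUMBER() OVER (PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
    FROM dbo.SourceTable   -- hypothetical name for the question's "table"
)
SELECT L.key1, L.key2, L.datafield, L.date1, L.dateorder,
       M.datafield AS JoinedCombination,
       M.dateorder AS JoinedCombinationDateOrder
FROM Source AS L
OUTER APPLY (
    -- Earliest earlier row in the same group that forms an allowed pair with L.
    SELECT TOP (1) R.datafield, R.dateorder
    FROM Source AS R
    JOIN AllowedDataCombinationsTable AS DC
        ON DC.datafield1 = L.datafield
       AND DC.datafield2 = R.datafield
    WHERE R.key1 = L.key1
      AND R.key2 = L.key2
      AND R.dateorder < L.dateorder
    ORDER BY R.dateorder ASC
) AS M
ORDER BY L.key1, L.key2, L.dateorder;
```

Because `OUTER APPLY` keeps rows from `L` even when the subquery returns nothing, records without an allowed earlier partner still appear, with NULL in the joined columns. On the sample group (1,1) this should give one row per record, with "Dog" and "Cat" both paired with the "Horse" at date order 1.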

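Answer 1: (score: 0)

Well, I don't use WITH or OVER, so this is a different approach. I may have oversimplified a few things, but without the data in front of me, this is what I came up with: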

SELECT distinct a.Key1, a.Key2, a.Datafield, 
       ISNULL(b.Datafield,'') as Datafield1, 
       ISNULL(b.Date,a.Date) as [Date], 
       MIN(a.DateOrder) as DateOrder
FROM Source a
LEFT JOIN Source b 
     ON a.Key1 = b.Key1
     AND a.Key2 = b.Key2
     AND a.Dateorder <> b.Dateorder
LEFT JOIN AllowedDataCombinationsTable c
     ON a.Datafield = c.Datafield
     AND b.Datafield = c.Datafield1
GROUP BY a.Key1, a.Key2, a.Datafield, ISNULL(b.Datafield,''), ISNULL(b.Date,a.Date)
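
In the same spirit of avoiding WITH and OVER, here is a minimal, untested sketch that picks the earliest allowed earlier row with `EXISTS` and `NOT EXISTS` instead of window functions. It uses the column names from the question's query (`key1`, `key2`, `datafield`, `date1`), a hypothetical table name `dbo.SourceTable`, and two assumptions: `date1` is unique within each (key1, key2) group, and each allowed pair is stored in both directions.

```sql
SELECT a.key1, a.key2, a.datafield, a.date1,
       b.datafield AS JoinedCombination,
       b.date1     AS JoinedCombinationDate
FROM dbo.SourceTable AS a          -- hypothetical table name
LEFT JOIN dbo.SourceTable AS b
    ON  b.key1  = a.key1
    AND b.key2  = a.key2
    AND b.date1 < a.date1          -- b must come before a
    -- b must form an allowed pair with a
    AND EXISTS (SELECT 1
                FROM AllowedDataCombinationsTable AS c
                WHERE c.datafield1 = a.datafield
                  AND c.datafield2 = b.datafield)
    -- and no allowed partner of a may come before b
    AND NOT EXISTS (SELECT 1
                    FROM dbo.SourceTable AS e
                    JOIN AllowedDataCombinationsTable AS c2
                        ON  c2.datafield1 = a.datafield
                        AND c2.datafield2 = e.datafield
                    WHERE e.key1  = a.key1
                      AND e.key2  = a.key2
                      AND e.date1 < b.date1)
ORDER BY a.key1, a.key2, a.date1;
```

If two rows in a group share the same `date1`, both can survive the `NOT EXISTS` check and produce duplicate output rows, so the uniqueness assumption matters here.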