Return the first instance of a value across multiple tables (and return the associated columns)

Time: 2010-08-27 09:35:24

Tags: sql sql-server sql-server-2005 tsql

I want to return the following information for each contact contained in a set of tables:

Contact_id    |    first_Event_date    |   first_Event_id    | event_type
id123456      |    01/12/2007          |   eveid123456       | table1
id456455      |    05/06/1999          |   eveid456585       | table4

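The data reflects the first event each contact ever took part in (which can be held in any one of up to 8 tables), and event_type tells you which table the event came from.

I have the following query as a starting point. It works fine when pulling only contact_id and event_date, but when I also try to include event_id, it seems to pull an arbitrary ID from the wrong row: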

SELECT
table1.contact_id               AS contact_id,
MIN(table1.date_received)       AS event_date,
table1.event_id                 AS event_id
FROM table1
GROUP BY table1.contact_id
UNION
SELECT
table2.contact_id,
MIN(table2.date_received),
table2.event_id
FROM table2
GROUP BY table2.contact_id

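This is repeated for tables 3-6. I know I also need to include table1.event_id etc. in the GROUP BY clause, but when I do, every event for every contact (for each table) is returned, so a contact gets multiple rows from the table1 subquery when it should get at most one.

Also, in case it helps: not every contact appears in every table (but each appears at least once across the whole set), and I am using SQL Server 2005.

Thanks in advance :)

3 Answers:

Answer 0: (score: 0)

Try using UNION ALL to combine the first result for each contact from each source table, then run an outer query to get the earliest row across all tables.

In this example, step 1 selects into a temporary table and step 2 selects from that temporary table. This could be done as a single nested query, but that would be harder to read:

Step 1 - get the earliest row for each contact from each table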

3 个答案:

答案 0 :(得分:0)

尝试使用UNION ALL合并表格,以返回每个源表格的第一个结果,然后运行外部查询以获得最早的所有表格。

在此示例中,第一步选择临时表,第二步选择临时表。可以将此操作作为单个嵌套查询进行,但更难以阅读:

第1步 - 从每个表中获取每个联系人的最早行

SELECT  contact_id ,
        event_id ,
        date_received
INTO #firstEventsAllTables      
FROM    (   SELECT  contact_id ,
                    event_id ,
                    date_received,
                    ROW_NUMBER() OVER (PARTITION BY contact_id
                                       ORDER BY date_received
                                      ) AS rn
            FROM table1      
        ) AS t1
WHERE rn = 1

UNION ALL       


SELECT  contact_id ,
        event_id ,
        date_received
FROM    (   SELECT  contact_id ,
                    event_id ,
                    date_received,
                    ROW_NUMBER() OVER (PARTITION BY contact_id
                                       ORDER BY date_received
                                      ) AS rn
            FROM table2      
        ) AS t2
WHERE rn = 1

UNION ALL

etc...

Step 2 - find the earliest row for each contact across all tables

SELECT  contact_id ,
        event_id ,
        date_received
FROM    (   SELECT  contact_id ,
                    event_id ,
                    date_received,
                    ROW_NUMBER() OVER (PARTITION BY contact_id
                                       ORDER BY date_received
                                      ) AS rn
            FROM #firstEventsAllTables       
        ) AS f
WHERE rn = 1

(Untested)
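The single nested query mentioned above might look roughly like the following. This is an untested sketch, not the answerer's own code. It assumes the table and column names from the question, and it adds an `event_type` literal per source table (an addition here, to match the output the question asks for). It also skips the per-table "first row" step and ranks the combined rows directly, which should give the same result at the cost of ranking more rows.

```sql
-- Sketch, untested: steps 1 and 2 combined, without the temp table.
-- event_type is an added literal naming the source table.
SELECT contact_id,
       date_received AS first_event_date,
       event_id      AS first_event_id,
       event_type
FROM (
    SELECT contact_id, event_id, date_received, event_type,
           ROW_NUMBER() OVER (PARTITION BY contact_id
                              ORDER BY date_received) AS rn
    FROM (
        SELECT contact_id, event_id, date_received, 'table1' AS event_type FROM table1
        UNION ALL
        SELECT contact_id, event_id, date_received, 'table2' AS event_type FROM table2
        -- ...repeat one UNION ALL branch per remaining table
    ) AS all_events
) AS ranked
WHERE rn = 1;
```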

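Answer 1: (score: 0)

First, I'm surprised your query runs at all. Every field in the SELECT clause needs to either appear in the GROUP BY clause or be wrapped in some kind of aggregate function. Your event_id field is neither, so you should be getting an error.

Second, to get the "other fields" associated with the record that holds the minimum value, you can use the OVER clause (added in SQL Server 2005). The following query adds each contact's minimum event date to every row of the result set.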
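To illustrate the GROUP BY rule, here is a minimal sketch for table1 only that keeps the aggregate valid and still recovers event_id by joining the grouped result back to the table. This is a swapped-in technique for illustration, not the approach this answer goes on to recommend. The aliases `t`, `m`, and `first_date` are made up for the example.

```sql
-- Sketch only: table and column names come from the question.
SELECT t.contact_id,
       m.first_date AS event_date,
       t.event_id
FROM table1 AS t
JOIN (
    -- Every selected column here is either grouped or aggregated,
    -- so this inner query is valid on its own.
    SELECT contact_id, MIN(date_received) AS first_date
    FROM table1
    GROUP BY contact_id
) AS m
  ON m.contact_id = t.contact_id
 AND m.first_date = t.date_received;
```

If a contact has two events on the same earliest date, this returns both rows.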

SELECT
  contact_id AS contact_id,
  date_received AS event_date,
  MIN(date_received) OVER (PARTITION BY contact_id) AS min_event_date,
  event_id AS event_id
FROM table1

You can't put the OVER expression in the WHERE clause, so you have to wrap it in a subquery to find the records you want.

SELECT contact_id, event_date, event_id
FROM (
  SELECT
    contact_id AS contact_id,
    date_received AS event_date,
    MIN(date_received) OVER (PARTITION BY contact_id) AS min_event_date,
    event_id AS event_id
  FROM table1) AS with_min  -- T-SQL requires an alias on a derived table
WHERE event_date = min_event_date

The final solution involves two levels of subquery, I think, with the UNION at the innermost level:

SELECT contact_id, event_date, event_id
FROM (
  SELECT
    contact_id,
    event_date,
    MIN(event_date) OVER (PARTITION BY contact_id) AS min_event_date,
    event_id
  FROM (
    SELECT
      table1.contact_id AS contact_id,
      table1.date_received AS event_date,
      table1.event_id AS event_id
    FROM table1
    UNION
    SELECT
      table2.contact_id AS contact_id,
      table2.date_received AS event_date,
      table2.event_id AS event_id
    FROM table2) AS all_events) AS with_min
WHERE event_date = min_event_date
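One caveat with filtering on `event_date = min_event_date`: if a contact has two events on the same earliest date, both rows come back. A minimal sketch of one way to guarantee a single row per contact, shown for table1 only, is to rank with ROW_NUMBER() and add a tiebreaker. Using event_id as the tiebreaker is an assumption made here for illustration; any column that makes the ordering unique would do.

```sql
-- Sketch only: returns exactly one row per contact, even on date ties.
SELECT contact_id, event_date, event_id
FROM (
    SELECT contact_id,
           date_received AS event_date,
           event_id,
           ROW_NUMBER() OVER (PARTITION BY contact_id
                              ORDER BY date_received, event_id) AS rn
    FROM table1
) AS ranked
WHERE rn = 1;
```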

Answer 2: (score: 0)

Ed Harper's approach is probably the best one.

As a wild alternative, just a fun shot in the dark, try this:

-- ALL is a reserved word in T-SQL, so the first CTE is named all_events
WITH all_events AS (
   SELECT tbl = 'table1', contact_id, date_received, event_id FROM table1
   UNION ALL SELECT 'table2', contact_id, date_received, event_id FROM table2
   UNION ALL SELECT 'table3', contact_id, date_received, event_id FROM table3
   UNION ALL SELECT 'table4', contact_id, date_received, event_id FROM table4
), ranked AS (
   SELECT
      *, flag = ROW_NUMBER() OVER (PARTITION BY contact_id ORDER BY date_received)
   FROM all_events
)
SELECT *
FROM ranked
WHERE flag = 1

It looks simple, but it may perform poorly. Give it a try and let us know how it does. :)