T-SQL join query with distinct values from the second table

Date: 2017-11-09 06:00:30

Tags: sql-server tsql sql-server-2016

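I know this question may sound like a duplicate, but I have been through every question I could find; it may still be a duplicate of one I missed.

At face value my requirement seems trivial, but however I write it there is always some caveat that just doesn't work. I have tried GROUP BY, DISTINCT, JOIN, aggregate functions, etc.

Scenario: PRIMARYTABLE contains a set of campaigns, and SECONDARYTABLE contains the dates on which those campaigns were run. Each campaign can have multiple runs, and each run has its own SUBKEY.

Requirement: I need to be able to put the most recently run campaigns into a list so that users can more easily pick from the campaigns that get run most frequently.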

PRIMARYTABLE
KEYCOLUMN   INFOCOLUMN
100000      Test 1
100001      Test Campaign
100002      Test Image 2
100003      Test Img
100004      Image Test
100005      Test
100006      Test Image 3
100007      Test Image 4
100008      Test Image 5
100009      Image Comparison Test 2
100010      Testing
100011      Test Fields
100012      Test 5
100013      test

SECONDARYTABLE
KEYCOLUMN   SUBKEY  DATECOLUMN
100000      100000  2017-06-02 04:09:57.593
100001      100001  2017-06-19 12:09:54.093
100001      100002  2017-06-27 10:51:14.140
100004      100003  2017-06-27 12:33:47.747
100006      100004  2017-06-28 10:29:53.387
100007      100005  2017-06-28 10:36:23.710
100008      100006  2017-06-29 22:31:03.790
100009      100007  2017-06-29 23:07:52.870
100009      100010  2017-10-04 16:05:40.583
100009      100011  2017-10-04 16:09:55.470
100011      100008  2017-09-08 14:02:28.017
100012      100009  2017-09-11 16:17:23.870
100013      100012  2017-11-07 16:55:55.403
100013      100013  2017-11-08 15:37:16.430

Here is more or less what I had in mind.

SELECT DISTINCT( a.[INFOCOLUMN] )
FROM [PRIMARYTABLE] a
INNER JOIN [SECONDARYTABLE] b ON ( a.[KEYCOLUMN] = b.[KEYCOLUMN] )
ORDER BY a.[DATECOLUMN]

I'm hoping for a Homer Simpson "Doh!" moment once I see how it should be done.

Many thanks.

3 answers:

Answer 0 (score: 1)

You can try this:

DECLARE @PRIMARYTABLE TABLE
(
    [KEYCOLUMN] INT 
   ,[INFOCOLUMN] VARCHAR(24)
);

DECLARE @SECONDARYTABLE TABLE
(
    [KEYCOLUMN] INT 
   ,[SUBKEY] INT
   ,[DATECOLUMN] DATETIME2
);

INSERT INTO @PRIMARYTABLE ([KEYCOLUMN], [INFOCOLUMN])
VALUES (100000, 'Test 1')
      ,(100001, 'Test Campaign')
      ,(100002, 'Test Image 2')
      ,(100003, 'Test Img')
      ,(100004, 'Image Test')
      ,(100005, 'Test')
      ,(100006, 'Test Image 3')
      ,(100007, 'Test Image 4')
      ,(100008, 'Test Image 5')
      ,(100009, 'Image Comparison Test 2')
      ,(100010, 'Testing')
      ,(100011, 'Test Fields')
      ,(100012, 'Test 5')
      ,(100013, 'test');

INSERT INTO @SECONDARYTABLE ([KEYCOLUMN], [SUBKEY], [DATECOLUMN])
VALUES (100000, 100000, '2017-06-02 04:09:57.593')
      ,(100001, 100001, '2017-06-19 12:09:54.093')
      ,(100001, 100002, '2017-06-27 10:51:14.140')
      ,(100004, 100003, '2017-06-27 12:33:47.747')
      ,(100006, 100004, '2017-06-28 10:29:53.387')
      ,(100007, 100005, '2017-06-28 10:36:23.710')
      ,(100008, 100006, '2017-06-29 22:31:03.790')
      ,(100009, 100007, '2017-06-29 23:07:52.870')
      ,(100009, 100010, '2017-10-04 16:05:40.583')
      ,(100009, 100011, '2017-10-04 16:09:55.470')
      ,(100011, 100008, '2017-09-08 14:02:28.017')
      ,(100012, 100009, '2017-09-11 16:17:23.870')
      ,(100013, 100012, '2017-11-07 16:55:55.403')
      ,(100013, 100013, '2017-11-08 15:37:16.430');


SELECT a.[INFOCOLUMN] 
      ,b.[DATECOLUMN]
FROM @PRIMARYTABLE A
CROSS APPLY
(
    SELECT TOP 1 [DATECOLUMN]
    FROM @SECONDARYTABLE  B
    WHERE A.[KEYCOLUMN] = B.[KEYCOLUMN]
    ORDER BY [DATECOLUMN] DESC
) b;

This gives you the most recent run of each campaign. You can then filter by date, or add an ORDER BY and take the TOP N from the final query.

Or you can use ROW_NUMBER:
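As a rough sketch of that last step, the CROSS APPLY result can be sorted by the latest run date and capped with TOP. The rows below are a subset of the sample data above, re-declared so the batch runs on its own; the cutoff of 3 and the [LastRun] alias are assumptions made for illustration.

```sql
-- Subset of the sample data above, re-declared so this batch is self-contained.
DECLARE @PRIMARYTABLE TABLE ([KEYCOLUMN] INT, [INFOCOLUMN] VARCHAR(24));
DECLARE @SECONDARYTABLE TABLE ([KEYCOLUMN] INT, [SUBKEY] INT, [DATECOLUMN] DATETIME2);

INSERT INTO @PRIMARYTABLE ([KEYCOLUMN], [INFOCOLUMN])
VALUES (100001, 'Test Campaign')
      ,(100002, 'Test Image 2')            -- never run, so CROSS APPLY drops it
      ,(100009, 'Image Comparison Test 2')
      ,(100012, 'Test 5')
      ,(100013, 'test');

INSERT INTO @SECONDARYTABLE ([KEYCOLUMN], [SUBKEY], [DATECOLUMN])
VALUES (100001, 100001, '2017-06-19 12:09:54.093')
      ,(100001, 100002, '2017-06-27 10:51:14.140')
      ,(100009, 100007, '2017-06-29 23:07:52.870')
      ,(100009, 100011, '2017-10-04 16:09:55.470')
      ,(100012, 100009, '2017-09-11 16:17:23.870')
      ,(100013, 100013, '2017-11-08 15:37:16.430');

-- TOP (3) is an arbitrary cutoff chosen for illustration.
SELECT TOP (3)
       A.[INFOCOLUMN]
      ,L.[DATECOLUMN] AS [LastRun]
FROM @PRIMARYTABLE A
CROSS APPLY
(
    -- Latest run of the current campaign.
    SELECT TOP (1) B.[DATECOLUMN]
    FROM @SECONDARYTABLE B
    WHERE B.[KEYCOLUMN] = A.[KEYCOLUMN]
    ORDER BY B.[DATECOLUMN] DESC
) L
ORDER BY L.[DATECOLUMN] DESC;
```

With this subset the rows come back as 'test', 'Image Comparison Test 2', 'Test 5', newest run first.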

WITH DataSource AS
(
    SELECT A.[INFOCOLUMN]
          ,B.[DATECOLUMN]
          -- Order by run date, newest first, so that RowID = 1 is the latest run
          ,ROW_NUMBER() OVER (PARTITION BY A.[KEYCOLUMN] ORDER BY B.[DATECOLUMN] DESC) AS [RowID]
    FROM @PRIMARYTABLE A
    INNER JOIN @SECONDARYTABLE B
        ON A.[KEYCOLUMN] = B.[KEYCOLUMN]
)
SELECT [INFOCOLUMN]
      ,[DATECOLUMN]
FROM DataSource
WHERE [RowID] = 1;

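Answer 1 (score: 1)

Try this; it returns the list of campaigns ordered by their most recent run. Note that campaigns that have never run will not appear in the list. If you want them included, use a left join instead.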

SELECT a.[INFOCOLUMN] 
FROM   [PRIMARYTABLE] a 
 /* left */ JOIN [SECONDARYTABLE] b ON a.[KEYCOLUMN] = b.[KEYCOLUMN] 
group BY a.[infocolumn]
order by max(datecolumn) desc

Here is the stub I tested it with:

select 10000 id,'Campain A' cname into #a1 union all
select 10002,'Campain B' union all
select 10004,'Campain C' union all
select 10009,'Campain E' 

select 10000 id,'20170101' thedate into #a2 union all
select 10000,'20170102' union all
select 10009,'20170103' union all
select 10002,'20170104' union all
select 10004,'20170105' union all
select 10000,'20170201' union all
select 10000,'20170302' union all
select 10009,'20170403' union all
select 10002,'20170104' union all
select 10004,'20170205' union all
select 10000,'20170101' union all
select 10004,'20170302' union all
select 10000,'20170103' union all
select 10002,'20170404' union all
select 10002,'20170105' 

select #a1.cname
 from #a1 join #a2 on #a1.id = #a2.id 
 group by #a1.cname
 order by max(thedate) desc
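The left join variant mentioned above might look like the following sketch. It uses three rows from the question's sample data; the [LastRun] alias is an assumption, and grouping by [KEYCOLUMN] as well as [INFOCOLUMN] is my own change rather than part of the answer.

```sql
-- Three campaigns from the question's sample data; 100002 has no runs.
DECLARE @PRIMARYTABLE TABLE ([KEYCOLUMN] INT, [INFOCOLUMN] VARCHAR(24));
DECLARE @SECONDARYTABLE TABLE ([KEYCOLUMN] INT, [DATECOLUMN] DATETIME2);

INSERT INTO @PRIMARYTABLE ([KEYCOLUMN], [INFOCOLUMN])
VALUES (100000, 'Test 1')
      ,(100002, 'Test Image 2')
      ,(100013, 'test');

INSERT INTO @SECONDARYTABLE ([KEYCOLUMN], [DATECOLUMN])
VALUES (100000, '2017-06-02 04:09:57.593')
      ,(100013, '2017-11-07 16:55:55.403')
      ,(100013, '2017-11-08 15:37:16.430');

SELECT a.[INFOCOLUMN]
      ,MAX(b.[DATECOLUMN]) AS [LastRun]    -- NULL for campaigns that never ran
FROM @PRIMARYTABLE a
LEFT JOIN @SECONDARYTABLE b ON a.[KEYCOLUMN] = b.[KEYCOLUMN]
GROUP BY a.[KEYCOLUMN], a.[INFOCOLUMN]     -- key included so same-named campaigns stay separate
ORDER BY MAX(b.[DATECOLUMN]) DESC;         -- SQL Server sorts NULLs last when descending
```

This returns 'test', then 'Test 1', then 'Test Image 2' with a NULL [LastRun]. Grouping on the name alone would merge campaigns whose names compare equal, such as 'Test' and 'test' in the question's data under a case-insensitive collation.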

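Answer 2 (score: 1)

  1. "the most recently run campaigns" >> use row_number() over(.. order by ... DESC)
  2. "that get run the most frequently" >> use count(*) over(partition by ..)

With the window functions row_number() over() and count() over() you can select the "most recent" row for each campaign and order the result by "most frequent". Note that the DESCending date order is what makes "recent" = 1 the latest run.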

    select
           p.*, s.*
    from PRIMARYTABLE p
    inner join (
          select KEYCOLUMN, SUBKEY, DATECOLUMN
               , row_number() over(partition by KEYCOLUMN order by DATECOLUMN DESC) recent
               , count(*) over(partition by KEYCOLUMN) frequency
          from SECONDARYTABLE
          ) s on p.KEYCOLUMN = s.KEYCOLUMN  and s.recent = 1
    order by s.frequency DESC, p.INFOCOLUMN
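If only the latest date and the run count are needed (not the SUBKEY of the latest run), a plain GROUP BY in the derived table is one possible alternative to the window functions. This is a sketch of a different technique, not the answer's method; the [LastRun] and [Frequency] aliases and the tie-break on latest run date are assumptions.

```sql
-- Subset of the question's sample data, declared as table variables.
DECLARE @PRIMARYTABLE TABLE ([KEYCOLUMN] INT, [INFOCOLUMN] VARCHAR(24));
DECLARE @SECONDARYTABLE TABLE ([KEYCOLUMN] INT, [SUBKEY] INT, [DATECOLUMN] DATETIME2);

INSERT INTO @PRIMARYTABLE ([KEYCOLUMN], [INFOCOLUMN])
VALUES (100001, 'Test Campaign')
      ,(100009, 'Image Comparison Test 2')
      ,(100012, 'Test 5')
      ,(100013, 'test');

INSERT INTO @SECONDARYTABLE ([KEYCOLUMN], [SUBKEY], [DATECOLUMN])
VALUES (100001, 100001, '2017-06-19 12:09:54.093')
      ,(100001, 100002, '2017-06-27 10:51:14.140')
      ,(100009, 100007, '2017-06-29 23:07:52.870')
      ,(100009, 100010, '2017-10-04 16:05:40.583')
      ,(100009, 100011, '2017-10-04 16:09:55.470')
      ,(100012, 100009, '2017-09-11 16:17:23.870')
      ,(100013, 100012, '2017-11-07 16:55:55.403')
      ,(100013, 100013, '2017-11-08 15:37:16.430');

SELECT p.[KEYCOLUMN]
      ,p.[INFOCOLUMN]
      ,s.[LastRun]
      ,s.[Frequency]
FROM @PRIMARYTABLE p
INNER JOIN
(
    -- One row per campaign: latest run date and number of runs.
    SELECT [KEYCOLUMN]
          ,MAX([DATECOLUMN]) AS [LastRun]
          ,COUNT(*)          AS [Frequency]
    FROM @SECONDARYTABLE
    GROUP BY [KEYCOLUMN]
) s ON s.[KEYCOLUMN] = p.[KEYCOLUMN]
ORDER BY s.[Frequency] DESC, s.[LastRun] DESC;  -- ties broken by most recent run
```

With this subset the order is 100009 (3 runs), 100013 (2 runs, latest in November), 100001 (2 runs, latest in June), 100012 (1 run).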