Ignoring duplicate records in SQL

Time: 2017-11-13 14:28:30

Tags: sql tsql datetime duplicates window-functions

Need some help :)

So I have a table of records with the following columns:

Key (PK, FK, int) DT (smalldatetime) Value (real)

DT is a datetime for every half hour of the day, with an associated value.

E.g.

Key       DT                       VALUE
1000      2010-01-01 08:00:00      80
1000      2010-01-01 08:30:00      75
1000      2010-01-01 09:00:00      100

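I have a query that finds the maximum value and its associated time for each 24-hour period. However, on some days the maximum occurs twice, so those dates are duplicated, which causes problems in later processing. I've tried using ROW_NUMBER(), but I can't use a computed column in the WHERE clause? Currently I have: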

SELECT       cast(T1.DT as date) as 'Date',Cast(T1.DT as time(0)) as 'HH', ROW_NUMBER() over (PARTITION BY  cast(DT as date) ORDER BY DT) AS 'RowNumber'
FROM        TABLE_1 AS T1
INNER JOIN  (
                SELECT CAST([DT] as date) as 'DATE'
                ,       MAX([VALUE]) as 'MAX_HH'
                FROM    TABLE_1
                WHERE   DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) AS MAX_DT
        ON  MAX_DT.[DATE] = CAST(T1.[DT] as date)
        AND T1.VALUE = MAX_DT.MAX_HH
WHERE       DT > '6-nov-2016' and [KEY] = '1000'
ORDER BY DT

This results in:

Key       DT               VALUE       HH
1000      2010-01-01       80          07:00:00
1000      2010-02-01       100         17:30:00
1000      2010-02-01       100         18:00:00

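I need to remove the duplicate dates (I have no preference as to which HH is kept).

I think I've explained this very badly; let me know if it makes no sense and I'll try to rewrite it.

Any ideas?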
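
On the ROW_NUMBER() point: a window function cannot be referenced in the WHERE clause of the same SELECT, but it can be filtered once the query is wrapped in a CTE or derived table. The following is a minimal sketch of that idea, not a tested solution. It assumes the table and column names from the question, that a single [Key] is being filtered (so partitioning by date alone is enough), and that the earliest time should win a tie; the CTE name Ranked is made up for illustration.

```sql
-- Sketch only: "Ranked" is a hypothetical CTE name.
WITH Ranked AS (
    SELECT [Key],
           DT,
           [Value],
           ROW_NUMBER() OVER (
               PARTITION BY CAST(DT AS date)   -- one numbering sequence per day
               ORDER BY [Value] DESC, DT       -- highest value first; earliest time breaks ties
           ) AS RowNumber
    FROM   TABLE_1
    WHERE  DT > '20161106' AND [Key] = 1000
)
SELECT [Key],
       CAST(DT AS date)    AS [Date],
       [Value],
       CAST(DT AS time(0)) AS HH
FROM   Ranked
WHERE  RowNumber = 1                           -- keep exactly one row per day
ORDER BY [Date];
```

Because the ordering inside the window is by value rather than by time, row number 1 is already the day's maximum, so the self-join on MAX([VALUE]) is no longer needed in this form.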

4 Answers:

Answer 0: (score: 0)

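You can try this. The new code is the MIN() around the HH column, which collapses each date to a single row with its earliest matching time:

SELECT       cast(T1.DT as date) as 'Date', MIN(Cast(T1.DT as time(0))) as 'HH'
FROM        TABLE_1 AS T1
INNER JOIN  (
                SELECT CAST([DT] as date) as 'DATE'
                ,       MAX([VALUE]) as 'MAX_HH'
                FROM    TABLE_1
                WHERE   DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) AS MAX_DT
        ON  MAX_DT.[DATE] = CAST(T1.[DT] as date)
        AND T1.VALUE = MAX_DT.MAX_HH
WHERE       DT > '6-nov-2016' and [KEY] = '1000'

Then put the GROUP BY here, after the WHERE clause. The ORDER BY has to use the grouped expression, because the raw DT column is no longer available once the rows are grouped:

GROUP BY cast(T1.DT as date)
ORDER BY cast(T1.DT as date)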

Answer 1: (score: 0)

I would do something like this. I haven't tried it, but I think it's right.

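SELECT  cast(T1.DT as date) as 'Date', Cast(T1.DT as time(0)) as 'HH', T1.[VALUE]
FROM    TABLE_1 T1
WHERE   T1.DT > '6-nov-2016' and T1.[KEY] = '1000'
  AND   T1.[DT] IN (
            -- select the latest DT for each day among the rows holding that day's max value
            -- (SQL Server has no row-value IN, so the (date, value) match is written as a join)
            SELECT MAX(T2.[DT]) as max_date
            FROM   TABLE_1 T2
            INNER JOIN (
                SELECT CAST([DT] as date) as 'CAST_DATE'
                ,      MAX([VALUE]) as 'MAX_HH'
                FROM   TABLE_1
                WHERE  DT > '6-nov-2016' and [KEY] = '1000'
                GROUP BY CAST([DT] as date)
            ) M
                ON  M.CAST_DATE = CAST(T2.[DT] as date)
                AND M.MAX_HH = T2.[VALUE]
            WHERE  T2.[KEY] = '1000'
            GROUP BY CAST(T2.[DT] as date)
        )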

Answer 2: (score: 0)

Change the JOIN to an APPLY.

The APPLY operation lets you limit the joined relation to just one result per source row.

SELECT v.[Key], cast(v.DT As Date) as "Date", v.[Value], cast(v.DT as Time(0)) as "HH"
FROM
(   -- First a projection to get just the exact dates you want
    SELECT DISTINCT [Key], CAST(DT as DATE) as DT 
    FROM Table_1 
    WHERE [Key] = '1000' AND DT > '20161106'
) dates
CROSS APPLY (
    -- Then use APPLY rather than JOIN to find just the exact one record you need for each date
    SELECT TOP 1 * 
    FROM Table_1 
    WHERE [Key] = dates.[Key] AND cast(DT as DATE) = dates.DT
    ORDER BY [Value] DESC
) v

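One final note: both this query and the sample query in the question will include values for November 6, 2016. The query says > 2016-11-06, an exclusive inequality, but the comparison is still made against full DateTime values, which means the date literal carries an implied 0 as its time component. So 12:01 AM on November 6 is still greater than 12:00:00 AM on November 6. If you want to exclude all November 6 rows from the query, you need to either compare against a time value at the end of that day, or cast DT to a date before making the > comparison.

Answer 3: (score: -1)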
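
The two fixes for the boundary can be sketched as follows. This is only an illustration under the question's table and column names; the first option uses the start of the next day rather than an end-of-day time value, which is an equivalent way of excluding the whole of November 6.

```sql
-- Option 1: compare against the start of the following day.
SELECT [Key], DT, [Value]
FROM   Table_1
WHERE  [Key] = 1000
  AND  DT >= '20161107';

-- Option 2: cast to date first, so the time part no longer takes part in the comparison.
SELECT [Key], DT, [Value]
FROM   Table_1
WHERE  [Key] = 1000
  AND  CAST(DT AS date) > '20161106';
```

Both forms return only rows from November 7 onward; which one reads better depends on how the rest of the query is written.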

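With SQL, you can use SELECT DISTINCT.

The SELECT DISTINCT statement is used to return only distinct (different) values.

Inside a table, a column often contains many duplicate values, and sometimes you only want to list the distinct values.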
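
As a rough sketch of how that might apply to the query in the question: DISTINCT compares whole output rows, so it only removes the repeated dates if the HH column is left out of the select list. The version below assumes that dropping HH is acceptable, which the question does not actually say.

```sql
-- Sketch only: HH is omitted on purpose, because two rows with different
-- times are not duplicates as far as DISTINCT is concerned.
SELECT DISTINCT
       CAST(T1.DT AS date) AS [Date],
       T1.[VALUE]
FROM   TABLE_1 AS T1
INNER JOIN (
           SELECT CAST(DT AS date) AS [DATE],
                  MAX([VALUE])     AS MAX_HH
           FROM   TABLE_1
           WHERE  DT > '6-nov-2016' AND [KEY] = '1000'
           GROUP BY CAST(DT AS date)
       ) AS MAX_DT
       ON  MAX_DT.[DATE] = CAST(T1.DT AS date)
       AND T1.[VALUE] = MAX_DT.MAX_HH
WHERE  T1.DT > '6-nov-2016' AND T1.[KEY] = '1000'
ORDER BY [Date];
```

If the time of the maximum is required in the output, DISTINCT alone will not collapse the rows, and an aggregate or a row-numbering approach is needed instead.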