What is a quick way to select distinct orderIDs in this case?

Time: 2018-02-08 12:48:12

Tags: sql sql-server datetime

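Suppose I have some orderIDs in SQL Server. Each orderID can have multiple timestamps, and a timestamp can be NULL. What is a quick way to find the distinct orderIDs for which:
  • all timestamps are non-NULL
  • the latest timestamp is from yesterday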

Example:

Today is 2018-02-08. With the data below, the result should be OrderID 11.

OrderID     | Timestamp
------------+-------------------------
   11       | 2018-02-07 10:08:52.740    
   11       | 2018-02-06 10:08:52.740    
   22       | 2018-02-03 10:08:52.740    
   22       | 2018-02-04 10:08:52.740    
   33       | 2018-02-07 10:08:52.740    
   33       | NULL

PS: the table has about 1 billion rows. Each orderID has about 3 to 4 timestamps.

4 answers:

Answer 0 (score: 3)

This is a group by with a having clause:

select orderid
from t
group by orderid
having count(timestamp) = count(*) and  -- no NULLs
       max(timestamp) >= dateadd(day, -1, cast(getdate() as date)) and
       max(timestamp) < cast(getdate() as date);

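Moving some of the logic into the where clause may be slightly faster. This works because only NULL timestamps and yesterday's timestamps matter, provided no row is dated today or later (such rows are filtered out before the having clause can see them):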

select orderid
from t
where timestamp is null or
      (timestamp >= dateadd(day, -1, cast(getdate() as date)) and
       timestamp < cast(getdate() as date)
      )
group by orderid
having count(timestamp) = count(*) and  -- no NULLs
       max(timestamp) is not null;

Filtering before aggregating can speed up the query.
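
The two queries above can be compared on the question's sample data. This is a minimal sketch, not the answerer's code: it uses Python's sqlite3 module in place of SQL Server, names the column ts, and passes the day boundaries as :yesterday and :today parameters instead of calling getdate(). Those names, and the extra order 44, are assumptions made for illustration.

```python
import sqlite3

# Sample rows from the question; None stands for a NULL timestamp.
ROWS = [
    (11, "2018-02-07 10:08:52.740"),
    (11, "2018-02-06 10:08:52.740"),
    (22, "2018-02-03 10:08:52.740"),
    (22, "2018-02-04 10:08:52.740"),
    (33, "2018-02-07 10:08:52.740"),
    (33, None),
]

# Hypothetical order (not in the question): one row yesterday, one row today.
EXTRA = [
    (44, "2018-02-07 23:00:00.000"),
    (44, "2018-02-08 09:00:00.000"),
]

# Aggregate everything, then test the group.
PLAIN = """
SELECT orderid FROM t
GROUP BY orderid
HAVING COUNT(ts) = COUNT(*)
   AND MAX(ts) >= :yesterday
   AND MAX(ts) < :today
ORDER BY orderid
"""

# Keep only NULL rows and yesterday's rows, then aggregate.
FILTERED = """
SELECT orderid FROM t
WHERE ts IS NULL OR (ts >= :yesterday AND ts < :today)
GROUP BY orderid
HAVING COUNT(ts) = COUNT(*)
   AND MAX(ts) IS NOT NULL
ORDER BY orderid
"""


def run(query, rows, today="2018-02-08", yesterday="2018-02-07"):
    """Load rows into an in-memory table and return the matching order IDs."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (orderid INTEGER, ts TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    params = {"today": today, "yesterday": yesterday}
    result = [row[0] for row in con.execute(query, params)]
    con.close()
    return result


if __name__ == "__main__":
    print(run(PLAIN, ROWS), run(FILTERED, ROWS))
    print(run(PLAIN, ROWS + EXTRA), run(FILTERED, ROWS + EXTRA))
```

On the sample data both variants return order 11. With the hypothetical order 44 added they differ: the filtered variant drops today's row before grouping, so it reports order 44 as well, while the unfiltered variant still returns only order 11.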

Answer 1 (score: 3)

Although the straightforward solution is GROUP BY / HAVING / MIN / MAX, you need a WHERE clause when you are dealing with a billion rows:

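  1. Find the rows whose timestamp is no older than yesterday (today - 1).
  2. Group them and keep the orders whose maximum timestamp is earlier than today (this ensures the latest row is from yesterday, not today).
  3. For the resulting order IDs, check that COUNT(*) equals COUNT(Timestamp), i.e. that no timestamp is NULL.

    WITH cte AS (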
        SELECT OrderID
        FROM testdata
        WHERE Timestamp >= CAST(CURRENT_TIMESTAMP - 1 AS DATE)
        GROUP BY OrderID
        HAVING MAX(Timestamp) < CAST(CURRENT_TIMESTAMP AS DATE)
    )
    SELECT testdata.OrderID
    FROM cte
    INNER JOIN testdata ON cte.OrderID = testdata.OrderID
    GROUP BY testdata.OrderID
    HAVING COUNT(*) = COUNT(Timestamp)
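
The three steps can be tried out on the question's sample data. The following is a sketch under stated assumptions rather than the answer's exact query: it runs on SQLite through Python's sqlite3 module, the column is named ts, and the function name orders_ending_yesterday and the :today / :yesterday parameters are hypothetical stand-ins for CURRENT_TIMESTAMP arithmetic.

```python
import sqlite3
from datetime import date, timedelta

# Sample rows from the question; None stands for a NULL timestamp.
ROWS = [
    (11, "2018-02-07 10:08:52.740"),
    (11, "2018-02-06 10:08:52.740"),
    (22, "2018-02-03 10:08:52.740"),
    (22, "2018-02-04 10:08:52.740"),
    (33, "2018-02-07 10:08:52.740"),
    (33, None),
]

QUERY = """
WITH cte AS (
    -- Steps 1 and 2: recent rows only, latest one must be before today.
    SELECT orderid
    FROM testdata
    WHERE ts >= :yesterday
    GROUP BY orderid
    HAVING MAX(ts) < :today
)
-- Step 3: re-read all rows of the candidates and reject any NULL.
SELECT testdata.orderid
FROM cte
INNER JOIN testdata ON cte.orderid = testdata.orderid
GROUP BY testdata.orderid
HAVING COUNT(*) = COUNT(testdata.ts)
ORDER BY testdata.orderid
"""


def orders_ending_yesterday(rows, today):
    """Return order IDs whose latest timestamp is the day before `today`
    and which have no NULL timestamp."""
    yesterday = today - timedelta(days=1)
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE testdata (orderid INTEGER, ts TEXT)")
    con.executemany("INSERT INTO testdata VALUES (?, ?)", rows)
    # ISO date strings compare correctly against the ISO timestamp text.
    params = {"today": today.isoformat(), "yesterday": yesterday.isoformat()}
    result = [row[0] for row in con.execute(QUERY, params)]
    con.close()
    return result


if __name__ == "__main__":
    print(orders_ending_yesterday(ROWS, date(2018, 2, 8)))
```

The point of the two-stage shape is that the CTE touches only recent rows, and the NULL check then runs only for the few order IDs that survive it.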
    

Answer 2 (score: 1)

You can also try this query:

;WITH CTE AS (
    SELECT OrderID, 
           FIRST_VALUE(Timestamp) OVER (PARTITION BY OrderID
                                        ORDER BY CASE 
                                                    WHEN Timestamp IS NULL THEN 0
                                                    ELSE 1
                                                 END, Timestamp DESC) AS first_timestamp
    FROM mytable
)
SELECT DISTINCT OrderID
FROM CTE 
WHERE  first_timestamp >= DATEADD(DAY, -1, CAST(GETDATE() as DATE)) AND
       first_timestamp < CAST(GETDATE() AS DATE);

Edit

Assuming there are no future dates, you can use the following NOT EXISTS query as a replacement for the GROUP BY query:

SELECT DISTINCT OrderID
FROM mytable AS t
WHERE Timestamp >= DATEADD(DAY, -1, CAST(GETDATE() AS DATE)) AND

      -- Exclude OrderID slices that contain at least one `NULL` Timestamp
      NOT EXISTS (SELECT 1
                  FROM mytable AS x
                  WHERE x.OrderID = t.OrderID AND
                        x.Timestamp IS NULL)

      AND 

      -- Exclude OrderID slices with today's date, or any other future date,
      -- as last Timestamp 
      NOT EXISTS (SELECT 1
                  FROM mytable AS x
                  WHERE x.OrderID = t.OrderID AND
                        x.Timestamp >= CAST(GETDATE() AS DATE))
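
As a rough check of the NOT EXISTS logic, the sketch below runs an equivalent query on the question's sample data. It is an illustration under assumptions, not the answer's demo: it uses SQLite via Python's sqlite3 module, a column named ts, a hypothetical helper find_orders, and :today / :yesterday parameters in place of GETDATE().

```python
import sqlite3
from datetime import date, timedelta

# Sample rows from the question; None stands for a NULL timestamp.
ROWS = [
    (11, "2018-02-07 10:08:52.740"),
    (11, "2018-02-06 10:08:52.740"),
    (22, "2018-02-03 10:08:52.740"),
    (22, "2018-02-04 10:08:52.740"),
    (33, "2018-02-07 10:08:52.740"),
    (33, None),
]

QUERY = """
SELECT DISTINCT t.orderid
FROM mytable AS t
WHERE t.ts >= :yesterday
  -- no row of this order has a NULL timestamp
  AND NOT EXISTS (SELECT 1 FROM mytable AS x
                  WHERE x.orderid = t.orderid AND x.ts IS NULL)
  -- no row of this order is dated today or later
  AND NOT EXISTS (SELECT 1 FROM mytable AS x
                  WHERE x.orderid = t.orderid AND x.ts >= :today)
ORDER BY t.orderid
"""


def find_orders(rows, today):
    """Return order IDs with no NULL timestamp whose latest row is from
    the day before `today`."""
    yesterday = today - timedelta(days=1)
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (orderid INTEGER, ts TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    params = {"today": today.isoformat(), "yesterday": yesterday.isoformat()}
    result = [row[0] for row in con.execute(QUERY, params)]
    con.close()
    return result


if __name__ == "__main__":
    print(find_orders(ROWS, date(2018, 2, 8)))
```

Both subqueries are qualified with the inner alias x, so the second probe checks the other rows of the same order rather than the outer row.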

Answer 3 (score: 0)

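Select distinct orderid from mytable
where 
(day(timestamp)=day(DATEADD(day, -1, getdate()))
and month(timestamp)=month(DATEADD(day, -1, getdate()))
and year(timestamp)=year(DATEADD(day, -1, getdate()))
) and orderid not in (select orderid from mytable where timestamp is null)

This returns every order ID that has a timestamp from yesterday.

Edit: to keep only the orders without a single NULL value, I added

and orderid not in (select orderid from mytable where timestamp is null)

The subquery selects every orderid that has a NULL timestamp, so those orders are excluded. Note that the subquery needs its own from clause; without one it only looks at the current outer row and excludes nothing.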