Find missing dates from a date range

Time: 2013-08-13 11:19:39

Tags: mysql date left-join

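I have a question about a query that fetches the dates that do not exist in a database table.

I have the following dates in the database.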

2013-08-02
2013-08-02
2013-08-02
2013-08-03
2013-08-05
2013-08-08
2013-08-08
2013-08-09
2013-08-10
2013-08-13
2013-08-13
2013-08-13

The result I want is shown below:

2013-08-01
2013-08-04
2013-08-06
2013-08-07
2013-08-11
2013-08-12

As you can see, the result has six dates that are not present in the database.

I tried the query below:

SELECT
    DISTINCT DATE(w1.start_date) + INTERVAL 1 DAY AS missing_date
FROM
    working w1
LEFT JOIN
    (SELECT DISTINCT start_date FROM working ) w2 ON DATE(w1.start_date) = DATE(w2.start_date) - INTERVAL 1 DAY
WHERE
    w1.start_date BETWEEN '2013-08-01' AND '2013-08-13'
AND
    w2.start_date IS NULL;

But the query above returns the following result.

2013-08-04
2013-08-14
2013-08-11
2013-08-06

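As you can see, it gives me four dates, of which 2013-08-14 is not wanted, and it still misses three dates (2013-08-01, 2013-08-07 and 2013-08-12). I think that is because of the LEFT JOIN.

Please review my query and let me know the best way to do this.

Thanks for looking and for your time.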
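A note on why the attempt above falls short, as a minimal Python sketch. Plain Python stands in for the SQL here, and the variable names are illustrative rather than taken from the post. The self-join effectively asks, for each stored date, whether the following day is also stored, and reports that following day when it is not. It can therefore only find the first day of each gap, it never finds a gap before the earliest stored date, and it reports the day after the latest stored date.

```python
from datetime import date, timedelta

# Distinct dates stored in the table, as listed in the question.
present = {date(2013, 8, d) for d in (2, 3, 5, 8, 9, 10, 13)}
one_day = timedelta(days=1)

# Equivalent of the self-join: for each stored date whose successor
# is not stored, report the successor.
self_join_result = sorted(d + one_day for d in present if d + one_day not in present)

print([d.isoformat() for d in self_join_result])
# ['2013-08-04', '2013-08-06', '2013-08-11', '2013-08-14']
```

This matches the four rows the query returned, which suggests the join condition rather than the data is the limiting factor.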

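6 answers:

Answer 0 (score: 16)

I guess you could always generate the date sequence and just use NOT IN to eliminate the dates that actually exist. This maxes out at a 1024-day range, but it is easy to shrink or extend. The date column is called "mydate" and lives in the table "Table1":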

SELECT * FROM (
  SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) day 
  FROM 
   (SELECT 0 t4    UNION ALL SELECT 1   UNION ALL SELECT 2   UNION ALL SELECT 3  ) t4,
   (SELECT 0 t16   UNION ALL SELECT 4   UNION ALL SELECT 8   UNION ALL SELECT 12 ) t16,   
   (SELECT 0 t64   UNION ALL SELECT 16  UNION ALL SELECT 32  UNION ALL SELECT 48 ) t64,      
   (SELECT 0 t256  UNION ALL SELECT 64  UNION ALL SELECT 128 UNION ALL SELECT 192) t256,     
   (SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024     
  ) b 
WHERE day NOT IN (SELECT mydate FROM Table1) AND day<'2013-08-13';

From the "I'd add an SQLfiddle if it wasn't down" department.
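To see why the five cross-joined derived tables give a 1024-day range, here is a small Python check (an illustration only, not part of the answer; the tuple names simply mirror the SQL aliases). Each table contributes one base-4 digit, so every sum from 0 to 1023 appears exactly once.

```python
from itertools import product

# One tuple per derived table in the query; each holds a base-4 digit
# already multiplied by its place value.
t4 = (0, 1, 2, 3)
t16 = (0, 4, 8, 12)
t64 = (0, 16, 32, 48)
t256 = (0, 64, 128, 192)
t1024 = (0, 256, 512, 768)

# The cross join yields every combination; the query adds the five values.
offsets = sorted(sum(combo) for combo in product(t4, t16, t64, t256, t1024))

print(len(offsets), offsets[0], offsets[-1])  # 1024 0 1023
```

Adding a sixth table with the values 0, 1024, 2048 and 3072 would extend the range to 4096 days in the same way.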

Thanks for the help. Here is the query I ended up with, and it works:

SELECT * FROM
(
    SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) missingDates 
        FROM 
    (SELECT 0 t4    UNION ALL SELECT 1   UNION ALL SELECT 2   UNION ALL SELECT 3  ) t4,
    (SELECT 0 t16   UNION ALL SELECT 4   UNION ALL SELECT 8   UNION ALL SELECT 12 ) t16,   
    (SELECT 0 t64   UNION ALL SELECT 16  UNION ALL SELECT 32  UNION ALL SELECT 48 ) t64,      
    (SELECT 0 t256  UNION ALL SELECT 64  UNION ALL SELECT 128 UNION ALL SELECT 192) t256,     
    (SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024     
) b 
WHERE
    missingDates NOT IN (SELECT DATE_FORMAT(start_date,'%Y-%m-%d')
            FROM
                working GROUP BY start_date)
    AND
    missingDates < '2013-08-13';

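Answer 1 (score: 3)

I would probably create a dedicated Calendar table so that I can use it in a LEFT JOIN.

You could create the table on a per-need basis, but since it will not hold much data, the simplest and probably most efficient approach is to create it once and for all, as shown below, using a stored procedure: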

--
-- Create a dedicated "Calendar" table
--
CREATE TABLE Calendar (day DATE PRIMARY KEY);

DELIMITER //
CREATE PROCEDURE init_calendar(IN pStart DATE, IN pEnd DATE)
BEGIN
    SET @theDate := pStart;
    REPEAT
        -- INSERT IGNORE lets init_calendar be called again
        -- to extend the calendar range without having
        -- to worry about dates that are already present
        INSERT IGNORE INTO Calendar VALUES (@theDate);
        SET @theDate := @theDate + INTERVAL 1 DAY;
    UNTIL @theDate > pEnd END REPEAT;
END; //
DELIMITER ;

CALL init_calendar('2010-01-01','2015-12-31');

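In this example, the calendar covers 2191 consecutive days, which at a rough estimate takes less than 15 KB. Storing every date of the 21st century would take less than 300 KB...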
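The day count is easy to double-check. The sketch below also gives a lower bound on the raw storage, assuming MySQL's documented 3 bytes per DATE value and ignoring row and index overhead, which is why the figures in the answer are higher. Taking the 21st century as 2001 through 2100 is a convention chosen here, not something the answer states.

```python
from datetime import date

BYTES_PER_DATE = 3  # MySQL DATE storage size; overhead is ignored here


def inclusive_days(start, end):
    """Number of calendar days from start to end, counting both ends."""
    return (end - start).days + 1


calendar_days = inclusive_days(date(2010, 1, 1), date(2015, 12, 31))
century_days = inclusive_days(date(2001, 1, 1), date(2100, 12, 31))

print(calendar_days, calendar_days * BYTES_PER_DATE)  # 2191 6573
print(century_days, century_days * BYTES_PER_DATE)    # 36524 109572
```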

Now, here is the actual data table as you described it in your question:

--
-- *Your* actual data table
--
CREATE TABLE tbl (theDate DATE);
INSERT INTO tbl VALUES 
    ('2013-08-02'),
    ('2013-08-02'),
    ('2013-08-02'),
    ('2013-08-03'),
    ('2013-08-05'),
    ('2013-08-08'),
    ('2013-08-08'),
    ('2013-08-09'),
    ('2013-08-10'),
    ('2013-08-13'),
    ('2013-08-13'),
    ('2013-08-13');

And finally, the query:

--
-- Now the query to find date not "in range"
--

SET @start = '2013-08-01';
SET @end = '2013-08-13';

SELECT Calendar.day FROM Calendar LEFT JOIN tbl
    ON Calendar.day = tbl.theDate
    WHERE Calendar.day BETWEEN @start AND @end
    AND tbl.theDate IS NULL;

Producing:

+------------+
| day        |
+------------+
| 2013-08-01 |
| 2013-08-04 |
| 2013-08-06 |
| 2013-08-07 |
| 2013-08-11 |
| 2013-08-12 |
+------------+
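For anyone who wants to try the calendar-table idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL, dates are stored as ISO text, and a Python loop replaces the stored procedure; those are all assumptions of this sketch. The LEFT JOIN ... IS NULL query itself is the same shape as above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Calendar (day TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE tbl (theDate TEXT)")

# Fill the calendar for the range of interest (stand-in for init_calendar).
start, end = date(2013, 8, 1), date(2013, 8, 13)
days = [(start + timedelta(days=i)).isoformat() for i in range((end - start).days + 1)]
conn.executemany("INSERT OR IGNORE INTO Calendar VALUES (?)", [(d,) for d in days])

# The data from the question, duplicates included.
stored = [2, 2, 2, 3, 5, 8, 8, 9, 10, 13, 13, 13]
conn.executemany(
    "INSERT INTO tbl VALUES (?)",
    [(date(2013, 8, d).isoformat(),) for d in stored],
)

# Calendar days with no matching row in tbl are the missing dates.
rows = conn.execute(
    """
    SELECT Calendar.day
    FROM Calendar LEFT JOIN tbl ON Calendar.day = tbl.theDate
    WHERE Calendar.day BETWEEN ? AND ?
      AND tbl.theDate IS NULL
    ORDER BY Calendar.day
    """,
    (start.isoformat(), end.isoformat()),
)
missing = [row[0] for row in rows]
print(missing)
```

Because unmatched calendar rows appear once regardless of how many duplicates the data table holds, no DISTINCT is needed.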

Answer 2 (score: 2)

I would do it like this:

$db_dates = array (
'2013-08-02',
'2013-08-03',
'2013-08-05',
'2013-08-08',
'2013-08-09',
'2013-08-10',
'2013-08-13'
);
$missing = array();
$month = "08";
$year = "2013";
$day_start = 1;
$day_end = 14; // exclusive upper bound: the loop checks days 1 to 13
for ($i=$day_start; $i<$day_end; $i++) {
    $day = $i;
    if ($i<10) {
        $day = "0".$i;  
    }
    $check_date = $year."-".$month."-".$day;
    if (!in_array($check_date, $db_dates)) {
        array_push($missing, $check_date);  
    }
}
print_r($missing);

I only wrote it for this interval, but you can define another interval or make it work for a whole year.
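The same application-side idea can be generalised to any interval by working with date objects instead of building strings, which also handles month and year boundaries. This is a sketch in Python rather than PHP, and `missing_dates` is a hypothetical helper name introduced here.

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return the dates in [start, end], both inclusive, that are not in `present`."""
    present = set(present)
    span = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(span))
    return [d for d in candidates if d not in present]


# Usage: an interval that crosses a month boundary.
stored = [date(2013, 7, 30), date(2013, 8, 2)]
gaps = missing_dates(stored, date(2013, 7, 29), date(2013, 8, 2))
print([d.isoformat() for d in gaps])
# ['2013-07-29', '2013-07-31', '2013-08-01']
```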

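Answer 3 (score: 0)

The way I would solve this in a data-warehouse type of situation is to populate a "static" table with dates over an appropriate period (example scripts for this kind of thing are easy to find) and then left outer join or right outer join your table to it: the rows with no match are the missing dates.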

Answer 4 (score: 0)

DECLARE @date date;
declare @dt_cnt int = 0;
set @date='2014-11-1';
while @date < '2014-12-31'
begin
  select @dt_cnt = COUNT(att_id) from date_table where att_date=@date ;

      if(@dt_cnt = 0) 
      BEGIN
         print @date
      END
      set @date = DATEADD(day,1,@date);
end

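Answer 5 (score: 0)

In case anyone wants more than 1024 days (or hours), I am adding this to Dipesh's excellent answer. The query below generates 279936 hours, running from 2015 into 2046: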

SELECT
  DATE_ADD('2015-01-01', INTERVAL
    POWER(6,6)*t6 + POWER(6,5)*t5 + POWER(6,4)*t4 + POWER(6,3)*t3 + POWER(6,2)*t2 +
    POWER(6,1)*t1 + t0
  HOUR) AS period
FROM
 (SELECT 0 t0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t0,
 (SELECT 0 t1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t1,
 (SELECT 0 t2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t2,
 (SELECT 0 t3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t3,
 (SELECT 0 t4 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t4,
 (SELECT 0 t5 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t5,
 (SELECT 0 t6 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t6
 ORDER BY period

Just plug it into the query from that answer.
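The figure of 279936 comes from seven base-6 digits, since 6 to the power 7 is 279936. A quick Python check (illustrative only, with names chosen here) confirms that the weighted sums cover every hour offset once and shows where the sequence ends.

```python
from datetime import datetime, timedelta
from itertools import product

# Seven digits t0..t6, each 0..5, weighted by 6**0 .. 6**6 as in the query.
offsets = sorted(
    sum(digit * 6 ** k for k, digit in enumerate(digits))
    for digits in product(range(6), repeat=7)
)

first_period = datetime(2015, 1, 1) + timedelta(hours=offsets[0])
last_period = datetime(2015, 1, 1) + timedelta(hours=offsets[-1])

print(len(offsets))   # 279936
print(first_period)   # 2015-01-01 00:00:00
print(last_period)    # 2046-12-07 23:00:00
```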