Question

I have a large table that I've been working with in Query Analyzer, and I'm looking for the best way to do this.

The table looks like this:

name         rows        reserved     data         index_size   unused
table_name   110980132   7802944 KB   6119784 KB   1679320 KB   3840 KB

It has the following columns:

 ID int, time_stamp datetime, value1 float, value2 float, value3 float....

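The time_stamp values are dates with times. I need a simple way, without storing anything, to get just the date part from the table. Eventually I may need the date + hour part (rather than the whole time part). For now, I just need to know which days we have data for over the last 30 days (at the moment a few days are sometimes missing; this question/query is ultimately not just about the last x days, but about all days, or whatever range you like).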

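Considering performance and time, what is the best way to do this? I've played with GROUP BY, DISTINCT, TOP x, RANK(), temp tables, views... Some approaches work better than others, but nothing I've tried seems great.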

Ideas? Thanks!

Answer 1

-- Get the earliest date (without time) you want: midnight, 30 days ago
DECLARE @smallestDate datetime = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 30, 0)

-- Select the distinct dates
SELECT DISTINCT DATEADD(DAY, DATEDIFF(DAY, 0, time_stamp), 0) AS [Date]
FROM yourTable
WHERE time_stamp >= @smallestDate

Here are some performance comparisons: Most efficient way in SQL Server to get date from date+time?
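The same DATEADD/DATEDIFF pattern should extend to the date + hour case mentioned in the question. A minimal sketch, assuming the same yourTable placeholder and SQL Server 2008 or later (for the inline DECLARE); @since and [DateHour] are names made up for this illustration:

```sql
-- Sketch only: yourTable and time_stamp are placeholders from the answer above;
-- @since and [DateHour] are hypothetical names.

-- Midnight, 30 days ago.
DECLARE @since datetime = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 30, 0);

-- One row per distinct date + hour that has data.
-- DATEDIFF(HOUR, 0, ...) counts whole hours since 1900-01-01, and DATEADD
-- turns that count back into a datetime truncated to the hour.
SELECT DISTINCT DATEADD(HOUR, DATEDIFF(HOUR, 0, time_stamp), 0) AS [DateHour]
FROM yourTable
WHERE time_stamp >= @since
ORDER BY [DateHour];
```

Keeping the WHERE clause on the raw time_stamp column, rather than on the truncated expression, leaves the predicate usable for an index seek.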

Answer 2

If you are willing to use a T-SQL batch rather than a single query, you can take advantage of the index like this:

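-- Temp table that collects one row per day that has data
create table #tmp (date datetime primary key clustered);
declare @pivot datetime;  -- declared but not used in this version

-- Seed with the most recent day; the int day count converts implicitly
-- to a datetime at midnight of that day
insert #tmp
select top(1) datediff(d, 0, time_stamp)
from tbl
order by time_stamp desc;

-- Each pass seeks to the newest row before the earliest day found so far
while @@rowcount > 0 and (select count(*) from #tmp) < 30
begin
    insert #tmp
    select top(1) datediff(d, 0, time_stamp)
    from tbl
    where time_stamp < (select min(date) from #tmp)
    order by time_stamp desc;
end;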

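All this requires is a good index on time_stamp, and it will perform exactly 30 seeks (or fewer) on that index. Very surgical and fast. I put it together as a concept, so obviously the 2 scalar subqueries in there can easily be optimized.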
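One possible way to remove the scalar subqueries is to carry the current day in a variable and count iterations directly. This is a sketch of that idea, not a tested replacement: #days, @found and the index name are hypothetical, and tbl is the placeholder used above.

```sql
-- Assumes an index on time_stamp exists, for example (hypothetical name):
--   CREATE INDEX IX_tbl_time_stamp ON tbl (time_stamp);

CREATE TABLE #days (date datetime PRIMARY KEY CLUSTERED);
DECLARE @pivot datetime, @found int = 0;

-- Most recent day that has data, truncated to midnight.
-- @pivot stays NULL if the table is empty.
SELECT TOP (1) @pivot = DATEADD(DAY, DATEDIFF(DAY, 0, time_stamp), 0)
FROM tbl
ORDER BY time_stamp DESC;

WHILE @pivot IS NOT NULL AND @found < 30
BEGIN
    INSERT #days (date) VALUES (@pivot);
    SET @found += 1;

    -- Seek to the newest row strictly before the current day;
    -- the subquery yields NULL when no earlier rows exist, ending the loop.
    SET @pivot = (SELECT TOP (1) DATEADD(DAY, DATEDIFF(DAY, 0, time_stamp), 0)
                  FROM tbl
                  WHERE time_stamp < @pivot
                  ORDER BY time_stamp DESC);
END;

SELECT date FROM #days ORDER BY date DESC;
```

As in the original batch, this returns the 30 most recent days that actually contain data, not the last 30 calendar days, so gaps in the data push the result further back in time.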

Select distinct dates (not time) from very large table with datetimes
