I'm trying to write SQL that produces the following data:
Date Count
2018-09-24 2
2018-09-25 2
2018-09-26 2
2018-09-27 2
2018-09-28 2
2018-09-29 1
A sample of the base table I'm using:
ID StartDate EndDate
187267 2018-09-24 2018-10-01
187270 2018-09-24 2018-09-30
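So I'm trying to get the list of dates between the two dates, and then count how many base-table records cover each date.
I started out with a temp table and tried looping over the records to get the result, but I'm not sure that is the right approach.
So far I have this code: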
WITH ctedaterange
AS (SELECT [Dates] = (SELECT ea.StartWork
                      FROM EngagementAssignment ea
                      WHERE ea.EngagementAssignmentId IN (SELECT ea.EngagementAssignmentId
                                                          FROM EngagementLevel el
                                                          INNER JOIN EngagementAssignment ea ON el.EngagementLevelID = ea.EngagementLevelId
                                                          WHERE el.JobID = 15072 AND ea.AssetId IS NOT NULL))
    UNION ALL
    SELECT [Dates] + 1
    FROM ctedaterange
    WHERE [Dates] + 1 <= (SELECT ea.EndWork
                          FROM EngagementAssignment ea
                          WHERE ea.EngagementAssignmentId IN (SELECT ea.EngagementAssignmentId
                                                              FROM EngagementLevel el
                                                              INNER JOIN EngagementAssignment ea ON el.EngagementLevelID = ea.EngagementLevelId
                                                              WHERE el.JobID = 15072 AND ea.AssetId IS NOT NULL)))
SELECT [Dates], COUNT([Dates])
FROM ctedaterange
GROUP BY [Dates]
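But I get this error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <=, >, >= or when the subquery is used as an expression.
When I use a job for which the subselect in the WHERE clause returns only one record, I get the correct result. That is, this query: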
SELECT ea.EngagementAssignmentId
FROM EngagementLevel el INNER JOIN
EngagementAssignment ea ON el.EngagementLevelID = ea.EngagementLevelId
WHERE el.JobID = 15047 and ea.AssetId IS NOT NULL
returns a single record.
The result is as follows:
Dates (No column name)
2018-09-24 02:00:00.000 1
2018-09-25 02:00:00.000 1
2018-09-26 02:00:00.000 1
2018-09-27 02:00:00.000 1
2018-09-28 02:00:00.000 1
2018-09-29 02:00:00.000 1
2018-09-30 02:00:00.000 1
2018-10-01 02:00:00.000 1
Answer 0 (score: 0)
Try this (demo). Output:
sdate total
2018-09-24 2
2018-09-25 2
2018-09-26 2
2018-09-27 2
2018-09-28 2
2018-09-29 2
2018-09-30 1
Answer 1 (score: 0)
You can generate the dates for your range by changing the from and to dates:
DECLARE
@DateFrom DATETIME = GETDATE(),
@DateTo DATETIME = '2018-10-30';
WITH DateGenerate
AS (
SELECT @DateFrom as MyDate
UNION ALL
SELECT DATEADD(DAY, 1, MyDate)
FROM DateGenerate
WHERE MyDate < @DateTo
)
SELECT
MyDate
FROM
DateGenerate;
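To tie this back to the question, here is a minimal sketch of the same recursive idea applied per record. It assumes the two sample rows from the question are held in a table variable named @T (a hypothetical stand-in for the real EngagementAssignment data). Because the anchor member returns one row per source record, no scalar subquery is needed, which is what raised the "Subquery returned more than 1 value" error.

```sql
-- Sample rows from the question; @T is an illustrative stand-in for the real tables.
DECLARE @T TABLE (ID int, StartDate date, EndDate date);
INSERT INTO @T (ID, StartDate, EndDate) VALUES
    (187267, '2018-09-24', '2018-10-01'),
    (187270, '2018-09-24', '2018-09-30');

WITH DateRange AS (
    -- Anchor member: one row per source record, starting at its StartDate.
    SELECT ID, StartDate AS TheDate, EndDate
    FROM @T
    UNION ALL
    -- Recursive member: advance each record one day until its own EndDate.
    SELECT ID, DATEADD(DAY, 1, TheDate), EndDate
    FROM DateRange
    WHERE TheDate < EndDate
)
SELECT TheDate, COUNT(*) AS NumberOfRecords
FROM DateRange
GROUP BY TheDate
ORDER BY TheDate
OPTION (MAXRECURSION 0); -- removes the default limit of 100 recursion levels
```

With the sample rows this should give a count of 2 for 2018-09-24 through 2018-09-30 and 1 for 2018-10-01. The MAXRECURSION hint only matters when a single record spans more than 100 days.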
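Answer 2 (score: 0)
Well, if your date range is short, you can use a recursive CTE as shown in the other answers. The problem with a recursive CTE is that it becomes inefficient once the range gets large, so I want to show you a different approach that builds the calendar CTE without recursion.
First, create and populate a sample table (please save us this step in your future questions):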
DECLARE @T AS TABLE
(
ID int,
StartDate date,
EndDate date
)
INSERT INTO @T (ID, StartDate, EndDate) VALUES
(187267, '2018-09-24', '2018-10-01'),
(187270, '2018-09-24', '2018-09-30')
Then, get the first start date and the number of days the calendar CTE needs to cover:
DECLARE @DateDiff int, @StartDate Date
SELECT @DateDiff = DATEDIFF(DAY, MIN(StartDate), Max(EndDate)),
@StartDate = MIN(StartDate)
FROM @T
Now, build the calendar CTE based on row_number (that is, unless you already have a numbers (tally) table you can use):
;WITH Calendar(TheDate)
AS
(
SELECT TOP(@DateDiff + 1) DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY @@SPID)-1, @StartDate)
FROM sys.objects t0
-- unremark the next row if you don't get enough records...
-- CROSS JOIN sys.objects t1
)
Note that I'm using row_number() - 1, so I have to select top(@DateDiff + 1).
Finally, the query:
SELECT TheDate, COUNT(ID) As NumberOfRecords
FROM Calendar
JOIN @T AS T
ON Calendar.TheDate >= T.StartDate
AND Calendar.TheDate <= T.EndDate
GROUP BY TheDate
Results:
TheDate | NumberOfRecords
2018-09-24 | 2
2018-09-25 | 2
2018-09-26 | 2
2018-09-27 | 2
2018-09-28 | 2
2018-09-29 | 2
2018-09-30 | 2
2018-10-01 | 1
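Answer 3 (score: 0)
Could you try the following SQL CTE query? It uses a SQL dates table function, [dbo].[DatesTable], which generates the list of dates between the minimum and maximum dates in the source table: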
;with boundaries as (
select
min(StartDate) minD, max(EndDate) maxD
from DateRanges
), dates as (
select
dates.[date]
from boundaries
cross apply [dbo].[DatesTable](minD, maxD) as dates
)
select dates.[date], count(*) as [count]
from dates
inner join DateRanges
on dates.date between DateRanges.StartDate and DateRanges.EndDate
group by dates.[date]
order by dates.[date]
The output is as expected.
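The definition of [dbo].[DatesTable] is not shown in this answer. As a rough sketch only, an inline table-valued function with that name might look like the following. The parameter names, the use of sys.all_objects as a row source, and the row_number approach are all assumptions made for illustration. The single output column is named [date] because the query above refers to dates.[date].

```sql
-- Hypothetical definition: the answer does not include the real function body.
-- CREATE FUNCTION has to be the only statement in its batch.
CREATE FUNCTION [dbo].[DatesTable] (@StartDate date, @EndDate date)
RETURNS TABLE
AS
RETURN
(
    -- One row per day from @StartDate to @EndDate inclusive.
    -- Assumes @EndDate >= @StartDate; TOP raises an error for a negative value.
    SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
           DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1, @StartDate) AS [date]
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b -- the cross join only supplies enough rows to number
);
```

Once such a function exists, SELECT [date] FROM [dbo].[DatesTable]('2018-09-24', '2018-10-01') should return the eight dates from 2018-09-24 through 2018-10-01, which is the range the boundaries CTE would pass in for the sample data.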