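Edit:
I'm using SQL Server 2005 and trying to get a year-over-year (YOY) count of distinct users for the current year (e.g. June 1 - May 30) and the past three years. I can do what I need by running a SELECT statement four times, but I can't seem to find a better way at this point. I can get a distinct count for each year in one query, but I need the distinct count to be cumulative. Below is a mockup of what I have so far: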
SELECT [Year], COUNT(DISTINCT UserID)
FROM
(
    SELECT u.uID AS UserID,
        CASE
            WHEN dd.ddEnd BETWEEN @yearOneStart AND @yearOneEnd THEN 'Year1'
            WHEN dd.ddEnd BETWEEN @yearTwoStart AND @yearTwoEnd THEN 'Year2'
            WHEN dd.ddEnd BETWEEN @yearThreeStart AND @yearThreeEnd THEN 'Year3'
            WHEN dd.ddEnd BETWEEN @yearFourStart AND @yearFourEnd THEN 'Year4'
            ELSE 'Other'
        END AS [Year]
    FROM Users AS u
    INNER JOIN UserDataIDMatch AS udim
        ON u.uID = udim.udim_FK_uID
    INNER JOIN DataDump AS dd
        ON udim.udimUserSystemID = dd.ddSystemID
) AS Data
WHERE LOWER([Year]) <> 'other'
GROUP BY [Year]
I get something like this:
Year1 1
Year2 1
Year3 1
Year4 1
But what I really need is:
Year1 1
Year2 2
Year3 3
Year4 4
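Below is a rough schema and a set of values (updated for simplicity). I tried to create a SQL Fiddle, but I got a disk space error when I tried to build the schema.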
CREATE TABLE Users
(
    uID int identity primary key,
    uFirstName varchar(75),
    uLastName varchar(75)
);

INSERT INTO Users (uFirstName, uLastName)
VALUES
('User1', 'User1'),
('User2', 'User2'),
('User3', 'User3'),
('User4', 'User4');

CREATE TABLE UserDataIDMatch
(
    udimID int identity primary key,
    udim_FK_uID int foreign key references Users(uID),
    udimUserSystemID varchar(75)
);

INSERT INTO UserDataIDMatch (udim_FK_uID, udimUserSystemID)
VALUES
(1, 'SystemID1'),
(2, 'SystemID2'),
(3, 'SystemID3'),
(4, 'SystemID4');

CREATE TABLE DataDump
(
    ddID int identity primary key,
    ddSystemID varchar(75),
    ddEnd datetime
);

INSERT INTO DataDump (ddSystemID, ddEnd)
VALUES
('SystemID1', '10-01-2013'),
('SystemID2', '10-01-2014'),
('SystemID3', '10-01-2015'),
('SystemID4', '10-01-2016');
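Answer 0 (score: 1)
I did something similar to find the number of distinct customers who bought something over the years. I modified it to use your concept of a year: the variables you supply are the start day and start month of the year, plus the start year and the end year.
Technically there is a way to avoid the loop, but the loop is very clear, and since you can't go past the year 9999 anyway, I don't think clever code to avoid it is worthwhile.
Also, when matching dates, make sure you are comparing dates and not the result of a function applied to the column. Applying a function means running it on every record, and it makes any index on the date column (there should be one) useless. Use DATEADD on 0 (which SQL Server treats as 1900-01-01) to build your target dates: subtract 1900 from the year, one from the month and one from the day.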
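To make the point about comparing dates concrete, here is a minimal sketch. It assumes the DataDump table and ddEnd column from the question; the variable names and the June 1, 2013 start are only example values I made up. It is written without inline DECLARE initialization so that it should also run on SQL Server 2005.

```sql
-- Example values only: a fiscal year starting June 1, 2013.
DECLARE @year INT, @month INT, @day INT
SELECT @year = 2013, @month = 6, @day = 1

-- 0 is implicitly converted to 1900-01-01, so add the offsets from there.
DECLARE @rangeStart DATETIME, @rangeEnd DATETIME
SET @rangeStart = DATEADD(day,   @day - 1,
                  DATEADD(month, @month - 1,
                  DATEADD(year,  @year - 1900, 0)))
SET @rangeEnd   = DATEADD(year, 1, @rangeStart)   -- exclusive upper bound

-- Avoid: the function runs against the column for every row,
-- so an index on ddEnd cannot be used for a seek.
-- SELECT COUNT(*) FROM DataDump WHERE YEAR(ddEnd) = @year

-- Prefer: the bare column is compared with precomputed dates.
SELECT COUNT(*) AS RowsInFiscalYear
FROM DataDump
WHERE ddEnd >= @rangeStart
  AND ddEnd <  @rangeEnd
```

The half-open range (>= start, < next start) also picks up rows that carry a time of day on the last day of the year, which BETWEEN with a midnight end date would miss.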
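Then self-join the table where the years form a valid range (i.e. start year to current year) and use a subquery to sum the counts over that range. Since you want to accumulate from the first year to the last, restrict the result to ranges that start at the first year.
Finally, you will be missing the first year, because by our definition it does not qualify as a range. To fix this, just UNION in a select from the temp table you created to add the missing year and its number of distinct values.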
-- Note: inline DECLARE initialization and the DATE type need SQL Server 2008 or later.
DECLARE @yearStartMonth INT = 6, @yearStartDay INT = 1
DECLARE @yearStart INT = 2008, @yearEnd INT = 2012

-- First day of the first year: start from 0 (1900-01-01) and add the offsets.
DECLARE @firstYearStart DATE =
    DATEADD(day, @yearStartDay - 1,
    DATEADD(month, @yearStartMonth - 1,
    DATEADD(year, @yearStart - 1900, 0)))

-- Last day of the last year: the day before the year start that falls in @yearEnd.
DECLARE @lastYearEnd DATE =
    DATEADD(day, @yearStartDay - 2,
    DATEADD(month, @yearStartMonth - 1,
    DATEADD(year, @yearEnd - 1900, 0)))

DECLARE @firstdayofcurrentyear DATE = @firstYearStart
DECLARE @lastdayofcurrentyear DATE = DATEADD(day, -1, DATEADD(year, 1, @firstdayofcurrentyear))
DECLARE @yearNumber INT = YEAR(@firstdayofcurrentyear)

DECLARE @tempTableYearBounds TABLE
(
    startDate DATE NOT NULL,
    endDate DATE NOT NULL,
    YearNumber INT NOT NULL
)

-- One row per year, from the first year start up to the last year end.
WHILE @firstdayofcurrentyear < @lastYearEnd
BEGIN
    INSERT INTO @tempTableYearBounds
    VALUES (@firstdayofcurrentyear, @lastdayofcurrentyear, @yearNumber)

    SET @firstdayofcurrentyear = DATEADD(year, 1, @firstdayofcurrentyear)
    SET @lastdayofcurrentyear = DATEADD(year, 1, @lastdayofcurrentyear)
    SET @yearNumber = @yearNumber + 1
END

DECLARE @tempTableCustomerCount TABLE
(
    [Year] INT NOT NULL,
    [CustomerCount] INT NOT NULL
)

-- Distinct customers per year.
INSERT INTO @tempTableCustomerCount
SELECT
    YearNumber AS [Year],
    COUNT(DISTINCT CustomerNumber) AS CustomerCount
FROM Ticket
JOIN @tempTableYearBounds ON
    TicketDate >= startDate AND TicketDate <= endDate
GROUP BY YearNumber

SELECT * FROM (
    -- Running total from the first year (t1) up to each later year (t2).
    SELECT t2.Year AS [Year],
        (SELECT SUM(CustomerCount)
         FROM @tempTableCustomerCount
         WHERE Year >= t1.Year
           AND Year <= t2.Year) AS CustomerCount
    FROM @tempTableCustomerCount t1
    JOIN @tempTableCustomerCount t2
        ON t1.Year < t2.Year
    WHERE t1.Year = @yearStart
    UNION
    -- The first year is not a range by itself, so add it explicitly.
    SELECT [Year], [CustomerCount]
    FROM @tempTableCustomerCount
    WHERE [Year] = @yearStart
) tt
ORDER BY tt.Year
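It's not efficient, but the temp table you end up working with is so small that I don't think it matters, and it is much more versatile than the approach you were using.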
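For completeness, the loop-free way of building the year bounds mentioned above could look roughly like this with a recursive CTE, which SQL Server 2005 supports. This is only a sketch under my own assumptions: the YearBounds name and its columns are made up, EndDate is exclusive here (unlike endDate in the loop version), and the variables are declared without inline initialization so it should also run on 2005.

```sql
DECLARE @yearStartMonth INT, @yearStartDay INT, @yearStart INT, @yearEnd INT
SELECT @yearStartMonth = 6, @yearStartDay = 1, @yearStart = 2008, @yearEnd = 2012

DECLARE @firstYearStart DATETIME
SET @firstYearStart = DATEADD(day,   @yearStartDay - 1,
                      DATEADD(month, @yearStartMonth - 1,
                      DATEADD(year,  @yearStart - 1900, 0)))

;WITH YearBounds (YearNumber, StartDate, EndDate) AS
(
    -- Anchor row: the first year.
    SELECT @yearStart, @firstYearStart, DATEADD(year, 1, @firstYearStart)
    UNION ALL
    -- Recursive step: shift the previous row forward by one year.
    SELECT YearNumber + 1, DATEADD(year, 1, StartDate), DATEADD(year, 1, EndDate)
    FROM YearBounds
    WHERE YearNumber + 1 < @yearEnd
)
SELECT YearNumber, StartDate, EndDate   -- EndDate is the exclusive upper bound
FROM YearBounds
```

With these values it produces one row per year starting in 2008 through 2011, the same years the loop generates. Recursion stops after 100 levels by default, so a span of more than 100 years would need an OPTION (MAXRECURSION n) hint on the final SELECT.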
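Update: I updated the query to reflect the result set you want. I basically tested it to see whether it was faster. It was 10 seconds faster (from 12 seconds down to 2 seconds), but the data set I'm working with is relatively small.
I changed the tables you gave to temp tables so they wouldn't affect my environment, and I removed the foreign key because temp tables don't support them. The logic is the same as in the example above, just changed to fit your data set.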
DECLARE @startYear INT = 2013, @endYear INT = 2016
DECLARE @yearStartMonth INT = 10, @yearStartDay INT = 1

-- Start of the first year.
DECLARE @startDate DATETIME = DATEADD(day, @yearStartDay - 1,
    DATEADD(month, @yearStartMonth - 1,
    DATEADD(year, @startYear - 1900, 0)))

-- Start of the year after @endYear (exclusive upper bound).
DECLARE @endDate DATETIME = DATEADD(day, @yearStartDay - 1,
    DATEADD(month, @yearStartMonth - 1,
    DATEADD(year, @endYear - 1899, 0)))

DECLARE @tempDateRangeTable TABLE
(
    [Year] INT NOT NULL,
    StartDate DATETIME NOT NULL,
    EndDate DATETIME NOT NULL
)

DECLARE @currentDate DATETIME = @startDate
WHILE @currentDate < @endDate
BEGIN
    DECLARE @nextDate DATETIME = DATEADD(YEAR, 1, @currentDate)
    INSERT INTO @tempDateRangeTable (Year, StartDate, EndDate)
    VALUES (YEAR(@currentDate), @currentDate, @nextDate)
    SET @currentDate = @nextDate
END

CREATE TABLE Users
(
    uID int identity primary key,
    uFirstName varchar(75),
    uLastName varchar(75)
);

INSERT INTO Users (uFirstName, uLastName)
VALUES
('User1', 'User1'),
('User2', 'User2'),
('User3', 'User3'),
('User4', 'User4');

CREATE TABLE UserDataIDMatch
(
    udimID int identity primary key,
    udim_FK_uID int foreign key references Users(uID),
    udimUserSystemID varchar(75)
);

INSERT INTO UserDataIDMatch (udim_FK_uID, udimUserSystemID)
VALUES
(1, 'SystemID1'),
(2, 'SystemID2'),
(3, 'SystemID3'),
(4, 'SystemID4');

CREATE TABLE DataDump
(
    ddID int identity primary key,
    ddSystemID varchar(75),
    ddEnd datetime
);

INSERT INTO DataDump (ddSystemID, ddEnd)
VALUES
('SystemID1', '10-01-2013'),
('SystemID2', '10-01-2014'),
('SystemID3', '10-01-2015'),
('SystemID4', '10-01-2016');

DECLARE @tempIndividCount TABLE
(
    [Year] INT NOT NULL,
    UserCount INT NOT NULL
)

-- No need to filter out 'Other' any more, because the join is an
-- inclusion test rather than an exclusion one. This will also make
-- the query faster (when using real tables, not temp ones).
INSERT INTO @tempIndividCount (Year, UserCount)
SELECT tdr.Year, COUNT(DISTINCT uID)
FROM Users u
JOIN UserDataIDMatch um
    ON um.udim_FK_uID = u.uID
JOIN DataDump dd
    ON um.udimUserSystemID = dd.ddSystemID
JOIN @tempDateRangeTable tdr
    ON dd.ddEnd >= tdr.StartDate AND dd.ddEnd < tdr.EndDate
GROUP BY tdr.Year

-- Shows the per-year result.
SELECT * FROM @tempIndividCount

-- Add any years in the range that had no entries.
-- This part can simply be taken out if you don't want them.
INSERT INTO @tempIndividCount
SELECT t1.Year, 0
FROM @tempDateRangeTable t1
LEFT OUTER JOIN @tempIndividCount t2
    ON t1.Year = t2.Year
WHERE t2.Year IS NULL

SELECT YearNumber, UserCount FROM (
    -- VARCHAR avoids the trailing padding that CAST(... AS CHAR) would add to the label.
    SELECT 'Year' + CAST(((t2.Year - t1.Year) + 1) AS VARCHAR(10)) [YearNumber], t2.Year, (
        SELECT SUM(UserCount)
        FROM @tempIndividCount
        WHERE Year >= t1.Year AND Year <= t2.Year
    ) AS UserCount
    FROM @tempIndividCount t1
    JOIN @tempIndividCount t2
        ON t1.Year < t2.Year
    WHERE t1.Year = @startYear
    UNION ALL
    -- Add the missing first year; union it in to include its value.
    SELECT 'Year1', Year, UserCount FROM @tempIndividCount
    WHERE Year = @startYear
) tt
ORDER BY tt.Year
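Advantages: you don't need to work out the start and end date of every year explicitly; as with a logical year, you only need to know the overall start and end. With a few simple modifications it can easily be changed to whatever you want (say, all 2-year or 3-year ranges).
Because you are searching on the same data type, it can use the index that should exist on the date column in the database.
Disadvantages: the query is a lot more complex. Even though it is more robust, there is a lot of extra logic in the actual query.
If the data set is very small, or the number of dates being compared is insignificant, it may not save enough time to be worth it.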
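One caveat about summing the per-year counts: a user who has rows in two different years is counted once in each year, so the running total can exceed the number of distinct users. If that matters for your data, a variation is to count distinct users over everything up to the end of each year. The sketch below assumes the tables from the question; the @years table, its hard-coded June 1 boundaries and the column aliases are my own illustration, and it avoids 2008-only syntax so it should run on SQL Server 2005.

```sql
-- Assumed fiscal-year boundaries; EndDate is exclusive.
DECLARE @years TABLE
(
    [Year] INT NOT NULL,
    StartDate DATETIME NOT NULL,
    EndDate DATETIME NOT NULL
)

INSERT INTO @years ([Year], StartDate, EndDate)
SELECT 1, '20130601', '20140601' UNION ALL
SELECT 2, '20140601', '20150601' UNION ALL
SELECT 3, '20150601', '20160601' UNION ALL
SELECT 4, '20160601', '20170601'

DECLARE @firstStart DATETIME
SELECT @firstStart = MIN(StartDate) FROM @years

-- For each year, look at every row from the first start date up to that
-- year's end, and count each user only once.
SELECT 'Year' + CAST(y.[Year] AS VARCHAR(10)) AS YearNumber,
       COUNT(DISTINCT um.udim_FK_uID) AS UserCount
FROM @years AS y
LEFT JOIN DataDump AS dd
    ON dd.ddEnd >= @firstStart AND dd.ddEnd < y.EndDate
LEFT JOIN UserDataIDMatch AS um
    ON um.udimUserSystemID = dd.ddSystemID
GROUP BY y.[Year]
ORDER BY y.[Year]
```

With the sample rows from the question, and assuming the date strings there are read as month-day-year, I would expect this to give 1, 2, 3 and 4 for Year1 to Year4. A year with no rows still shows up with the running count so far, because of the LEFT JOINs. The join to Users is left out on the assumption that the foreign key already guarantees every udim_FK_uID exists there.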
Answer 1 (score: 1)
Unless I'm missing something, you just want to know the number of records whose date falls in or before the current fiscal year.
DECLARE @YearOneStart DATETIME, @YearOneEnd DATETIME,
        @YearTwoStart DATETIME, @YearTwoEnd DATETIME,
        @YearThreeStart DATETIME, @YearThreeEnd DATETIME,
        @YearFourStart DATETIME, @YearFourEnd DATETIME

SELECT @YearOneStart = '06/01/2013', @YearOneEnd = '05/31/2014',
       @YearTwoStart = '06/01/2014', @YearTwoEnd = '05/31/2015',
       @YearThreeStart = '06/01/2015', @YearThreeEnd = '05/31/2016',
       @YearFourStart = '06/01/2016', @YearFourEnd = '05/31/2017'

;WITH cte AS
(
    SELECT u.uID AS UserID,
        CASE
            WHEN dd.ddEnd BETWEEN @yearOneStart AND @yearOneEnd THEN 'Year1'
            WHEN dd.ddEnd BETWEEN @yearTwoStart AND @yearTwoEnd THEN 'Year2'
            WHEN dd.ddEnd BETWEEN @yearThreeStart AND @yearThreeEnd THEN 'Year3'
            WHEN dd.ddEnd BETWEEN @yearFourStart AND @yearFourEnd THEN 'Year4'
            ELSE 'Other'
        END AS [Year]
    FROM Users AS u
    INNER JOIN UserDataIDMatch AS udim
        ON u.uID = udim.udim_FK_uID
    INNER JOIN DataDump AS dd
        ON udim.udimUserSystemID = dd.ddSystemID
)
-- For each year label, count the rows labelled with that year or an earlier one.
SELECT
    DISTINCT [Year],
    (SELECT COUNT(*) FROM cte cteInner WHERE cteInner.[Year] <= cteMain.[Year])
FROM cte cteMain
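Note that the query above counts rows, and rows labelled 'Other' sort before 'Year1', so they would be included in every running total. If you want distinct users and no 'Other' bucket, a variation along these lines might be closer. This is a sketch, not a tested drop-in: the @Y1 to @Y5 boundary variables and the CumulativeUsers alias are names I made up, and the half-open ranges are my own substitution for BETWEEN.

```sql
-- Assumed boundaries: each fiscal year runs from one value up to (not including) the next.
DECLARE @Y1 DATETIME, @Y2 DATETIME, @Y3 DATETIME, @Y4 DATETIME, @Y5 DATETIME
SELECT @Y1 = '20130601', @Y2 = '20140601', @Y3 = '20150601',
       @Y4 = '20160601', @Y5 = '20170601'

;WITH cte AS
(
    SELECT u.uID AS UserID,
        CASE
            WHEN dd.ddEnd < @Y2 THEN 'Year1'
            WHEN dd.ddEnd < @Y3 THEN 'Year2'
            WHEN dd.ddEnd < @Y4 THEN 'Year3'
            ELSE 'Year4'
        END AS [Year]
    FROM Users AS u
    INNER JOIN UserDataIDMatch AS udim
        ON u.uID = udim.udim_FK_uID
    INNER JOIN DataDump AS dd
        ON udim.udimUserSystemID = dd.ddSystemID
    -- Rows outside the four years are dropped here instead of being labelled 'Other'.
    WHERE dd.ddEnd >= @Y1 AND dd.ddEnd < @Y5
)
SELECT DISTINCT cteMain.[Year],
    (SELECT COUNT(DISTINCT cteInner.UserID)
     FROM cte AS cteInner
     WHERE cteInner.[Year] <= cteMain.[Year]) AS CumulativeUsers
FROM cte AS cteMain
ORDER BY cteMain.[Year]
```

Two limits to keep in mind: a year with no rows at all does not appear in the output, and the string comparison on the labels only orders correctly while they run from Year1 to Year9.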
Answer 2 (score: 0)
In SQL Server, when a WHEN inside a CASE matches, evaluation stops and the following WHEN clauses are not evaluated. So you can't accumulate that way.
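A tiny illustration of that behaviour, using a made-up date and made-up labels: both conditions are true for the value, but only the first matching branch is returned.

```sql
DECLARE @d DATETIME
SET @d = '20130815'

SELECT CASE
           WHEN @d >= '20130601' THEN 'first branch'    -- true, so CASE returns here
           WHEN @d >= '20130101' THEN 'second branch'   -- also true, but never reached
       END AS Bucket
```

This returns 'first branch'. Each row therefore lands in exactly one bucket, which is why the question's query gives per-year counts rather than cumulative ones.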
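If I understand you correctly, this will show your result.
-- The original answer does not declare these variables; the values are assumed for illustration.
DECLARE @FiscalYearStart DATETIME, @FiscalYearEnd1 DATETIME,
        @FiscalYearEnd2 DATETIME, @FiscalYearEnd3 DATETIME
SELECT @FiscalYearStart = '20130601', @FiscalYearEnd1 = '20140531',
       @FiscalYearEnd2 = '20150531', @FiscalYearEnd3 = '20160531'

;WITH cte AS
(
    SELECT dd.ddEnd [dateEnd], u.uID AS UserID
    FROM Users AS u
    INNER JOIN UserDataIDMatch AS udim
        ON u.uID = udim.udim_FK_uID
    INNER JOIN DataDump AS dd
        ON udim.udimUserSystemID = dd.ddSystemID
    WHERE ddEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd3
)
-- GROUP BY on a variable is rejected by SQL Server and is not needed
-- for a single aggregate row, so it is omitted from each SELECT.
SELECT DATEPART(year, @FiscalYearStart) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd1
UNION
SELECT DATEPART(year, @FiscalYearEnd1) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd2
UNION
-- Labelled with @FiscalYearEnd2 so the three labels are consecutive years.
SELECT DATEPART(year, @FiscalYearEnd2) AS [Year], COUNT(DISTINCT UserID) AS CntUserID
FROM cte
WHERE dateEnd BETWEEN @FiscalYearStart AND @FiscalYearEnd3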