If a customer responds to the same survey more than once within 30 days, I want to count it only once. Can someone show me the code?
create table #Something
(
    CustID Char(10),
    SurveyId char(5),
    ResponseDate datetime
)

insert #Something
select 'Cust1', '100', '5/6/13' union all
select 'Cust1', '100', '5/13/13' union all
select 'Cust2', '100', '4/20/13' union all
select 'Cust2', '100', '5/22/13'

select distinct custid, SurveyId, Count(custid) as CountResponse
from #Something
group by CustID, SurveyId
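The code above only gives the total count of responses; I'm not sure how to write it so that a response is counted only once per 30 days.
The output I'm looking for should be like this: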
CustomerID  SurveyId  CountResponse
Cust1       100       1
Cust2       100       2
Answer 0 (score: 0)
I'm not a SQL Server person, but in Oracle, subtracting an integer from a date subtracts that many days, so something like this could work:
SELECT custid, surveyid
FROM Something a
WHERE NOT EXISTS (
SELECT 1
FROM Something b
WHERE a.custid = b.custid
AND a.surveyid = b.surveyid
-- BETWEEN needs the lower bound first; also exclude the row itself
AND b.responseDate >= a.responseDate - 30
AND b.responseDate < a.responseDate
);
To get your counts (if I understand your requirement correctly):
-- Count of times custID returned surveyID, not counting same
-- survey within 30 day period.
SELECT custid, surveyid, count(*) countResponse
FROM Something a
WHERE NOT EXISTS (
SELECT 1
FROM Something b
WHERE a.custid = b.custid
AND a.surveyid = b.surveyid
-- BETWEEN needs the lower bound first; also exclude the row itself
AND b.responseDate >= a.responseDate - 30
AND b.responseDate < a.responseDate
)
GROUP BY custid, surveyid
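Update: given the cases mentioned below, this doesn't actually work very well. What you should do instead is iterate over your something table, insert the rows for the surveys you want to keep into a results table, and compare each new row against the results table to see whether a survey you have already kept was received within the previous 30 days. I could show you how to do this in Oracle PL/SQL, but I don't know the SQL Server syntax. Maybe someone who knows SQL Server wants to steal this strategy and write an answer for you, or maybe this is enough to get you going.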
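Since this answer gives no code for the iterate-and-keep strategy, here is a minimal sketch of the logic in Python rather than SQL Server or PL/SQL. The function name count_kept_responses and the in-memory row tuples are made up for illustration, and it assumes that a response exactly 30 days after the last kept one still counts as "within 30 days".

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # assumption: exactly 30 days apart is still "within 30 days"

def count_kept_responses(rows):
    """Count responses per (cust_id, survey_id), keeping a response only if it
    falls more than 30 days after the last response that was kept."""
    last_kept = {}             # (cust_id, survey_id) -> date of the last kept response
    counts = defaultdict(int)  # (cust_id, survey_id) -> number of kept responses
    # Sorting the tuples orders them by customer, survey, then date.
    for cust_id, survey_id, response_date in sorted(rows):
        key = (cust_id, survey_id)
        previous = last_kept.get(key)
        if previous is None or response_date - previous > WINDOW:
            last_kept[key] = response_date
            counts[key] += 1
    return dict(counts)

# Sample data from the question.
rows = [
    ("Cust1", "100", date(2013, 5, 6)),
    ("Cust1", "100", date(2013, 5, 13)),
    ("Cust2", "100", date(2013, 4, 20)),
    ("Cust2", "100", date(2013, 5, 22)),
]
print(count_kept_responses(rows))
# {('Cust1', '100'): 1, ('Cust2', '100'): 2}
```

The dictionary of last kept dates plays the role of the results table: each response is compared only against responses that were themselves kept, not against every earlier response.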
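Answer 1 (score: 0)
Call me wild and crazy, but I would solve this by storing more state with each survey. The approach I would take is to add a bit column that indicates whether a particular survey response should be counted (i.e. a Countable column). This solves the state-tracking problem that is inherent in this relation.
On insert, I would set Countable to 1 if no survey with the same CustID / SurveyId and Countable set to 1 can be found in the preceding 30 days. Otherwise I would set it to 0.
The problem then becomes easy to solve. Just group by CustID / SurveyId and sum the values in the Countable column.
One caveat of this approach is that surveys must be added in chronological order, and rows cannot be deleted without recomputing the Countable values.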
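To make the chronological-order caveat concrete, here is a rough Python sketch of the Countable idea. A list of dicts stands in for the table, and insert_response and count_responses are hypothetical helper names, not SQL Server features. It assumes "the preceding 30 days" includes a row dated exactly 30 days earlier.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=30)  # assumption: a row exactly 30 days earlier still blocks

def insert_response(table, cust_id, survey_id, response_date):
    """Append a row with Countable = 1 unless a countable row for the same
    customer and survey exists in the 30 days up to response_date."""
    blocked = any(
        row["CustID"] == cust_id
        and row["SurveyId"] == survey_id
        and row["Countable"] == 1
        and timedelta(0) <= response_date - row["ResponseDate"] <= WINDOW
        for row in table
    )
    table.append({"CustID": cust_id, "SurveyId": survey_id,
                  "ResponseDate": response_date, "Countable": 0 if blocked else 1})

def count_responses(table):
    """Same idea as GROUP BY CustID, SurveyId with SUM(Countable)."""
    totals = {}
    for row in table:
        key = (row["CustID"], row["SurveyId"])
        totals[key] = totals.get(key, 0) + row["Countable"]
    return totals

first, second = date(2013, 5, 6), date(2013, 5, 13)

in_order = []
insert_response(in_order, "Cust1", "100", first)
insert_response(in_order, "Cust1", "100", second)

out_of_order = []
insert_response(out_of_order, "Cust1", "100", second)
insert_response(out_of_order, "Cust1", "100", first)

print(count_responses(in_order))      # {('Cust1', '100'): 1}
print(count_responses(out_of_order))  # {('Cust1', '100'): 2}
```

The same two responses give a count of 1 or 2 depending on insertion order, because the flag is decided once at insert time and only looks backward.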
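Answer 2 (score: 0)
The code below is one way to produce the sample output. However, if you add select 'Cust1', '100', '4/20/13', the result is still Cust1 100 1, because each response falls within 30 days of the response before it, so only the first one is counted. Is that the desired behavior?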
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM #SurveysTaken
WHERE (NOT EXISTS
(SELECT 1
FROM #SurveysTaken AS PriorSurveys
WHERE (CustID = #SurveysTaken.CustID)
AND (SurveyId = #SurveysTaken.SurveyId)
AND (ResponseDate >= DATEADD(d, - 30, #SurveysTaken.ResponseDate))
AND (ResponseDate < #SurveysTaken.ResponseDate)))
GROUP BY CustID, SurveyID
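To see what that NOT EXISTS filter does to a chain of closely spaced responses, here is a small Python sketch of the same rule for a single customer and survey. The function name count_without_recent_prior is invented for this illustration; the date comparison mirrors the two ResponseDate conditions in the query.

```python
from datetime import date, timedelta

def count_without_recent_prior(dates, window_days=30):
    """Count dates that have no strictly earlier date within window_days,
    mirroring the NOT EXISTS filter for one customer and one survey."""
    window = timedelta(days=window_days)
    return sum(
        1
        for current in dates
        if not any(current - window <= other < current for other in dates)
    )

# Cust1 from the question plus the extra 4/20/13 response.
cust1 = [date(2013, 4, 20), date(2013, 5, 6), date(2013, 5, 13)]
print(count_without_recent_prior(cust1))  # 1

# Hypothetical chain: one response every 20 days, spanning 100 days in total.
chain = [date(2013, 1, 1) + timedelta(days=20 * i) for i in range(6)]
print(count_without_recent_prior(chain))  # 1
```

Under this rule a customer who keeps responding every 20 days is counted once no matter how long the chain runs, which may or may not be what you want.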
Alternatively, you can divide the year into arbitrary 30-day periods that reset every year.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, YEAR(ResponseDate) AS ResponseYear,
DATEPART(DAYOFYEAR, ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
You could also go by month.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, YEAR(ResponseDate) AS ResponseYear,
MONTH(ResponseDate) AS ResponseMonth
FROM #SurveysTaken) AS SurveysByMonth
GROUP BY CustID, SurveyID
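You can use 30-day periods counted from an arbitrary epoch date. (Perhaps by pulling the date the survey was first created from another query?)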
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, DATEDIFF(D, '1/1/2013', ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
A final variation on arbitrary 30-day periods is to base them on the first time the customer responded to the survey in question.
SELECT CustID, SurveyID, COUNT(*) AS CountResponse
FROM (SELECT DISTINCT CustID, SurveyID, DATEDIFF(DAY,
(SELECT MIN(ResponseDate)
FROM #SurveysTaken AS FirstSurvey
WHERE (CustID = #SurveysTaken.CustID)
AND (SurveyId = #SurveysTaken.SurveyId)), ResponseDate) / 30 AS ThirtyDayPeriod
FROM #SurveysTaken) AS SurveysByPeriod
GROUP BY CustID, SurveyID
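One problem you will run into with the period tricks is that counted surveys occur only once per period, but they are not necessarily 30 days apart.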
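A quick Python sketch of the epoch-based bucketing shows the issue. The helper names are made up, the epoch matches the '1/1/2013' literal used in the earlier query, and the sketch assumes all dates fall on or after the epoch so that Python's floor division agrees with T-SQL integer division.

```python
from datetime import date

EPOCH = date(2013, 1, 1)  # same arbitrary epoch as the '1/1/2013' query

def thirty_day_period(response_date, epoch=EPOCH):
    """Bucket index, like DATEDIFF(D, epoch, ResponseDate) / 30 for dates >= epoch."""
    return (response_date - epoch).days // 30

def count_distinct_periods(dates):
    """Number of distinct 30-day buckets that contain at least one response."""
    return len({thirty_day_period(d) for d in dates})

# Two days apart, but on opposite sides of a bucket boundary.
close_together = [date(2013, 1, 30), date(2013, 2, 1)]
print(count_distinct_periods(close_together))  # 2

# 29 days apart, but inside the same bucket.
same_bucket = [date(2013, 1, 1), date(2013, 1, 30)]
print(count_distinct_periods(same_bucket))  # 1
```

Responses two days apart are counted twice when they straddle a boundary, so the bucket approach only approximates "once per 30 days".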
Answer 3 (score: 0)
I believe this is one way to handle it. I tested it quickly and it worked on a small sample of records, so I hope it helps. Good luck.
SELECT s.CustID, COUNT(s.SurveyID) AS SurveyCount
FROM #something s
INNER JOIN (SELECT CustID, SurveyId, ResponseDate
FROM (SELECT #Something.*,
ROW_NUMBER() OVER (PARTITION BY custid ORDER BY ResponseDate ASC) AS RN
FROM #something) AS t
WHERE RN = 1 ) f ON s.CustID = f.CustID
WHERE s.ResponseDate BETWEEN f.ResponseDate AND f.ResponseDate+30
GROUP BY s.CustID
HAVING COUNT(s.SurveyID) > 1
Answer 4 (score: 0)
Your question is ambiguous, which may be the source of your difficulty.
insert #Something values
('Cust3', '100', '1/1/13'),
('Cust3', '100', '1/20/13'),
('Cust3', '100', '2/10/13')
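Is the count for Cust3 1 or 2? Is the '2/10/13' response invalid because it comes less than 30 days after the '1/20/13' response? Or is the '2/10/13' response valid because '1/20/13' is invalidated by the '1/1/13' response, which puts it more than 30 days after the previous valid response?
Answer 5 (score: 0)
On the theory that you want to count 30-day periods starting from the first time the survey was submitted, here is a (rough) solution.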
declare @Something table
(
CustID Char(10),
SurveyId char(5),
ResponseDate datetime
)
insert @Something
select 'Cust1', '100', '5/6/13' union all
select 'Cust1', '100', '5/13/13' union all
select 'Cust1', '100', '7/13/13' union all
select 'Cust2', '100', '4/20/13' union all
select 'Cust2', '100', '5/22/13' union all
select 'Cust2', '100', '7/20/13' union all
select 'Cust2', '100', '7/24/13' union all
select 'Cust2', '100', '9/28/13'
--SELECT CustID,SurveyId,COUNT(*) FROM (
select a.CustID,a.SurveyId,b.ResponseStart,--CONVERT(int,a.ResponseDate-b.ResponseStart),
CASE
WHEN CONVERT(int,a.ResponseDate-b.ResponseStart) > 30
THEN ((CONVERT(int,a.ResponseDate-b.ResponseStart))-(CONVERT(int,a.ResponseDate-b.ResponseStart) % 30))/30+1
ELSE 1
END CustomPeriod -- defines periods 30 days out from first entry of survey
from @Something a
inner join
(select CustID,SurveyId,MIN(ResponseDate) ResponseStart
from @Something
group by CustID,SurveyId) b
on a.SurveyId=b.SurveyId
and a.CustID=b.CustID
group by a.CustID,a.SurveyId,b.ResponseStart,
CASE
WHEN CONVERT(int,a.ResponseDate-b.ResponseStart) > 30
THEN ((CONVERT(int,a.ResponseDate-b.ResponseStart))-(CONVERT(int,a.ResponseDate-b.ResponseStart) % 30))/30+1
ELSE 1
END
--) x GROUP BY CustID,SurveyId
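At the very least you would probably want to turn the CASE statement into a function so it reads a little more cleanly. Better still would be to define explicit windows in a separate table. This may not be workable if you want to avoid the case where one survey is returned at the end of the first period and another a few days later in the second period.
If possible, you should consider handling this at input time. For example, if you are identifying the customer in an online survey, reject the attempt to fill out the survey. Or, if people are mailing these in, have the data-entry person reject any that arrive within 30 days.
Or, along the same lines as "wild and crazy", add a bit column and an INSERT trigger. Only turn the bit on if no survey of that type from that customer is found within the period.
Overall, stating the problem more thoroughly would help. But I do appreciate the actual coding example.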