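I have a database of timestamped events:

    row  eventName  taskName  timestamp  userName
    ---  ---------  --------  ---------  --------
    1    fail       ABC       10.5       John
    2    fail       ABC       18.0       John
    3    fail       ABC       19.0       Mike
    4    fail       XYZ       21.0       John
    5    fail       XYZ       23.0       Mike
    6    success    ABC       25.0       John
    7    fail       ABC       26.0       John
    8    success    ABC       28.0       John

I want to count the number of failures before each user's first success (and the average, but that is beyond the scope of this question).
In the example above, John attempted task ABC 2 times (rows 1 and 2) before he succeeded (row 6). Subsequent failures and successes can be ignored.
I think I can do this by counting the rows with 'ABC' and 'fail' whose timestamp is earlier than the earliest timestamp among all rows with 'ABC' and 'success', grouped by userName. How do I express this in T-SQL? Specifically, Vertica.
This seems very similar to the case here: sql count/sum the number of calls until a specific date in another column
But when I tried to adapt the code from https://stackoverflow.com/a/39594686/4354459, I think I went wrong somewhere, because my counts are higher than expected.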
    WITH
    Successes
    AS
    (
        SELECT
            events.userName
            ,events.taskName
            ,MIN(events.timestamp) AS FirstSuccessTime
        FROM events
        WHERE events.eventName = 'success'
        GROUP BY events.userName, events.taskName
    )
    SELECT
        events.userName
        ,events.taskName
        ,COUNT(events.eventName) AS FailuresUntilFirstSuccess
    FROM
        Successes
        LEFT JOIN events
            ON events.taskName = Successes.taskName
            AND events.timestamp < Successes.FirstSuccessTime
            AND events.eventName = 'fail'
    GROUP BY events.userName, events.taskName
    ;
Answer 0 (score: 1)
Depending on your schema, this query should give you what you need:
    -- All failure rows.
    with Failures as
    (
        select * from Event where event_name = 'fail'
    ),
    -- Each success row, plus the count of earlier failures
    -- by the same user on the same task.
    Q as
    (
        select * from Event E
        outer apply
        (
            select count(*) cnt from Failures F
            where F.task_name = E.task_name and F.username = E.username and F.ts < E.ts
        ) F
        where E.event_name = 'success'
    )
    -- Keep only the earliest success in each (event, task, user) partition.
    select * from
    (
        select Q.*,
            row_number() over (partition by event_name, task_name, username order by ts) o from Q
    ) K where K.o = 1
Testing with your data gives:
id event_name task_name timestamp username cnt
-- ---------- ---------- ---------- --------- ---
6 success ABC 25 John 2
However, I went a step further and added another 'success' row for Mike:
    insert Event select 'success', 'XYZ', 29.0, 'Mike'
and got:
id event_name task_name timestamp username cnt
-- ---------- ---------- ---------- --------- ---
6 success ABC 25 John 2
9 success XYZ 29 Mike 1
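As expected.
The first CTE produces the set of failures. The second CTE starts from the set of successes and, through OUTER APPLY (a correlated subquery, not recursion in the strict sense), attaches to each success the count (cardinality) of the set of failures that precede it for the same user and task name.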
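Finally, we use row_number with a partition on event_name, task_name and username, so that the first success in a given partition is numbered 1. Then we simply filter out every row whose row_number is not 1.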
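The steps described above can be tried out with a small self-contained sketch. It is a variant, not the query from this answer: it replaces OUTER APPLY (T-SQL syntax that may not be available in Vertica) with a plain correlated subquery, and it ranks the successes first and counts failures only for the first one. It is wrapped in Python's `sqlite3` module purely so it runs without a database server (window functions need SQLite 3.25 or newer). The table name `events`, the column name `ts` and the function name `failures_before_first_success` are assumptions made for this illustration.

```python
import sqlite3

# Sample rows from the question: (eventName, taskName, ts, userName).
ROWS = [
    ("fail", "ABC", 10.5, "John"),
    ("fail", "ABC", 18.0, "John"),
    ("fail", "ABC", 19.0, "Mike"),
    ("fail", "XYZ", 21.0, "John"),
    ("fail", "XYZ", 23.0, "Mike"),
    ("success", "ABC", 25.0, "John"),
    ("fail", "ABC", 26.0, "John"),
    ("success", "ABC", 28.0, "John"),
]

# Number each user's successes per task by time, keep the first one,
# and count the same user's earlier failures on the same task.
FIRST_SUCCESS_SQL = """
WITH ranked AS (
    SELECT userName, taskName, ts,
           ROW_NUMBER() OVER (
               PARTITION BY userName, taskName ORDER BY ts
           ) AS rn
    FROM events
    WHERE eventName = 'success'
)
SELECT s.userName, s.taskName, s.ts,
       (SELECT COUNT(*)
        FROM events f
        WHERE f.eventName = 'fail'
          AND f.userName = s.userName
          AND f.taskName = s.taskName
          AND f.ts < s.ts) AS cnt
FROM ranked s
WHERE s.rn = 1
ORDER BY s.userName, s.taskName
"""


def failures_before_first_success(rows):
    """Load rows into an in-memory table and run the query above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events "
        "(eventName TEXT, taskName TEXT, ts REAL, userName TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(FIRST_SUCCESS_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(failures_before_first_success(ROWS))
```

On the question's eight rows this returns a single row for John on ABC with a count of 2; appending the extra ('success', 'XYZ', 29.0, 'Mike') row adds a row for Mike on XYZ with a count of 1, matching the results shown above.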
Answer 1 (score: 0)
There may be a simpler way to get there, but I'll take a crack at it.
Test data setup
    IF OBJECT_ID(N'tempdb..#taskevents', N'U') IS NOT NULL
        DROP TABLE #taskevents;
    GO
    CREATE TABLE #taskevents (
          eventName varchar(10)
        , taskName varchar(10)
        , ts decimal(3,1)
        , userName varchar(10)
    ) ;
    INSERT INTO #taskevents ( eventName, taskName, ts, userName )
    VALUES
          ('fail','ABC',10.5,'John')
        , ('fail','ABC',10.6,'John')
        , ('fail','ABC',18.0,'John')
        , ('fail','ABC',22.0,'John')
        , ('fail','ABC',22.5,'John')
        , ('success','ABC',25.0,'John')
        , ('fail','ABC',26.0,'John')
        , ('success','ABC',28.0,'John')
        , ('fail','XYZ',10.7,'John')
        , ('fail','XYZ',21.0,'John')
        , ('fail','ABC',19.0,'Mike')
        , ('fail','XYZ',23.0,'Mike')
        , ('success','XYZ',28.5,'Mike')
        , ('success','QVC',42.0,'Mike')
    ;
Query
This can give you the average number of failures per user.
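As a rough illustration of one way to reach that per-user average (not necessarily the query this answer used), the sketch below finds each user's first success per task with MIN, joins the earlier failures back on both userName and taskName, and then averages the per-task counts. Unlike the attempt in the question, the join also matches on userName and the grouping uses the success side, so failures by other users or on tasks without a success are not counted. It runs through Python's `sqlite3` module only to stay self-contained; the table name `taskevents` (without the `#` temp-table prefix) and the function name `average_failures_per_user` are assumptions for this example.

```python
import sqlite3

# Rows from the test data setup: (eventName, taskName, ts, userName).
ROWS = [
    ("fail", "ABC", 10.5, "John"),
    ("fail", "ABC", 10.6, "John"),
    ("fail", "ABC", 18.0, "John"),
    ("fail", "ABC", 22.0, "John"),
    ("fail", "ABC", 22.5, "John"),
    ("success", "ABC", 25.0, "John"),
    ("fail", "ABC", 26.0, "John"),
    ("success", "ABC", 28.0, "John"),
    ("fail", "XYZ", 10.7, "John"),
    ("fail", "XYZ", 21.0, "John"),
    ("fail", "ABC", 19.0, "Mike"),
    ("fail", "XYZ", 23.0, "Mike"),
    ("success", "XYZ", 28.5, "Mike"),
    ("success", "QVC", 42.0, "Mike"),
]

AVG_SQL = """
WITH first_success AS (
    -- Earliest success per user and task.
    SELECT userName, taskName, MIN(ts) AS firstSuccessTs
    FROM taskevents
    WHERE eventName = 'success'
    GROUP BY userName, taskName
),
per_task AS (
    -- Failures by the same user on the same task before that success.
    -- COUNT(f.ts) yields 0 when the LEFT JOIN finds no earlier failure.
    SELECT s.userName, s.taskName, COUNT(f.ts) AS failures
    FROM first_success s
    LEFT JOIN taskevents f
        ON f.userName = s.userName
       AND f.taskName = s.taskName
       AND f.eventName = 'fail'
       AND f.ts < s.firstSuccessTs
    GROUP BY s.userName, s.taskName
)
SELECT userName, AVG(failures) AS avgFailures
FROM per_task
GROUP BY userName
ORDER BY userName
"""


def average_failures_per_user(rows):
    """Load rows into an in-memory table and run the query above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taskevents "
        "(eventName TEXT, taskName TEXT, ts REAL, userName TEXT)"
    )
    conn.executemany("INSERT INTO taskevents VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(AVG_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(average_failures_per_user(ROWS))
```

With the test data above, John has 5 failures before his first ABC success and no other completed task, so his average is 5.0. Mike has 1 failure before his XYZ success and 0 before QVC, giving 0.5. Tasks a user never completed (John on XYZ, Mike on ABC) are left out of the average; whether that is the right treatment depends on what the average is meant to measure.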