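I am trying to find a SQL-only solution to a problem that involves counting consecutive event attendance. The events occur on varying dates, so I can't use any sequential-date approach to determine consecutive attendance. To calculate a person's consecutive attendance, I start with the most recent event and work back in time. I count each event the person attended, and I stop when I hit an event the person did not attend. This gives me a count of the most recent consecutive events attended. All of the data is currently hosted in SQL tables; below is a sample schema with data: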
USERS
ID   UserName   MinutesWatched
---  --------   --------------
1    jdoe       30
2    ssmith     400
3    bbaker     350
4    tduke      285

EVENTS
ID   Name        StartDate
--   ---------   ----------
1    1st Event   07/15/2018
2    2nd Event   07/16/2018
3    3rd Event   07/18/2018
4    4th Event   07/20/2018

ATTENDANCE
ID   User_ID   Event_ID
--   -------   --------
1    1         1
2    1         2
3    1         3
4    1         4
5    2         4
6    2         3
7    3         4
8    3         2
9    3         1
10   4         4
11   4         3
12   4         2
For the output, I am trying to get:
OUTPUT
User_ID   Consecutive   WatchedMinutes
-------   -----------   --------------
1         4             30
2         2             400
3         1             350
4         3             285
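The walk-back rule described above can be sketched in a few lines of Python to check it against the expected output. This is only an illustration of the rule on the sample data, not the SQL being asked for; the function name `current_streaks` and the in-memory data layout are assumptions made for the example.

```python
from datetime import date

# Sample data copied from the tables above.
EVENTS = {  # event ID -> StartDate
    1: date(2018, 7, 15),
    2: date(2018, 7, 16),
    3: date(2018, 7, 18),
    4: date(2018, 7, 20),
}
ATTENDANCE = [  # (User_ID, Event_ID)
    (1, 1), (1, 2), (1, 3), (1, 4),
    (2, 4), (2, 3),
    (3, 4), (3, 2), (3, 1),
    (4, 4), (4, 3), (4, 2),
]
MINUTES = {1: 30, 2: 400, 3: 350, 4: 285}  # User_ID -> MinutesWatched


def current_streaks(events, attendance, user_ids):
    """Count, per user, the events attended in a row starting at the newest event."""
    newest_first = sorted(events, key=events.get, reverse=True)
    attended = {}
    for user_id, event_id in attendance:
        attended.setdefault(user_id, set()).add(event_id)

    streaks = {}
    for user_id in user_ids:
        seen = attended.get(user_id, set())
        count = 0
        for event_id in newest_first:
            if event_id not in seen:
                break  # first missed event ends the streak
            count += 1
        streaks[user_id] = count
    return streaks


streaks = current_streaks(EVENTS, ATTENDANCE, MINUTES)
for user_id in sorted(streaks):
    print(user_id, streaks[user_id], MINUTES[user_id])
```

Under this rule a user who missed the newest event gets a streak of 0, which is worth keeping in mind when comparing set-based SQL answers against it.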
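I have built C# code that does this iteratively, but it is slow when I have to process more than 300,000 users and hundreds of events. I would love to see a SQL version of this.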
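As requested by Dan, below is the method that computes the top consecutive event viewers. The output is really just a string listing the top X event viewers.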
public string GetUsersTopWatchedConsecutiveStreams(int topUserCount)
{
    string results = "Top " + topUserCount + " consecutive viewers - ";
    Dictionary<ChatUser, int> userinfo = new Dictionary<ChatUser, int>();
    using (StorageModelContext db = new StorageModelContext())
    {
        IQueryable<ChatUser> allUsers = null;
        if (mainViewModel.CurrentStream != null)
            allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == mainViewModel.CurrentStream.Id).Select(x => x.User);
        else
            allUsers = db.ViewerHistory.Include("Stream").Include("User").Where(x => x.Stream.Id == (db.StreamHistory.OrderByDescending(s => s.StreamEnd).FirstOrDefault().Id)).Select(x => x.User);
        foreach (var u in allUsers)
        {
            int totalStreams = 0;
            var user = db.Users.Include("History").Where(x => x.UserName == u.UserName).FirstOrDefault();
            if (user != null)
            {
                var streams = user.History;
                if (streams != null)
                {
                    var allStreams = db.StreamHistory.OrderByDescending(x => x.StreamStart);
                    foreach (var s in allStreams)
                    {
                        var vs = streams.Where(x => x.Stream == s);
                        if (vs.Count() > 0)
                            totalStreams++;
                        else
                            break;
                    }
                }
            }
            userinfo.Add(u, totalStreams);
            totalStreams = 0;
        }
        var top = userinfo.OrderByDescending(x => x.Value).ThenByDescending(x => x.Key.MinutesWatched).Take(topUserCount);
        int cnt = 1;
        foreach (var t in top)
        {
            results += "#" + cnt + ": " + t.Key + "(" + t.Value.ToString() + "), ";
            cnt++;
        }
        if (cnt > 1)
            results = results.Substring(0, results.Length - 2);
    }
    return results;
}
mainViewModel.CurrentStream is null when no event is running. While a live event is in progress, it holds an object containing information about that event.
Answer 0 (score: 0)
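You seem to want the largest event ID that a person did not attend and that is less than the largest event ID the person did attend. Then you want to count the events attended after it. This assumes that event IDs increase with StartDate, as they do in the sample data.
The first query pairs each user with every event up to the latest one they attended and computes the ID of the last event they missed: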
select u.user_id, e.id as event_id,
       -- largest event id the user did NOT attend (null if none was missed)
       max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
from (select user_id, max(event_id) as max_event_id
      from attendance
      group by user_id
     ) u join
     events e
     on e.id <= u.max_event_id left join
     attendance a
     on a.user_id = u.user_id and a.event_id = e.id
order by u.user_id, e.id;
One more level of subquery should then finish the job:
select ue.user_id, count(*) as num_consecutive
from (select u.user_id, e.id as event_id,
             max(case when a.event_id is null then e.id end) over (partition by u.user_id) as max_nonmatch_event_id
      from (select user_id, max(event_id) as max_event_id
            from attendance
            group by user_id
           ) u join
           events e
           on e.id <= u.max_event_id left join
           attendance a
           on a.user_id = u.user_id and a.event_id = e.id
     ) ue
-- coalesce covers users who missed nothing (assumes event ids are positive)
where ue.event_id > coalesce(ue.max_nonmatch_event_id, 0)
group by ue.user_id;
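To check the "count the events after the last missed one" idea against the sample data, the following sketch loads the tables into an in-memory SQLite database through Python's `sqlite3` module. SQLite stands in for the real database here, which is an assumption: it needs version 3.25 or later for window functions, and its dialect differs from SQL Server. The query in the sketch takes the `max` of the missed event IDs, joins on `u.user_id`, and guards the "never missed" case with `coalesce`; it also assumes that event IDs are positive and increase with StartDate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, start_date TEXT);
    CREATE TABLE attendance (id INTEGER PRIMARY KEY, user_id INTEGER, event_id INTEGER);
    INSERT INTO events VALUES
        (1, '1st Event', '2018-07-15'), (2, '2nd Event', '2018-07-16'),
        (3, '3rd Event', '2018-07-18'), (4, '4th Event', '2018-07-20');
    INSERT INTO attendance VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4), (5, 2, 4), (6, 2, 3),
        (7, 3, 4), (8, 3, 2), (9, 3, 1), (10, 4, 4), (11, 4, 3), (12, 4, 2);
""")

LAST_MISS_SQL = """
SELECT ue.user_id, COUNT(*) AS num_consecutive
FROM (SELECT u.user_id, e.id AS event_id,
             -- largest event id the user did not attend (NULL if none missed)
             MAX(CASE WHEN a.event_id IS NULL THEN e.id END)
                 OVER (PARTITION BY u.user_id) AS max_nonmatch_event_id
      FROM (SELECT user_id, MAX(event_id) AS max_event_id
            FROM attendance
            GROUP BY user_id) u
      JOIN events e ON e.id <= u.max_event_id
      LEFT JOIN attendance a
             ON a.user_id = u.user_id AND a.event_id = e.id
     ) ue
WHERE ue.event_id > COALESCE(ue.max_nonmatch_event_id, 0)
GROUP BY ue.user_id
ORDER BY ue.user_id
"""

last_miss_rows = conn.execute(LAST_MISS_SQL).fetchall()
for user_id, num_consecutive in last_miss_rows:
    print(user_id, num_consecutive)
```

Because the inner join stops at each user's own latest attended event, this counts the streak ending at that event, not necessarily at the newest event overall.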
Answer 1 (score: 0)
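Maybe you would like to try this:
The events get a row number in descending order of StartDate, and each user's attendances also get a row number in descending order of StartDate. For consecutive attendances, the difference between the event number and the attendance number stays the same. I use these differences for grouping, count the attendances in each group, and return the group with the smallest difference (per user):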
WITH
evt (ID, StartDate, evt_no) AS (
SELECT ID, StartDate,
ROW_NUMBER() OVER (ORDER BY StartDate DESC)
FROM EVENTS
),
att ([User_ID], grp_no) AS (
SELECT [User_ID], evt_no -
ROW_NUMBER() OVER (PARTITION BY [User_ID] ORDER BY StartDate DESC)
FROM ATTENDANCE a
INNER JOIN evt ON a.Event_ID = evt.ID
),
con ([User_ID], Consecutive, rn) AS (
SELECT [User_ID], COUNT(*),
ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY grp_no)
FROM att
GROUP BY [User_ID], grp_no
)
SELECT u.ID AS [User_ID], u.UserName, u.MinutesWatched, con.Consecutive
FROM con
INNER JOIN USERS u ON con.[User_ID] = u.ID
WHERE con.rn = 1;
I would be interested to know how long this query takes to run on your system.
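The row-number difference trick is easier to see with concrete numbers. The sketch below redoes the same grouping in plain Python on the sample data; the function name `streaks_by_rank_difference` and the data layout are assumptions made for illustration, and it is not a replacement for the query.

```python
from collections import Counter

# Sample data from the question; ISO date strings sort chronologically.
EVENT_DATES = {1: "2018-07-15", 2: "2018-07-16", 3: "2018-07-18", 4: "2018-07-20"}
ATTENDANCE = [  # (User_ID, Event_ID)
    (1, 1), (1, 2), (1, 3), (1, 4),
    (2, 4), (2, 3),
    (3, 4), (3, 2), (3, 1),
    (4, 4), (4, 3), (4, 2),
]


def streaks_by_rank_difference(event_dates, attendance):
    """Group each user's attendances by (event rank - attendance rank)."""
    # evt_no: 1 for the newest event, 2 for the next newest, and so on.
    newest_first = sorted(event_dates, key=event_dates.get, reverse=True)
    evt_no = {event_id: n for n, event_id in enumerate(newest_first, start=1)}

    ranks_by_user = {}
    for user_id, event_id in attendance:
        ranks_by_user.setdefault(user_id, []).append(evt_no[event_id])

    result = {}
    for user_id, ranks in ranks_by_user.items():
        ranks.sort()  # newest attendance first
        # The difference is constant within a run of consecutive events.
        groups = Counter(rank - att_no for att_no, rank in enumerate(ranks, start=1))
        result[user_id] = groups[min(groups)]  # size of the most recent run
    return result


rank_streaks = streaks_by_rank_difference(EVENT_DATES, ATTENDANCE)
print(rank_streaks)
```

For user 3 the event ranks are 1, 3, 4 and the attendance ranks are 1, 2, 3, so the differences are 0, 1, 1: one group of size 1 (the newest event) and one of size 2. Note that the smallest difference is 0 only when the user attended the newest event; otherwise the group returned is the streak that ends at the user's own latest attendance.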