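I'm hoping someone can give me an idea or a solution for the problem below. I've been trying to figure out how to find the percentage of time a player has played with another player, based on start time, end time, and period (or any other stat I haven't mentioned). I can sum each player's duration in a pivot table to see that player's total time on ice, but for the life of me I can't figure out the other part. Whether it takes R, Excel, or a Python script, I've run out of ideas. I know this isn't a straightforward scripting question, but I couldn't think of a better place to ask. I can clearly see that Suter and Dumba in Example 1 played together twice in the small data snippet I've provided. Plotting that on a chart, or just finding the percentage, is where I'm asking for ideas. Below are 2 examples of how I can access the OnIce data.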
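Example 1. At the end of a period/game, I can get the line shift data for what was just played. A player's LastName will appear multiple times throughout the df.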
LastName StartTime EndTime Duration ShiftNumber Period
Foligno 0:00 0:40 0:40 1 1
Suter 0:00 0:40 0:40 1 1
Staal 0:00 0:40 0:40 1 1
Niederreiter 0:00 0:40 0:40 1 1
Dubnyk 0:00 20:00 20:00 1 1
Dumba 0:00 0:40 0:40 1 1
Zucker 0:40 1:26 0:46 1 1
Koivu 0:40 1:34 0:54 1 1
Murphy 0:40 1:26 0:46 1 1
Brodin 0:40 1:26 0:46 1 1
Granlund 0:40 1:39 0:59 1 1
Reilly 1:26 2:09 0:43 1 1
Winnik 1:26 2:18 0:52 1 1
Coyle 1:34 2:16 0:42 1 1
Stewart 1:39 2:13 0:34 1 1
Dumba 2:09 2:39 0:30 2 1
Suter 2:09 2:39 0:30 2 1
Example 2. I can run a script every few seconds and save to a CSV which player IDs are OnIce at that moment.
HomePlayerId HomeDuration
8475744 94
8471702 74
8477944 69
8475163 74
8474651 623
8477043 74
HomePlayerId HomeDuration
8475744 111
8471702 91
8477944 86
8475163 91
8474651 640
8477043 91
Answer 0 (score: 1)
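The plain Python code below computes the total overlap time for each pair of players. The core idea is that, given an interval (start1, end1) for player1 and an interval (start2, end2) for player2, the overlap of the two intervals is
overlap = min(end1, end2) - max(start1, start2)
If overlap <= 0, the intervals do not overlap. We need to perform that calculation for every pair of intervals for every pair of players.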
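As a quick sanity check of the formula, here is a minimal sketch that wraps it in a helper. The name overlap_seconds is hypothetical (it does not appear in the answer's code), and the inputs are shift times from the sample data, already converted to seconds.

```python
def overlap_seconds(start1, end1, start2, end2):
    """Seconds shared by two intervals; 0 when they do not overlap."""
    # Clamp at 0 so disjoint or merely touching intervals contribute nothing
    return max(0, min(end1, end2) - max(start1, start2))

# Coyle's shift (94-136) and Dumba's second shift (129-159)
print(overlap_seconds(94, 136, 129, 159))  # 7

# Reilly's shift ends at 129, exactly when Dumba's second shift starts
print(overlap_seconds(86, 129, 129, 159))  # 0
```

Clamping with max(0, ...) means the caller can add the result to a running total without a separate check for non-overlapping intervals.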
from itertools import combinations, product
#LastName StartTime EndTime Duration ShiftNumber Period
data = '''\
Foligno 0:00 0:40 0:40 1 1
Suter 0:00 0:40 0:40 1 1
Staal 0:00 0:40 0:40 1 1
Niederreiter 0:00 0:40 0:40 1 1
Dubnyk 0:00 20:00 20:00 1 1
Dumba 0:00 0:40 0:40 1 1
Zucker 0:40 1:26 0:46 1 1
Koivu 0:40 1:34 0:54 1 1
Murphy 0:40 1:26 0:46 1 1
Brodin 0:40 1:26 0:46 1 1
Granlund 0:40 1:39 0:59 1 1
Reilly 1:26 2:09 0:43 1 1
Winnik 1:26 2:18 0:52 1 1
Coyle 1:34 2:16 0:42 1 1
Stewart 1:39 2:13 0:34 1 1
Dumba 2:09 2:39 0:30 2 1
Suter 2:09 2:39 0:30 2 1
'''.splitlines()
def to_secs(ms):
    ''' Convert a mm:ss string to seconds '''
    m, s = map(int, ms.split(':'))
    return 60 * m + s
# Store a list of (start, end) times for each player
players = {}
for row in data:
    name, start, end = row.split(None, 3)[:3]
    times = to_secs(start), to_secs(end)
    players.setdefault(name, []).append(times)
for t in players.items():
    print(t)
print()
# Determine the amount of overlapping time for each pair of players
for p1, p2 in combinations(sorted(players), 2):
    total = 0
    # Check each pair of times for this pair of players
    for t1, t2 in product(players[p1], players[p2]):
        # Compute the overlap in this pair of times and
        # add it to the total for this pair of players
        start, end = zip(t1, t2)
        total += max(0, min(end) - max(start))
    if total:
        print(p1, p2, total)
Output
('Foligno', [(0, 40)])
('Suter', [(0, 40), (129, 159)])
('Staal', [(0, 40)])
('Niederreiter', [(0, 40)])
('Dubnyk', [(0, 1200)])
('Dumba', [(0, 40), (129, 159)])
('Zucker', [(40, 86)])
('Koivu', [(40, 94)])
('Murphy', [(40, 86)])
('Brodin', [(40, 86)])
('Granlund', [(40, 99)])
('Reilly', [(86, 129)])
('Winnik', [(86, 138)])
('Coyle', [(94, 136)])
('Stewart', [(99, 133)])
Brodin Dubnyk 46
Brodin Granlund 46
Brodin Koivu 46
Brodin Murphy 46
Brodin Zucker 46
Coyle Dubnyk 42
Coyle Dumba 7
Coyle Granlund 5
Coyle Reilly 35
Coyle Stewart 34
Coyle Suter 7
Coyle Winnik 42
Dubnyk Dumba 70
Dubnyk Foligno 40
Dubnyk Granlund 59
Dubnyk Koivu 54
Dubnyk Murphy 46
Dubnyk Niederreiter 40
Dubnyk Reilly 43
Dubnyk Staal 40
Dubnyk Stewart 34
Dubnyk Suter 70
Dubnyk Winnik 52
Dubnyk Zucker 46
Dumba Foligno 40
Dumba Niederreiter 40
Dumba Staal 40
Dumba Stewart 4
Dumba Suter 70
Dumba Winnik 9
Foligno Niederreiter 40
Foligno Staal 40
Foligno Suter 40
Granlund Koivu 54
Granlund Murphy 46
Granlund Reilly 13
Granlund Winnik 13
Granlund Zucker 46
Koivu Murphy 46
Koivu Reilly 8
Koivu Winnik 8
Koivu Zucker 46
Murphy Zucker 46
Niederreiter Staal 40
Niederreiter Suter 40
Reilly Stewart 30
Reilly Winnik 43
Staal Suter 40
Stewart Suter 4
Stewart Winnik 34
Suter Winnik 9
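The totals above are in seconds, while the question asks for a percentage. One possible way to get there, sketched below, is to divide a pair's shared time by one player's total ice time. This sketch makes two assumptions that the answer does not state: the StartTime/EndTime clock restarts every period, so only shifts in the same Period are compared, and the percentage is taken relative to the first player's own ice time, so it is not symmetric. The names shifts, ice_time, shared_time and pct_together are hypothetical, and rows is a small subset of the sample data.

```python
from collections import defaultdict
from itertools import combinations

# (LastName, StartTime, EndTime, Period) rows taken from the sample data
rows = [
    ("Suter", "0:00", "0:40", 1),
    ("Dumba", "0:00", "0:40", 1),
    ("Winnik", "1:26", "2:18", 1),
    ("Dumba", "2:09", "2:39", 1),
    ("Suter", "2:09", "2:39", 1),
]

def to_secs(ms):
    """Convert a mm:ss string to seconds."""
    m, s = map(int, ms.split(":"))
    return 60 * m + s

# shifts[name] is a list of (period, start, end) tuples, times in seconds
shifts = defaultdict(list)
for name, start, end, period in rows:
    shifts[name].append((period, to_secs(start), to_secs(end)))

def ice_time(name):
    """Total seconds on ice for one player."""
    return sum(end - start for _, start, end in shifts[name])

def shared_time(p1, p2):
    """Total seconds p1 and p2 were on the ice at the same time."""
    total = 0
    for per1, s1, e1 in shifts[p1]:
        for per2, s2, e2 in shifts[p2]:
            # Assumption: clocks restart each period, so compare
            # only shifts that belong to the same period
            if per1 == per2:
                total += max(0, min(e1, e2) - max(s1, s2))
    return total

def pct_together(p1, p2):
    """Percentage of p1's ice time that was spent with p2."""
    return 100 * shared_time(p1, p2) / ice_time(p1)

for p1, p2 in combinations(sorted(shifts), 2):
    print(p1, p2, round(pct_together(p1, p2), 1), round(pct_together(p2, p1), 1))
```

Each printed line shows the pair followed by two percentages: the share of the first player's ice time spent with the second, then the reverse. A goalie such as Dubnyk, who is on the ice for the whole period, illustrates why the two numbers can differ widely.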