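My data is essentially a series of laps, each with its own elapsed time, but I want to compute the total elapsed time.
Here is some code with similar data: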
import pandas as pd
import numpy as np
laptime = pd.Series([1,2,3,4,5,1,2,3,4,5,1,2,3,4,5])
lap = pd.Series([1,1,1,1,1,2,2,2,2,2,3,3,3,3,3])
timeblocks = pd.DataFrame({'laptime': laptime, 'lap': lap})
timeblocks['timediff'] = timeblocks.laptime.diff()
timeblocks['elapsed'] = pd.Series([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])
timeblocks
The resulting data looks like this:
lap laptime timediff elapsed
0 1 1 NaN 1
1 1 2 1.0 2
2 1 3 1.0 3
3 1 4 1.0 4
4 1 5 1.0 5
5 2 1 -4.0 6
6 2 2 1.0 7
7 2 3 1.0 8
8 2 4 1.0 9
9 2 5 1.0 10
10 3 1 -4.0 11
11 3 2 1.0 12
12 3 3 1.0 13
13 3 4 1.0 14
14 3 5 1.0 15
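The elapsed column is what I actually need to compute. I have tried various combinations of diff and cumsum, but I am stuck.
The real-world data looks more like this: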
113.81201171875 1
113.86206054688 1
113.912109375 1
113.96215820313 1
0.05126953125 2
0.101318359375 2
0.1513671875 2
In the real-world data, the sampling interval is about 0.05 seconds.
Answer 0 (score: 0)
import io, operator, itertools
Assuming the data is in a text file or a multi-line string:
s = '''113.81201171875 1
113.86206054688 1
113.912109375 1
113.96215820313 1
0.05126953125 2
0.101318359375 2
0.1513671875 2'''
f = io.StringIO(s)
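Collect the data into a list; sort the list by lap, then by time; group the data by lap and extract the minimum and maximum times; compute the elapsed time of each lap; accumulate.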
# Parse each line into a (time, lap) tuple.
data = []
for line in f:
    t, lap_no = map(float, line.strip().split())
    data.append((t, lap_no))
lap = operator.itemgetter(1)
time = operator.itemgetter(0)
# Sort by lap first, then by time within each lap.
data.sort(key = operator.itemgetter(1,0))
total = 0
for el, times in itertools.groupby(data, lap):
    # First and last time of the lap; needs at least two samples per lap.
    low, *_, high = map(time, times)
    elapsed = high - low
    print(f'lap {el}, elapsed time: {elapsed}')
    total += elapsed
print(f'total elapsed time: {total}')
>>>
lap 1.0, elapsed time: 0.15014648438000222
lap 2.0, elapsed time: 0.10009765625
total elapsed time: 0.2502441406300022
>>>
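If a running elapsed column is wanted rather than one total, the same idea can be expressed in pandas. The following is only a sketch on the toy data from the question, and it rests on one assumption that the question does not confirm: each lap's clock restarts at 0, so the first sample of a lap contributes its own laptime to the running total. The name step is just a local helper.

```python
import pandas as pd

# Toy data from the question: three laps, laptime restarts on every lap.
timeblocks = pd.DataFrame({
    'lap':     [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    'laptime': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
})

# Time step between consecutive samples within each lap.
# The first row of every lap has no predecessor, so it is NaN.
step = timeblocks.groupby('lap')['laptime'].diff()

# Assumption: the lap clock restarts at 0, so the first sample of a lap
# is itself the step since the previous lap ended.
step = step.fillna(timeblocks['laptime'])

# The running sum of the steps is the total elapsed time.
timeblocks['elapsed'] = step.cumsum()
print(timeblocks)
```

On the toy data this reproduces the elapsed column 1 through 15 from the question. With the real-world sample, where the recording begins partway through lap 1, the first value (113.81...) would be counted in full; subtracting the first laptime from the result would make the series start at zero instead.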