I am a Python user. I have an Excel sheet like this:
time size
2017-08-16 00:00:00 12
2017-08-16 00:01:00 12
2017-08-16 00:02:00 24
2017-08-16 00:03:00 24
2017-08-16 00:04:00 36
2017-08-16 00:05:00 24
2017-08-16 00:06:00 36
2017-08-16 00:07:00 24
2017-08-16 00:08:00 24
2017-08-16 00:09:00 24
I want to compute the time span between the most recent identical numbers, as shown below:
time size timespan
2017-08-16 00:00:00 12 0
2017-08-16 00:01:00 12 60
2017-08-16 00:02:00 24 0
2017-08-16 00:03:00 24 60
2017-08-16 00:04:00 36 0
2017-08-16 00:05:00 24 0
2017-08-16 00:06:00 36 0
2017-08-16 00:07:00 24 0
2017-08-16 00:08:00 24 0
2017-08-16 00:09:00 24 120
Note that the 24 in the middle is ignored. A solution that works in pandas would be best.
Answer 0 (score: 1)
Here I assume you first export the Excel file to CSV, for example time.csv:
time,size
2017-08-16 00:00:00, 12
2017-08-16 00:01:00, 12
2017-08-16 00:02:00, 24
2017-08-16 00:03:00, 24
2017-08-16 00:04:00, 36
2017-08-16 00:05:00, 24
2017-08-16 00:06:00, 36
2017-08-16 00:07:00, 24
2017-08-16 00:08:00, 24
2017-08-16 00:09:00, 24
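The solution is as follows. The main idea is that a result value needs to be computed only when size is the same as the previous row but different from the next row, that is, on the last row of a run of identical values.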
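import pandas as pd
from datetime import datetime

# Load the exported CSV; skipinitialspace drops the blank after each comma
a = pd.read_csv('time.csv', skipinitialspace=True)
times = [datetime.strptime(x, '%Y-%m-%d %H:%M:%S') for x in a['time']]
# Append a sentinel so that aa[i + 1] is valid on the last row
aa = list(a['size']) + [None]
res = [0] * len(a)
prev = None
for i, x in enumerate(a['size']):
    if x != prev:
        # A new run starts here; remember when it began
        begin_time = times[i]
    elif x != aa[i + 1]:
        # Last row of a run: seconds elapsed since the run began
        res[i] = int((times[i] - begin_time).total_seconds())
    prev = x
print(res)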
The output is [0, 60, 0, 60, 0, 0, 0, 0, 0, 120]
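
Since the question asks for something that works in pandas, the same idea can probably also be written without an explicit loop. The following is only a sketch under a few assumptions: the helper name add_timespan is made up for illustration, the data is read from an inline string instead of time.csv so the example is self-contained, and a "run" is taken to mean consecutive rows with the same size.

```python
import io

import pandas as pd

# Inline copy of the sample data (stands in for time.csv)
csv_text = """time,size
2017-08-16 00:00:00,12
2017-08-16 00:01:00,12
2017-08-16 00:02:00,24
2017-08-16 00:03:00,24
2017-08-16 00:04:00,36
2017-08-16 00:05:00,24
2017-08-16 00:06:00,36
2017-08-16 00:07:00,24
2017-08-16 00:08:00,24
2017-08-16 00:09:00,24
"""


def add_timespan(df):
    """Hypothetical helper: add a 'timespan' column (seconds) to a copy of df."""
    out = df.copy()
    out["time"] = pd.to_datetime(out["time"])
    # Start a new run id every time size differs from the previous row
    run_id = (out["size"] != out["size"].shift()).cumsum().rename("run_id")
    # Timestamp of the first row of each run, broadcast to every row of the run
    run_start = out.groupby(run_id)["time"].transform("first")
    # A row closes its run when the next row belongs to a different run
    is_last = run_id != run_id.shift(-1)
    span = (out["time"] - run_start).dt.total_seconds().astype(int)
    # Keep the span only on the closing row of each run, 0 everywhere else
    out["timespan"] = span.where(is_last, 0)
    return out


df = pd.read_csv(io.StringIO(csv_text))
result = add_timespan(df)
print(result["timespan"].tolist())
# [0, 60, 0, 60, 0, 0, 0, 0, 0, 120]
```

The shift/cumsum step labels each block of consecutive equal values, so a run of length one gets a span of 0 automatically and the middle row of the final 24, 24, 24 run stays 0, as in the expected output.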