I have a DataFrame df1 like this, where StartTime and EndTime are datetime objects.
StartTime EndTime
9:08 9:10
9:10 9:35
9:35 9:55
9:55 10:10
10:10 10:20
If EndTime.hour differs from StartTime.hour, I want to split the interval at the hour boundary, like this:
StartTime EndTime
9:08 9:10
9:10 9:35
9:35 9:55
9:55 10:00
10:00 10:10
10:10 10:20
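Basically, I want to insert a row into the existing DataFrame df1. I have looked at many examples but have not figured out how to do it. If my question is unclear, please let me know.
Thanks
Answer 0 (score: 0)
This does what you want...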
# load your data into a DataFrame
data="""StartTime EndTime
9:08 9:10
9:10 9:35
9:35 9:55
9:55 10:10
10:10 10:20
"""
import pandas as pd
from io import StringIO # on Python 2: from StringIO import StringIO
df = pd.read_csv(StringIO(data), header=0, sep=' ', index_col=None)
# convert strings to Pandas Timestamps (we will ignore the date bit) ...
import datetime as dt
df.StartTime = [dt.datetime.strptime(x, '%H:%M') for x in df.StartTime]
df.EndTime = [dt.datetime.strptime(x, '%H:%M') for x in df.EndTime]
# assumption - all intervals are less than 60 minutes
# - ie. no multi-hour intervals
# add rows
dfa = df[df.StartTime.dt.hour != df.EndTime.dt.hour].copy()
dfa.EndTime = [dt.datetime.strptime(str(x), '%H') for x in dfa.EndTime.dt.hour]
# where the hours differ, move the start to the top of the end hour ...
df.StartTime = df.StartTime.where(df.StartTime.dt.hour == df.EndTime.dt.hour,
other=df.EndTime.dt.floor('h'))
# bring back together and sort
df = pd.concat([df, dfa], axis=0) #top/bottom
df = df.sort_values('StartTime')
# convert the Timestamps to times for easy reading
df.StartTime = [x.time() for x in df.StartTime]
df.EndTime = [x.time() for x in df.EndTime]
which yields:
In [40]: df
Out[40]:
StartTime EndTime
0 09:08:00 09:10:00
1 09:10:00 09:35:00
2 09:35:00 09:55:00
3 09:55:00 10:00:00
3 10:00:00 10:10:00
4 10:10:00 10:20:00
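The answer above assumes that no interval is longer than an hour, and it leaves a duplicated index label (3) in the result. As a rough sketch of one way to lift that assumption, the loop below splits each interval at every full hour that falls strictly inside it and rebuilds the frame with a fresh index. The function name split_at_hour_boundaries and its parameters are made up for this illustration, and it assumes both columns already hold pandas Timestamps.

```python
import pandas as pd


def split_at_hour_boundaries(df, start="StartTime", end="EndTime"):
    """Split each interval at every full hour that lies strictly inside it."""
    rows = []
    for s, e in zip(df[start], df[end]):
        # first full hour strictly after the interval start
        cut = s.floor("h") + pd.Timedelta(hours=1)
        while cut < e:
            rows.append((s, cut))
            s = cut
            cut = cut + pd.Timedelta(hours=1)
        # remaining piece (or the whole interval if no boundary was crossed)
        rows.append((s, e))
    return pd.DataFrame(rows, columns=[start, end])


# sample data from the question, parsed as Timestamps (the date part is ignored)
raw = pd.DataFrame({
    "StartTime": ["9:08", "9:10", "9:35", "9:55", "10:10"],
    "EndTime": ["9:10", "9:35", "9:55", "10:10", "10:20"],
})
df1 = raw.apply(lambda col: pd.to_datetime(col, format="%H:%M"))

result = split_at_hour_boundaries(df1)
print(result.apply(lambda col: col.dt.strftime("%H:%M")))
```

A plain Python loop is slower than the vectorised approach on large frames, but it handles intervals that span several hours and does not produce a zero-length row when an interval ends exactly on the hour.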