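I'm having trouble writing code that uses datetime. I've put together a sample of what I'm working on: each row should be repeated once per minute of its Duration, with Start_time advancing by one minute each time. Can someone help me with the code?
Input: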
Name, Channel, Duration, Start_time
John, A, 2, 16:00:00
Joseph, B, 3, 15:05:00
Output:
Name, Channel, Duration, Start_time
John, A, 2, 16:00:00
John, A, 2, 16:01:00
Joseph, B, 3, 15:05:00
Joseph, B, 3, 15:06:00
Joseph, B, 3, 15:07:00
Thanks.
Answer 0 (score: 0)
Use:
df['Start_time'] = pd.to_timedelta(df['Start_time'])
df = df.loc[df.index.repeat(df['Duration'])]
td = pd.to_timedelta(df.groupby(level=0).cumcount() * 60, unit='s')
df['Start_time'] = df['Start_time'] + td
df = df.reset_index(drop=True)
print (df)
Name Channel Duration Start_time
0 John A 2 16:00:00
1 John A 2 16:01:00
2 Joseph B 3 15:05:00
3 Joseph B 3 15:06:00
4 Joseph B 3 15:07:00
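To make the intermediate steps of the code above easier to follow, here is a minimal sketch that rebuilds the question's sample data and inspects what the repeat and cumcount steps produce. The variable names (repeated_index, expanded, counter, offsets) are only illustrative and do not appear in the answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["John", "Joseph"],
    "Channel": ["A", "B"],
    "Duration": [2, 3],
    "Start_time": ["16:00:00", "15:05:00"],
})

# Each index label is repeated Duration times: 0, 0, 1, 1, 1.
repeated_index = df.index.repeat(df["Duration"])

# Selecting by the repeated labels duplicates the rows.
expanded = df.loc[repeated_index]

# cumcount numbers the rows within each original label: 0, 1, 0, 1, 2.
counter = expanded.groupby(level=0).cumcount()

# Multiplying by 60 seconds turns the counter into per-row minute offsets.
offsets = pd.to_timedelta(counter * 60, unit="s")

print(list(repeated_index))
print(counter.tolist())
print(offsets.tolist())
```

The offsets are what gets added to Start_time, so the first copy of each row keeps its original time and each later copy moves forward by one more minute.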
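Explanation:
Convert the Start_time column with to_timedelta.
Repeat the index values according to the Duration column with repeat, then repeat the rows with loc.
Create a counter per index value with cumcount, convert it to 1-minute timedeltas, and add it to the repeated Start_time column.
Finally, call reset_index with the parameter drop=True to avoid duplicated index values.
EDIT:
If you want datetimes in the output, the solution is the same; just convert the values with to_datetime first: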
df['Start_time'] = pd.to_datetime(df['Start_time'])
df = df.loc[df.index.repeat(df['Duration'])]
td = pd.to_timedelta(df.groupby(level=0).cumcount() * 60, unit='s')
df['Start_time'] = df['Start_time'] + td
df = df.reset_index(drop=True)
print (df)
Name Channel Duration Start_time
0 John A 2 2018-11-19 16:00:00
1 John A 2 2018-11-19 16:01:00
2 Joseph B 3 2018-11-19 15:05:00
3 Joseph B 3 2018-11-19 15:06:00
4 Joseph B 3 2018-11-19 15:07:00
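If the goal is to get back plain HH:MM:SS text, as in the question's expected output, one possible way is to wrap the same steps in a function and format the result at the end. This is only a sketch: the function name expand_minutes is hypothetical, and it assumes Start_time always has the form HH:MM:SS.

```python
import pandas as pd

def expand_minutes(df):
    """Return one row per minute of Duration, with Start_time as HH:MM:SS text."""
    out = df.copy()
    # Parse with an explicit format (assumption: values look like HH:MM:SS).
    # The date part defaults to 1900-01-01 and is dropped again below.
    out["Start_time"] = pd.to_datetime(out["Start_time"], format="%H:%M:%S")
    # Repeat each row Duration times.
    out = out.loc[out.index.repeat(out["Duration"])]
    # 0, 1, 2, ... within each original row, converted to minute offsets.
    td = pd.to_timedelta(out.groupby(level=0).cumcount() * 60, unit="s")
    out["Start_time"] = out["Start_time"] + td
    # Keep only the time of day as text.
    out["Start_time"] = out["Start_time"].dt.strftime("%H:%M:%S")
    return out.reset_index(drop=True)

sample = pd.DataFrame({
    "Name": ["John", "Joseph"],
    "Channel": ["A", "B"],
    "Duration": [2, 3],
    "Start_time": ["16:00:00", "15:05:00"],
})
result = expand_minutes(sample)
print(result)
```

Because only the time of day is kept, a row that runs past midnight would wrap around to 00:00:00 without any indication of the date change.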
Answer 1 (score: 0)
Use:
df['dates'] = df.apply(lambda x: list(pd.date_range(start=x['Start_time'], periods=x['Duration'], freq='1min')), axis=1)
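# axis is passed by keyword; the positional form drop(labels, 1) no longer works in pandas 2.0+
df.set_index(['Name','Channel','Duration', 'Start_time'])['dates'].apply(pd.Series).stack().reset_index().drop(['level_4','Start_time'], axis=1).rename(columns={0:'Start_time'})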
Output
Name Channel Duration Start_time
0 John A 3 2018-11-19 16:00:00
1 John A 3 2018-11-19 16:01:00
2 John A 3 2018-11-19 16:02:00
3 Joseph B 4 2018-11-19 15:05:00
4 Joseph B 4 2018-11-19 15:06:00
5 Joseph B 4 2018-11-19 15:07:00
6 Joseph B 4 2018-11-19 15:08:00
Explanation
pd.date_range() is applied to the Start_time and Duration of each row of df.
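The same per-row date_range idea can probably be written more compactly with DataFrame.explode, which is a different technique from the set_index/stack chain above and assumes pandas 0.25 or newer. The sketch below uses the question's sample data; the names ranges and out are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Joseph"],
    "Channel": ["A", "B"],
    "Duration": [2, 3],
    "Start_time": ["16:00:00", "15:05:00"],
})

# Parse the clock strings (assumption: HH:MM:SS); the date defaults to 1900-01-01.
starts = pd.to_datetime(df["Start_time"], format="%H:%M:%S")

# One list of per-minute timestamps for each row.
ranges = [
    list(pd.date_range(start=start, periods=n, freq="1min"))
    for start, n in zip(starts, df["Duration"])
]
df["Start_time"] = pd.Series(ranges, index=df.index)

# explode turns each list element into its own row.
out = df.explode("Start_time").reset_index(drop=True)

# explode leaves an object column, so convert it back to datetimes.
out["Start_time"] = pd.to_datetime(out["Start_time"])
print(out)
```

This avoids the intermediate dates column and the level_4 cleanup, at the cost of requiring a pandas version that provides explode.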