Question

I'm stuck on a problem writing code with datetime. I've put together the scenario I'm working on below. Can someone help me with the code?

Input:

Name, Channel, Duration, Start_time
John, A, 2, 16:00:00
Joseph, B, 3, 15:05:00

Output:

Name, Channel, Duration, Start_time
John, A, 2, 16:00:00
John, A, 2, 16:01:00
Joseph, B, 3, 15:05:00
Joseph, B, 3, 15:06:00
Joseph, B, 3, 15:07:00

Thanks.


Answer 1

Use:

df['Start_time'] = pd.to_timedelta(df['Start_time'])
df = df.loc[df.index.repeat(df['Duration'])]
td = pd.to_timedelta(df.groupby(level=0).cumcount() * 60, unit='s')

df['Start_time'] = df['Start_time'] + td
df = df.reset_index(drop=True)

print (df)
     Name Channel  Duration Start_time
0    John       A         2   16:00:00
1    John       A         2   16:01:00
2  Joseph       B         3   15:05:00
3  Joseph       B         3   15:06:00
4  Joseph       B         3   15:07:00

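Explanation:

First convert the column Start_time with to_timedelta
Then repeat the index values by the column Duration with repeat, and repeat the rows with loc
Create a counter per index value with cumcount, convert it to 1-minute timedeltas, and add it to the new repeated column Start_time
Finally, reset_index with the parameter drop=True avoids duplicated index values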
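
The steps above can be sketched with the intermediate values made explicit. This is only an illustration: the DataFrame is built by hand from the question's sample input, and the names `repeated_index`, `expanded`, `counter`, and `offsets` are introduced here for readability and do not appear in the original answer.

```python
import pandas as pd

# Sample input from the question, built by hand for illustration.
df = pd.DataFrame({
    "Name": ["John", "Joseph"],
    "Channel": ["A", "B"],
    "Duration": [2, 3],
    "Start_time": ["16:00:00", "15:05:00"],
})

# Step 1: parse the time strings into timedeltas.
df["Start_time"] = pd.to_timedelta(df["Start_time"])

# Step 2: repeat each index label Duration times, then select those rows.
repeated_index = df.index.repeat(df["Duration"])   # labels 0, 0, 1, 1, 1
expanded = df.loc[repeated_index].copy()

# Step 3: number the copies of each original row (0, 1, 2, ...) and
# turn that counter into offsets of one minute each.
counter = expanded.groupby(level=0).cumcount()
offsets = pd.to_timedelta(counter * 60, unit="s")
expanded["Start_time"] = expanded["Start_time"] + offsets

# Step 4: replace the duplicated index labels with a fresh 0..n-1 index.
expanded = expanded.reset_index(drop=True)
print(expanded)
```

The addition in step 3 works because `expanded["Start_time"]` and `offsets` carry the same (duplicated) index, so the values line up row by row.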

Edit:

If you want datetimes in the output, the solution is the same; just convert the values with to_datetime first:

df['Start_time'] = pd.to_datetime(df['Start_time'])
df = df.loc[df.index.repeat(df['Duration'])]
td = pd.to_timedelta(df.groupby(level=0).cumcount() * 60, unit='s')

df['Start_time'] = df['Start_time'] + td
df = df.reset_index(drop=True)
print (df)
     Name Channel  Duration          Start_time
0    John       A         2 2018-11-19 16:00:00
1    John       A         2 2018-11-19 16:01:00
2  Joseph       B         3 2018-11-19 15:05:00
3  Joseph       B         3 2018-11-19 15:06:00
4  Joseph       B         3 2018-11-19 15:07:00
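
If only the time of day is wanted, as in the question's expected output, one option is to format the datetimes back into strings with `Series.dt.strftime`. A minimal sketch, assuming a small hand-built frame named `expanded` (a hypothetical name) that already holds the expanded datetimes:

```python
import pandas as pd

# Hypothetical frame holding already-expanded datetimes (first two rows only).
expanded = pd.DataFrame({
    "Name": ["John", "John"],
    "Start_time": pd.to_datetime(["2018-11-19 16:00:00", "2018-11-19 16:01:00"]),
})

# Drop the date part by formatting each datetime as an HH:MM:SS string.
expanded["Start_time"] = expanded["Start_time"].dt.strftime("%H:%M:%S")
print(expanded)
```

Note that the column then holds plain strings, so any further time arithmetic would need another conversion.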

Answer 2

Use:

df['dates'] = df.apply(lambda x: list(pd.date_range(start=x['Start_time'], periods=x['Duration'], freq='1min')), axis=1)
df.set_index(['Name','Channel','Duration', 'Start_time'])['dates'].apply(pd.Series).stack().reset_index().drop(['level_4','Start_time'], axis=1).rename(columns={0:'Start_time'})

Output

    Name    Channel Duration    Start_time
0   John    A   3   2018-11-19 16:00:00
1   John    A   3   2018-11-19 16:01:00
2   John    A   3   2018-11-19 16:02:00
3   Joseph  B   4   2018-11-19 15:05:00
4   Joseph  B   4   2018-11-19 15:06:00
5   Joseph  B   4   2018-11-19 15:07:00
6   Joseph  B   4   2018-11-19 15:08:00

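Explanation

Apply pd.date_range() to Start_time and Duration to build one list of timestamps per row
Use the second line to expand those lists into the df, one row per timestamp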
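
The same two steps can also be sketched with `DataFrame.explode` (available since pandas 0.25) in place of the `apply(pd.Series).stack()` chain. This is a swapped-in alternative, not the answer's own code. It assumes the question's sample input, and a fixed date of 2018-11-19 is added to the start times so the result does not depend on the day the code runs. The name `out` is hypothetical.

```python
import pandas as pd

# Sample input from the question; the date is an assumption for reproducibility.
df = pd.DataFrame({
    "Name": ["John", "Joseph"],
    "Channel": ["A", "B"],
    "Duration": [2, 3],
    "Start_time": ["2018-11-19 16:00:00", "2018-11-19 15:05:00"],
})

# Step 1: build one list of minute-spaced timestamps per row.
df["Start_time"] = pd.Series(
    [
        list(pd.date_range(start=start, periods=periods, freq="1min"))
        for start, periods in zip(df["Start_time"], df["Duration"])
    ],
    index=df.index,
)

# Step 2: turn each list element into its own row, then renumber the index.
out = df.explode("Start_time").reset_index(drop=True)

# explode leaves an object column, so convert it back to datetime64.
out["Start_time"] = pd.to_datetime(out["Start_time"])
print(out)
```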

Create rows according to duration using datetime in pandas
