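I have a CSV file with data. I want to write code whose start time is the first time in the CSV file's time column, and that start time should equal 0. From there, the time increases hour by hour until the start time of the next day. After that, the time becomes 0 again and increases hour by hour until the start time of the following day. This process continues.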
import datetime
import pandas as pd

# `data` is the DataFrame read from the CSV file, with 'date' and 'time' columns
time_interval = 3600  # in seconds
date_array = []
date_array.append(pd.to_datetime(data['date'][0]).date())
start_time = []
end_time = []
temp_date = pd.to_datetime(data['date'][0]).date()
start_time = 0
for i in range(len(data['date'])):
    cur_date = pd.to_datetime(data['date'][i]).date()
    # A new date begins: record the last time seen on the previous date
    if cur_date > temp_date:
        end_time.append(pd.to_datetime(data['time'][i-1], format='%H:%M:%S').time())
        start_time = 0
        date_array.append(cur_date)
        temp_date = cur_date
# Record the last time of the final date
end_time.append(pd.to_datetime(data['time'][len(data['date'])-1], format='%H:%M:%S').time())
datetime_array = []
for i in range(len(date_array)):
    s_time = start_time
    e_time = datetime.datetime.combine(date_array[i], end_time[i])
print(datetime_array)
Answer 0 (score: 2)
This is what you are looking for:
import pandas as pd

df = pd.DataFrame([
    ["10/3/2018"],
    ["10/3/2018"],
    ["10/3/2018"],
    ["10/3/2018"],
    ["10/3/2018"],
    ["10/3/2018"],
    ["10/4/2018"],
    ["10/4/2018"],
    ["10/4/2018"],
    ["10/4/2018"],
], columns=['date'])

# The format is day/month/year, so "10/3/2018" parses as 10 March 2018
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')

start_time = '6:00:00'
df['time'] = pd.to_timedelta(start_time)

# Number the rows within each date (0, 1, 2, ...) and treat the count as hours
increment = pd.to_timedelta(df.groupby('date').cumcount(), unit='h')
df['time'] = df['time'] + increment
print(df)
Output:
        date     time
0 2018-03-10 06:00:00
1 2018-03-10 07:00:00
2 2018-03-10 08:00:00
3 2018-03-10 09:00:00
4 2018-03-10 10:00:00
5 2018-03-10 11:00:00
6 2018-04-10 06:00:00
7 2018-04-10 07:00:00
8 2018-04-10 08:00:00
9 2018-04-10 09:00:00
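Note that with format='%d/%m/%Y' the two dates are read as 10 March and 10 April, a month apart. If the CSV actually stores month/day/year (an assumption; the question does not say), the same idea can be sketched with '%m/%d/%Y' so that the dates become consecutive days. The sample data below is hypothetical and shortened:

```python
import pandas as pd

# Hypothetical sample data, assumed here to be month/day/year
df = pd.DataFrame({'date': ['10/3/2018'] * 3 + ['10/4/2018'] * 2})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Position of each row within its date, expressed as whole hours
hours = pd.to_timedelta(df.groupby('date').cumcount(), unit='h')

# Every date restarts at the assumed start time of 6:00:00
df['time'] = pd.to_timedelta('6:00:00') + hours

print(df)
```

The counter restarts for every new date because cumcount numbers rows within each group, which is what makes the time reset at the start of each day.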
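Answer 1 (score: 1)
You can create a new column filled with datetimes, then use GroupBy.transform to get the first value of each day, subtract it from the datetimes, and finally use Series.dt.total_seconds to convert the timedeltas to seconds and then to minutes: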
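import pandas as pd

df = pd.DataFrame({
    'date': ['10/3/2018'] * 5 + ['10/4/2018'],
    'time': ['6:00:00', '7:00:00', '8:00:00', '9:00:00', '10:00:00', '6:00:00'],
    'col': [4, 8, 9, 4, 2, 3],
})

# Concatenate the date and time strings and parse them as one datetime
df['datetime'] = pd.to_datetime(df['date'] + df['time'], format='%d/%m/%Y%H:%M:%S')
# First datetime of each date, repeated for every row of that date
first = df.groupby('date')['datetime'].transform('first')
# Minutes elapsed since the first record of the day
df['new'] = df['datetime'].sub(first).dt.total_seconds().div(60).astype(int)
print(df)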
        date      time  col            datetime  new
0  10/3/2018   6:00:00    4 2018-03-10 06:00:00    0
1  10/3/2018   7:00:00    8 2018-03-10 07:00:00   60
2  10/3/2018   8:00:00    9 2018-03-10 08:00:00  120
3  10/3/2018   9:00:00    4 2018-03-10 09:00:00  180
4  10/3/2018  10:00:00    2 2018-03-10 10:00:00  240
5  10/4/2018   6:00:00    3 2018-04-10 06:00:00    0
Details:
print(first)
0 2018-03-10 06:00:00
1 2018-03-10 06:00:00
2 2018-03-10 06:00:00
3 2018-03-10 06:00:00
4 2018-03-10 06:00:00
5 2018-04-10 06:00:00
Name: datetime, dtype: datetime64[ns]
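Since the question asks for hourly steps rather than minutes, the same subtraction can be divided by 3600 instead. The following is a minimal sketch on the sample data above; the column name 'hours' is chosen here for illustration, and a space is added between date and time before parsing for readability:

```python
import pandas as pd

df = pd.DataFrame({
    'date': ['10/3/2018'] * 5 + ['10/4/2018'],
    'time': ['6:00:00', '7:00:00', '8:00:00', '9:00:00', '10:00:00', '6:00:00'],
})

# Parse date and time together (day/month/year, as in the answer above)
df['datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'],
                                format='%d/%m/%Y %H:%M:%S')

# First datetime of each date, aligned to every row of that date
first = df.groupby('date')['datetime'].transform('first')

# Whole hours elapsed since the first record of the day
df['hours'] = df['datetime'].sub(first).dt.total_seconds().div(3600).astype(int)

print(df[['date', 'time', 'hours']])
```

The astype(int) call truncates partial hours, so this only gives exact values when the records fall on whole-hour boundaries relative to the first record of the day.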
Answer 2 (score: 0)