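I have data like the sample Data_df dataframe below. I'd like to know whether there is a way to create new columns for all the time dimensions in a timestamp field (for example, the "start_timestamp" field). I want to create new columns for year, month, day, hour, and minute based on the "start_timestamp" column. I know I can write the code for each time dimension by hand, but I'd like to know whether there is a way to inspect the timestamp and create them automatically.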
Data_df:
Unnamed: 0 call_history_id calllog_id \
0 16358 1210746736 ca58d850-6fe6-4673-a049-ea4a2d8d7ecf
1 16361 1210976828 c005329b-955d-4d88-98a5-1c47e6a1cb80
2 16402 1217791595 050e9b83-54c2-4c87-abdd-32225c0d3189
3 16471 1228495414 45705ed1-a8e2-4a15-8941-5b0a40b7d409
4 27906 1245173592 04e56818-04a0-4704-ac86-31c31dac2370
call_id connection_id pbx_name pbx_id extension_number \
0 1.509170e+12 1.509170e+12 sales8x8 sales8x8 595
1 1.509170e+12 1.509170e+12 sales8x8 sales8x8 595
2 1.509170e+12 1.509170e+12 sales8x8 sales8x8 595
3 1.509170e+12 1.509170e+12 sales8x8 sales8x8 595
4 1.509170e+12 1.509170e+12 sales8x8 sales8x8 595
extension_id customer_id address name \
0 595 2.525100e+29 14086694428 Sun Basket
1 595 2.525100e+29 13214371589 PEREZ,BRYAN
2 595 2.525100e+29 14088566290 14088566290
3 595 2.525100e+29 8059316676 Dialing
4 595 2.525100e+29 12028071151 Implementation Team
start_timestamp direction call_internal call_missed duration \
0 1/8/18 19:49 I 0.0 0.0 4414.0
1 1/8/18 20:09 I 0.0 0.0 8300.0
2 1/9/18 20:31 I 0.0 0.0 14766.0
3 1/11/18 17:16 I 0.0 0.0 1686.0
4 1/15/18 22:55 I 0.0 0.0 3491.0
device_model group_call group_name group_number device_id \
0 mediaserver 0.0 N N MasterSlaveService
1 mediaserver 0.0 N N MasterSlaveService
2 mediaserver 0.0 N N MasterSlaveService
3 mediaserver 0.0 N N MasterSlaveService
4 mediaserver 0.0 N N MasterSlaveService
history_event_state created_time updated_time group_type
0 A 1/8/18 19:49 1/8/18 19:49 N
1 A 1/8/18 20:09 1/8/18 20:09 NaN
2 A 1/9/18 20:31 1/9/18 20:31 N
3 A 1/11/18 17:16 1/11/18 17:16 N
4 A 1/15/18 22:55 1/15/18 22:55 N
Update:
import pandas as pd

def ts_periods(f_nm, d_list, d_df):
    # Work on a copy so the caller's DataFrame is left unchanged
    t_df = d_df.copy()
    # Parse the column once instead of once per branch
    ts = pd.to_datetime(t_df[f_nm])
    for i in d_list:
        if i == 'year':
            t_df[f_nm + '_Year'] = ts.dt.year
        elif i == 'month':
            t_df[f_nm + '_month'] = ts.dt.month
        elif i == 'weekday':
            # weekday_name was removed in pandas 1.0; day_name() replaces it
            t_df[f_nm + '_weekday'] = ts.dt.day_name()
        elif i == 'week':
            # .week was removed in pandas 2.0; isocalendar() gives the ISO week
            t_df[f_nm + '_week'] = ts.dt.isocalendar().week
        elif i == 'hour':
            t_df[f_nm + '_hour'] = ts.dt.hour
        elif i == 'minute':
            t_df[f_nm + '_minute'] = ts.dt.minute
    return t_df
Answer 0 (score: 0)
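Here is a short example using your data and the .dt accessor. We first convert the data to pandas timestamps, then access the dimensions we need: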
import pandas as pd

data = pd.DataFrame(
    {
        'time_stamp': ['1/8/18 19:49', '1/9/18 20:31', '1/11/18 17:16']
    }
)
# The sample data is month/day/year (it contains 1/15/18), so parse it month-first
data['time_stamp'] = pd.to_datetime(data['time_stamp'], format='%m/%d/%y %H:%M')
data['day_of_week'] = data['time_stamp'].dt.weekday  # Monday is 0
data['hour_of_day'] = data['time_stamp'].dt.hour
print(data)

Gives:

           time_stamp  day_of_week  hour_of_day
0 2018-01-08 19:49:00            0           19
1 2018-01-09 20:31:00            1           20
2 2018-01-11 17:16:00            3           17
Documentation: https://pandas.pydata.org/pandas-docs/stable/basics.html#basics-dt-accessors
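
To create all of the columns in one pass rather than one line per dimension, you could loop over the names of the .dt attributes you want. The following is only a sketch: add_time_parts and DEFAULT_PARTS are hypothetical names (not part of pandas), and the format string assumes the timestamps are month/day/two-digit-year, as the 1/15/18 row in the question suggests.

```python
import pandas as pd

# Hypothetical helper, not part of pandas; names and defaults are illustrative.
DEFAULT_PARTS = ("year", "month", "day", "hour", "minute")


def add_time_parts(df, column, parts=DEFAULT_PARTS, fmt="%m/%d/%y %H:%M"):
    """Return a copy of df with one new column per requested datetime part."""
    out = df.copy()
    # fmt is an assumption based on the sample data (month/day/2-digit year).
    ts = pd.to_datetime(out[column], format=fmt)
    for part in parts:
        # Each part name is looked up as an attribute of the .dt accessor.
        out[f"{column}_{part}"] = getattr(ts.dt, part)
    return out


data_df = pd.DataFrame(
    {"start_timestamp": ["1/8/18 19:49", "1/9/18 20:31", "1/15/18 22:55"]}
)
result = add_time_parts(data_df, "start_timestamp")
print(result)
```

Any plain attribute of the .dt accessor (for example "second" or "dayofyear") can be added to parts. Accessors that are methods, such as day_name(), would need separate handling.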