I have this column in my df:
> df["time"]
0 2007-02-01 22:00:00+00:00
1 2007-02-01 22:00:00+00:00
2 2007-02-01 22:00:00+00:00
3 2007-02-01 22:00:00+00:00
4 2007-02-01 22:00:00+00:00
I want to create three new columns holding the day, month, and year, but I can't figure out how to extract each of them from the time column.
Answer 0 (score: 2)
To avoid modifying your existing time column, create a separate datetime Series with pd.to_datetime, then use the dt accessor:
# obtain datetime series:
datetimes = pd.to_datetime(df['time'])
# assign your new columns
df['day'] = datetimes.dt.day
df['month'] = datetimes.dt.month
df['year'] = datetimes.dt.year
>>> df
time day month year
0 2007-02-01 22:00:00+00:00 1 2 2007
1 2007-02-01 22:00:00+00:00 1 2 2007
2 2007-02-01 22:00:00+00:00 1 2 2007
3 2007-02-01 22:00:00+00:00 1 2 2007
4 2007-02-01 22:00:00+00:00 1 2 2007
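For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the time column holds strings in the format shown in the question; the sample DataFrame is rebuilt here purely for illustration.

```python
import pandas as pd

# Rebuild the sample data from the question (assumed to be stored as strings).
df = pd.DataFrame({"time": ["2007-02-01 22:00:00+00:00"] * 5})

# Parse into a separate datetime Series so the original column is left untouched.
datetimes = pd.to_datetime(df["time"])

# The .dt accessor exposes the date components as integer Series.
df["day"] = datetimes.dt.day
df["month"] = datetimes.dt.month
df["year"] = datetimes.dt.year

print(df)
```

If the column is already a datetime dtype, pd.to_datetime should simply pass it through, so the same lines apply.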
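An alternative is to convert the datetimes.dt.date series to strings and call str.split('-') on it. Note that the resulting columns hold zero-padded strings rather than integers: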
datetimes = pd.to_datetime(df['time'])
df[['year','month','day']] = datetimes.dt.date.astype(str).str.split('-',expand=True)
>>> df
time year month day
0 2007-02-01 22:00:00+00:00 2007 02 01
1 2007-02-01 22:00:00+00:00 2007 02 01
2 2007-02-01 22:00:00+00:00 2007 02 01
3 2007-02-01 22:00:00+00:00 2007 02 01
4 2007-02-01 22:00:00+00:00 2007 02 01
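If integers are wanted from the split-based approach, the pieces can be cast before attaching them. The sketch below is one way to do that, under the same assumption that time holds strings as in the question; the parts variable and the use of join are illustrative choices, not part of the answer above.

```python
import pandas as pd

# Sample data as in the question (assumed to be stored as strings).
df = pd.DataFrame({"time": ["2007-02-01 22:00:00+00:00"] * 5})
datetimes = pd.to_datetime(df["time"])

# Split the 'YYYY-MM-DD' strings into three columns of text.
parts = datetimes.dt.date.astype(str).str.split("-", expand=True)
parts.columns = ["year", "month", "day"]

# Cast the text pieces to integers and attach them by index.
df = df.join(parts.astype(int))
```

With the cast, month and day become plain numbers such as 2 and 1 instead of the strings '02' and '01'.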