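I have a pandas DataFrame with a timestamp column and want to create a column containing only the month. The month column should hold the string representation of the month, not the integer. I have done something like this: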
from datetime import datetime
import pandas as pd

df['Dates'] = pd.to_datetime(df['Dates'])
df['Month'] = df.Dates.dt.month
df['Month'] = df.Month.apply(lambda x: datetime.strptime(str(x), '%m').strftime('%b'))
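However, this is a brute-force approach and not very performant. Is there a more elegant way to convert the integer representation of the month into a string representation?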
Answer 0 (score: 8)
Use the vectorized dt.strftime on your datetime column:
In [43]:
import datetime as dt
import pandas as pd
df = pd.DataFrame({'dates':pd.date_range(dt.datetime(2016,1,1), dt.datetime(2017,2,1), freq='M')})
df
Out[43]:
        dates
0  2016-01-31
1  2016-02-29
2  2016-03-31
3  2016-04-30
4  2016-05-31
5  2016-06-30
6  2016-07-31
7  2016-08-31
8  2016-09-30
9  2016-10-31
10 2016-11-30
11 2016-12-31
12 2017-01-31
In [44]:
df['month'] = df['dates'].dt.strftime('%b')
df
Out[44]:
        dates month
0  2016-01-31   Jan
1  2016-02-29   Feb
2  2016-03-31   Mar
3  2016-04-30   Apr
4  2016-05-31   May
5  2016-06-30   Jun
6  2016-07-31   Jul
7  2016-08-31   Aug
8  2016-09-30   Sep
9  2016-10-31   Oct
10 2016-11-30   Nov
11 2016-12-31   Dec
12 2017-01-31   Jan
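As a quick sanity check, the sketch below runs the row-wise apply approach from the question next to the vectorized dt.strftime call and compares the results. The sample dates and the variable names (slow, fast, same) are made up for illustration. Keep in mind that '%b' follows the active locale, so the abbreviations are English only under an English or default C locale.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data: three mid-month timestamps in early 2016.
df = pd.DataFrame({"dates": pd.to_datetime(["2016-01-15", "2016-02-15", "2016-03-15"])})

# Row-wise approach from the question: integer month -> parsed date -> abbreviation.
slow = df["dates"].dt.month.apply(
    lambda m: datetime.strptime(str(m), "%m").strftime("%b")
)

# Vectorized approach: format the datetime column directly.
fast = df["dates"].dt.strftime("%b")

# Both approaches should yield the same abbreviations.
same = slow.tolist() == fast.tolist()
print(same)
```

The vectorized call also skips the intermediate integer column, so the 'Month' column only has to be assigned once.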
Answer 1 (score: 1)
For pandas 0.23.0+, you can use dt.month_name:
df['month'] = df['dates'].dt.month_name()
print(df)
        dates      month
0  2016-01-31    January
1  2016-02-29   February
2  2016-03-31      March
3  2016-04-30      April
4  2016-05-31        May
5  2016-06-30       June
6  2016-07-31       July
7  2016-08-31     August
8  2016-09-30  September
9  2016-10-31    October
10 2016-11-30   November
11 2016-12-31   December
12 2017-01-31    January
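If the three-letter abbreviations from the question are wanted rather than full names, one possible variant is to slice the result of dt.month_name. This is a minimal sketch on made-up sample dates; it relies on English month names, which is what month_name returns when no locale argument is passed, and on the fact that each English abbreviation is the first three letters of the full name.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"dates": pd.to_datetime(["2016-01-31", "2016-09-30", "2016-12-31"])})

# Full English month names (default when no locale is given).
full = df["dates"].dt.month_name()

# Vectorized string slice: keep the first three characters of each name.
abbr = full.str[:3]

print(abbr.tolist())
```

Unlike strftime('%b'), this does not depend on the locale of the running process, which can matter when the code runs on machines with different language settings.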