Question

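Most of the information I found was not specific to python > pandas > dataframe, hence this question.

I want to convert an integer between 1 and 12 into an abbreviated month name.

I have a df which looks like: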

   client Month
1  sss    02
2  yyy    12
3  www    06

I want the df to look like this:

   client Month
1  sss    Feb
2  yyy    Dec
3  www    Jun

Answer 1

You can do this efficiently by combining calendar.month_abbr with df[col].apply():

import calendar
df['Month'] = df['Month'].apply(lambda x: calendar.month_abbr[x])
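
The question's sample does not make clear whether Month holds integers or zero-padded strings such as '02'. calendar.month_abbr only accepts integer indices, so a string column needs a cast first. A minimal sketch, assuming string input and rebuilding the sample frame from the question:

```python
import calendar

import pandas as pd

# Sample data from the question, with Month assumed to be zero-padded strings.
df = pd.DataFrame({'client': ['sss', 'yyy', 'www'], 'Month': ['02', '12', '06']})

# calendar.month_abbr is indexed by integers 1..12, so cast before the lookup.
df['Month'] = df['Month'].astype(int).apply(lambda x: calendar.month_abbr[x])
print(df)
```

Note that calendar.month_abbr follows the current locale, so the names are English only under an English or C locale.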

Answer 2

Suppose we have a DF like this, and the date is already in DateTime format:

df.head(3)


            value   
date        
2016-05-19  19736   
2016-05-26  18060   
2016-05-27  19997

Then we can easily extract the month number and the month name like this:

df['month_num'] = df.index.month
df['month'] = df.index.month_name()


            value   year    month_num  month
date                
2017-01-06  37353   2017    1          January
2019-01-06  94108   2019    1          January
2019-01-05  77897   2019    1          January
2019-01-04  94514   2019    1          January
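
The question asks for abbreviated names, whereas month_name() returns full names. One possible sketch, assuming a DatetimeIndex like the one in the first sample above, uses the index's strftime method with the %b code; the frame and the month_abbr column name are made up for illustration:

```python
import pandas as pd

# Hypothetical frame mirroring the first sample: a DatetimeIndex named 'date'.
df = pd.DataFrame(
    {'value': [19736, 18060, 19997]},
    index=pd.to_datetime(['2016-05-19', '2016-05-26', '2016-05-27']).rename('date'),
)

df['month_num'] = df.index.month            # integer month, 1..12
df['month'] = df.index.month_name()         # full month name
df['month_abbr'] = df.index.strftime('%b')  # abbreviated month name
print(df)
```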

Answer 3

One way of doing this is with the apply method of the dataframe, but to do that you need a map to convert the months. You could do that with a function or dictionary, or with Python's own datetime module.

With datetime, it would look something like this:

import datetime

def mapper(month):
    date = datetime.datetime(2000, month, 1)  # any date object with the right month will do
    return date.strftime('%b')  # %b returns the month's abbreviation

df['Month'].apply(mapper)

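In a similar way, you could build your own map for custom names. It would look like this:

months_map = {1: 'Jan', 2: 'Feb'}  # integer keys; add the remaining months the same way
def mapper(month):
    return months_map[month]

Obviously, you don't need to define this function explicitly and could use a lambda directly in the apply method.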
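
As a rough illustration of the lambda variant, the sketch below fills in a full twelve-month dictionary and looks each value up inline. The sample frame assumes integer months, as the question describes:

```python
import pandas as pd

# Hypothetical sample with integer months.
df = pd.DataFrame({'client': ['sss', 'yyy', 'www'], 'Month': [2, 12, 6]})

# Custom map from month number to whatever label you want.
months_map = {
    1: 'Jan', 2: 'Feb', 3: 'Mar', 4: 'Apr', 5: 'May', 6: 'Jun',
    7: 'Jul', 8: 'Aug', 9: 'Sep', 10: 'Oct', 11: 'Nov', 12: 'Dec',
}

# The lambda replaces the named mapper function.
df['Month'] = df['Month'].apply(lambda m: months_map[m])
print(df)
```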

Answer 4

You can do this easily with a column apply.

import pandas as pd

df = pd.DataFrame({'client':['sss', 'yyy', 'www'], 'Month': ['02', '12', '06']})

look_up = {'01': 'Jan', '02': 'Feb', '03': 'Mar', '04': 'Apr', '05': 'May',
            '06': 'Jun', '07': 'Jul', '08': 'Aug', '09': 'Sep', '10': 'Oct', '11': 'Nov', '12': 'Dec'}

df['Month'] = df['Month'].apply(lambda x: look_up[x])
df

  Month client
0   Feb    sss
1   Dec    yyy
2   Jun    www

Answer 5

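Using strptime with a lambda function (note that this goes in the reverse direction, from month abbreviation to month number):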

from time import strptime
df['Month'] = df['Month'].apply(lambda x: strptime(x,'%b').tm_mon)
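
The snippet above parses a month abbreviation into a number. For the direction the question asks about, one possible sketch pairs time.strptime with time.strftime: parse the number with %m, then format with %b. It assumes Month holds strings such as '02'; the round-trip variable back is only there for illustration:

```python
from time import strftime, strptime

import pandas as pd

# Sample data from the question, with Month assumed to be strings.
df = pd.DataFrame({'client': ['sss', 'yyy', 'www'], 'Month': ['02', '12', '06']})

# Number -> abbreviation: parse with %m, then format with %b.
df['Month'] = df['Month'].apply(lambda x: strftime('%b', strptime(x, '%m')))

# Abbreviation -> number, as in the answer's original snippet.
back = df['Month'].apply(lambda x: strptime(x, '%b').tm_mon)
print(df)
print(back.tolist())
```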

Answer 6

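Since the abbreviated month names are the first three letters of their full names, we can first convert the Month column to datetime, then use dt.month_name() to get the full month name, and finally use the str.slice() method to take the first three letters, all with pandas and in a single line of code: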

df['Month'] = pd.to_datetime(df['Month'], format='%m').dt.month_name().str.slice(stop=3)

df

  Month client
0   Feb sss
1   Dec yyy
2   Jun www
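
To make the one-liner easier to follow, here is the same chain split into its three steps. This is only a sketch; the intermediate names parsed and full_names are introduced for illustration:

```python
import pandas as pd

df = pd.DataFrame({'client': ['sss', 'yyy', 'www'], 'Month': ['02', '12', '06']})

# Step 1: parse the month number; the missing year and day default to 1900 and 1.
parsed = pd.to_datetime(df['Month'], format='%m')

# Step 2: full month names, e.g. 'February'.
full_names = parsed.dt.month_name()

# Step 3: keep the first three letters.
df['Month'] = full_names.str.slice(stop=3)
print(df)
```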

Answer 7

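The calendar module is useful, but calendar.month_abbr is array-like: it cannot be used directly in a vectorised fashion. For an efficient mapping, you can construct a dictionary and then use pd.Series.map: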

import calendar
d = dict(enumerate(calendar.month_abbr))
df['Month'] = df['Month'].map(d)

Performance benchmarking shows a ~130x performance differential:

import calendar
import numpy as np
import pandas as pd

d = dict(enumerate(calendar.month_abbr))
mapper = calendar.month_abbr.__getitem__

np.random.seed(0)
n = 10**5
df = pd.DataFrame({'A': np.random.randint(1, 13, n)})

%timeit df['A'].map(d)       # 7.29 ms per loop
%timeit df['A'].map(mapper)  # 946 ms per loop
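
%timeit is an IPython magic, so the benchmark above will not run in a plain Python script. A rough equivalent using the standard timeit module might look like the sketch below. The smaller n and the repeat count are arbitrary choices to keep the run short, and the timings will differ by machine and pandas version:

```python
import calendar
import timeit

import numpy as np
import pandas as pd

d = dict(enumerate(calendar.month_abbr))
mapper = calendar.month_abbr.__getitem__

np.random.seed(0)
n = 10**4  # smaller than the answer's 10**5, to keep this quick
df = pd.DataFrame({'A': np.random.randint(1, 13, n)})

# Both approaches should produce the same Series.
by_dict = df['A'].map(d)
by_func = df['A'].map(mapper)

# Average seconds per call over a few repetitions.
repeats = 3
t_dict = timeit.timeit(lambda: df['A'].map(d), number=repeats) / repeats
t_func = timeit.timeit(lambda: df['A'].map(mapper), number=repeats) / repeats
print(f'dict map:     {t_dict * 1000:.2f} ms per loop')
print(f'function map: {t_func * 1000:.2f} ms per loop')
```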

Answer 8

Having tested all of these on a large dataset, I have found the following to be the fastest:

import calendar
def month_mapping():
    # I'm lazy so I have a stash of functions already written so
    # I don't have to write them out every time. This returns the
    # {1:'Jan'....12:'Dec'} dict in the laziest way...
    abbrevs = {}
    for month in range(1, 13):
        abbrevs[month] = calendar.month_abbr[month]
    return abbrevs

abbrevs = month_mapping()

df['Month Abbrev'] = df['Date Col'].dt.month.map(abbrevs)
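
A self-contained sketch of this approach, assuming 'Date Col' is already a datetime column. The sample dates are made up, and the dictionary is built with a comprehension instead of the helper function:

```python
import calendar

import pandas as pd

# Hypothetical frame with a datetime column named as in the answer.
df = pd.DataFrame(
    {'Date Col': pd.to_datetime(['2019-02-14', '2019-12-25', '2019-06-01'])}
)

# {1: 'Jan', ..., 12: 'Dec'} under an English or C locale.
abbrevs = {month: calendar.month_abbr[month] for month in range(1, 13)}

# Extract the month number, then map it through the dictionary.
df['Month Abbrev'] = df['Date Col'].dt.month.map(abbrevs)
print(df)
```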

Answer 9

def mapper(month):
    # Assumes each value is a date/datetime object; %b gives the abbreviated month name
    return month.strftime('%b')

df['Month'] = df['Month'].apply(mapper)

Reference:

http://strftime.org/

Answer 10

You can use the pandas month_name() function. For more details, see the pandas documentation. Example:

>>> idx = pd.date_range(start='2018-01', freq='M', periods=3)
>>> idx
DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31'],
              dtype='datetime64[ns]', freq='M')
>>> idx.month_name()
Index(['January', 'February', 'March'], dtype='object')

python / pandas: convert month int to month name

10 answers: