
Question

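The problem is described here for a specific case, but the answer should be useful for many similar projects.

A pandas.Series named month contains the month of each sample as an int (1, 2, 3, 4, ...). I want to convert it to the zero-padded form "01, 02, 03, ..., 12" and then combine it with the year.

Using "{0:0=2d}".format(a) in a loop, the series values can be converted easily: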

df['date'] = ''
for i in range(len(df)):
    # zero-pad the month to two digits and prefix it with the year
    df.loc[df.index[i], 'date'] = str(df.year.iloc[i]) + "-" + "{0:0=2d}".format(df.month.iloc[i])
# df.date is a new Series containing year-month strings ('2017-01', '2017-02')

But the loop approach is inefficient. Is there a simpler way to achieve the same result?

Answer 1

You can use apply:

month.apply("{0:0=2d}".format)

Timings

Psidom's method:

%timeit month.astype(str).str.zfill(2)

10 loops, best of 3: 39.1 ms per loop

This method:

%timeit month.apply("{0:0=2d}".format)

100 loops, best of 3: 7.93 ms per loop
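The figures above depend on the machine, the pandas version, and the size of the Series, none of which are given. The following is a minimal sketch for repeating the comparison outside IPython with the standard-library timeit module. The 100,000-row test Series, the helper names with_zfill and with_apply, and the repeat count are assumptions made for illustration, not the original benchmark setup.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 100,000 month numbers in 1..12 (size is an assumption).
rng = np.random.default_rng(0)
month = pd.Series(rng.integers(1, 13, size=100_000))


def with_zfill(s):
    """Psidom's method: cast to str, then left-pad with zeros to width 2."""
    return s.astype(str).str.zfill(2)


def with_apply(s):
    """Answer 1's method: format each value with a bound str.format."""
    return s.apply("{0:0=2d}".format)


# Both approaches should produce the same strings.
assert with_zfill(month).tolist() == with_apply(month).tolist()

runs = 10  # arbitrary repeat count
for name, func in [("zfill", with_zfill), ("apply", with_apply)]:
    seconds = timeit.timeit(lambda: func(month), number=runs)
    print(f"{name}: {seconds / runs * 1000:.2f} ms per call")
```

The printed numbers will differ from those quoted above; only the relative ordering on your own data is meaningful.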

The call used for this method:

month.apply("{0:0=2d}".format)
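To connect this back to the question, here is a small self-contained sketch that pads the month with apply and then prepends the year. The sample DataFrame and the column name date are assumptions chosen to mirror the question, and the year is assumed to be stored as an integer.

```python
import pandas as pd

# Hypothetical sample data with integer year and month columns.
df = pd.DataFrame({"year": [2017, 2017, 2018], "month": [1, 2, 12]})

# Zero-pad each month to two digits; "{0:0=2d}".format is a bound method
# that apply calls once per value.
padded_month = df["month"].apply("{0:0=2d}".format)

# Prepend the year (cast to str so that + concatenates strings).
df["date"] = df["year"].astype(str) + "-" + padded_month

print(df["date"].tolist())  # ['2017-01', '2017-02', '2018-12']
```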

Answer 2

You can convert month to str type and then use str.zfill:

import pandas as pd

month = pd.Series([1, 2, 12])

month.astype(str).str.zfill(2)

#0    01
#1    02
#2    12
#dtype: object

To concatenate it with the year:

df.year.astype(str) + '-' + df.month.astype(str).str.zfill(2)
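The one-liner above can be wrapped in a small helper so it is reusable across frames. This is only a sketch: the function name add_year_month, its parameters, and the sample data are hypothetical, and it assumes both columns hold integers with no missing values.

```python
import pandas as pd


def add_year_month(frame, year_col="year", month_col="month", out_col="year_month"):
    """Return a copy of frame with a 'YYYY-MM' string column (hypothetical helper)."""
    year = frame[year_col].astype(str)
    # str.zfill(2) left-pads with zeros: '3' -> '03', '12' stays '12'.
    month = frame[month_col].astype(str).str.zfill(2)
    return frame.assign(**{out_col: year + "-" + month})


# Hypothetical sample data.
df = pd.DataFrame({"year": [2011, 2012], "month": [3, 12]})
result = add_year_month(df)
print(result["year_month"].tolist())  # ['2011-03', '2012-12']
```

Because the whole column is processed with vectorized string methods, no explicit Python loop over rows is needed.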

Answer 3

You can use pd.to_datetime on a dataframe with appropriately named columns to create a series of datetime objects.

Consider the dataframe df

df = pd.DataFrame(dict(year=[2011, 2012], month=[3, 4]))
df

   month  year
0      3  2011
1      4  2012

All we are missing is the day column. If we add it, we can pass the dataframe to pd.to_datetime

pd.to_datetime(df.assign(day=1))

0   2011-03-01
1   2012-04-01
dtype: datetime64[ns]

Well, that was handy. Now what?

pd.to_datetime(df.assign(day=1)).apply('{:%Y-%m}'.format)

0    2011-03
1    2012-04
dtype: object

Or

pd.to_datetime(df.assign(day=1)).dt.strftime('%Y-%m')

0    2011-03
1    2012-04
dtype: object
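If the year-month values will later be sorted or used in date arithmetic, a different technique from the one in this answer may be worth considering: converting the datetimes to monthly periods with dt.to_period('M') instead of formatting them as strings. The sketch below is an illustration of that alternative on the same sample frame, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(dict(year=[2011, 2012], month=[3, 4]))

# Build real datetimes first (day=1 is a placeholder, as in the answer).
dates = pd.to_datetime(df.assign(day=1))

# Alternative to strftime: keep the values as monthly Period objects.
periods = dates.dt.to_period("M")

# Casting the periods to str yields the same 'YYYY-MM' text as strftime.
as_text = periods.astype(str)
print(as_text.tolist())  # ['2011-03', '2012-04']
```

The period column keeps its calendar meaning, so comparisons and sorting are chronological rather than lexical, while the string form is still one cast away.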

Making a new column

df.assign(year_month=pd.to_datetime(df.assign(day=1)).dt.strftime('%Y-%m'))

   month  year year_month
0      3  2011    2011-03
1      4  2012    2012-04

However, we could have simply done

df.assign(year_month=df.apply(lambda x: '{year}-{month:02d}'.format(**x), 1))

   month  year year_month
0      3  2011    2011-03
1      4  2012    2012-04
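Row-wise df.apply builds a Series object for every row, which can be slow on large frames. As a hedged alternative that is not from the original answer, the same format string can be applied in a plain list comprehension over the two columns. The sample data matches the frame used above.

```python
import pandas as pd

df = pd.DataFrame(dict(year=[2011, 2012], month=[3, 4]))

# Iterate over the two columns in lockstep and format each pair.
# ':02d' zero-pads the month to two digits.
year_month = [f"{y}-{m:02d}" for y, m in zip(df["year"], df["month"])]

result = df.assign(year_month=year_month)
print(result["year_month"].tolist())  # ['2011-03', '2012-04']
```

This yields the same strings as the df.apply version while avoiding the per-row overhead; whether the difference matters depends on the size of the data.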

Applying a number format function to a pandas.Series

3 answers