I have a df like this:
Sr. lwd_month lwd_year
1 3 2015
2 6 2018
3. 9 2017
4. NaN NaN
5. 5 2015
How can I combine these two columns to get a data frame like the one below?
Sr. lwd_month lwd_year MonthYear
1 3 2015 03-2015
2 6 2018 06-2018
3. 9 2017 09-2017
4. NaN NaN NaT
5. 5 2015 05-2015
6. 3 NaN NaT
Thanks
Answer 0 (score: 2)
Why not just:
df['MonthYear'] = pd.to_datetime(df[['Year', 'Month']].assign(Day=1)).dt.strftime('%m-%Y')
print(df)
Output:
Sr. Month Year MonthYear
0 1.0 3.0 2015.0 03-2015
1 2.0 6.0 2018.0 06-2018
2 3.0 9.0 2017.0 09-2017
3 4.0 NaN NaN NaT
4 5.0 5.0 2015.0 05-2015
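Applied to the column names from the question, the same idea could be sketched as below. This is only an illustration under some assumptions: the sample frame is rebuilt by hand from the question (including the row with a month but no year), and the `parts` and `dates` names plus the `errors="coerce"` argument are additions for the example, not part of the one-liner above.

```python
import numpy as np
import pandas as pd

# Sample data modelled on the question; row 6 has a month but no year.
df = pd.DataFrame({
    "Sr.": [1, 2, 3, 4, 5, 6],
    "lwd_month": [3, 6, 9, np.nan, 5, 3],
    "lwd_year": [2015, 2018, 2017, np.nan, 2015, np.nan],
})

# to_datetime assembles dates from columns named year/month/day,
# so rename a copy, keep only those columns, and fix the day at 1.
parts = (
    df.rename(columns={"lwd_year": "year", "lwd_month": "month"})
      [["year", "month"]]
      .assign(day=1)
)

# Rows with a missing year or month become NaT instead of raising.
dates = pd.to_datetime(parts, errors="coerce")

# Format as MM-YYYY; rows that were NaT stay missing.
df["MonthYear"] = dates.dt.strftime("%m-%Y")
print(df)
```

Selecting only the year, month and day columns matters here, because to_datetime rejects a frame that carries extra columns such as Sr.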
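Answer 1 (score: 1)
First, the columns need to be named with lowercase year and month, and pandas version 0.18.1+ is required.
Then use to_datetime to convert from multiple columns, and strftime to convert the result to strings: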
df['MonthYear']=pd.to_datetime(df.assign(day=1)[['year','month','day']]).dt.strftime('%m-%Y')
print (df)
Sr. month year MonthYear
0 1.0 3.0 2015.0 03-2015
1 2.0 6.0 2018.0 06-2018
2 3.0 9.0 2017.0 09-2017
3 4.0 NaN NaN NaT
4 5.0 5.0 2015.0 05-2015
print (type(df.loc[0, 'MonthYear']))
<class 'str'>
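Since the result is a plain string anyway, a different route is to skip the datetime step and build the label from the numbers directly. The sketch below is an alternative technique, not what the answer does: it assumes the month and year columns are floats holding whole numbers or NaN, and it relies on the nullable Int64 and string dtypes so that missing values propagate instead of turning into the text "nan".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "month": [3, 6, 9, np.nan, 5, 3],
    "year": [2015, 2018, 2017, np.nan, 2015, np.nan],
})

# Nullable integers drop the trailing ".0" while keeping missing values.
month = df["month"].astype("Int64").astype("string").str.zfill(2)
year = df["year"].astype("Int64").astype("string")

# String concatenation yields <NA> whenever either part is missing.
df["MonthYear"] = month + "-" + year
print(df)
```

Unlike the to_datetime route, this does not check that the month is between 1 and 12, so it only suits data that is already known to be clean.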
Similarly, for monthly periods use to_period:
df['MonthYear'] = pd.to_datetime(df.assign(day=1)[['year','month','day']]).dt.to_period('M')
print (df)
Sr. month year MonthYear
0 1.0 3.0 2015.0 2015-03
1 2.0 6.0 2018.0 2018-06
2 3.0 9.0 2017.0 2017-09
3 4.0 NaN NaN NaT
4 5.0 5.0 2015.0 2015-05
print (type(df.loc[0, 'MonthYear']))
<class 'pandas._libs.tslibs.period.Period'>
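One reason to prefer periods over strings is that they still behave like dates. A small sketch of that, assuming a frame with lowercase year and month columns as above; the NextMonth and Label column names are made up for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "year": [2015, 2018, 2017, np.nan, 2015],
    "month": [3, 6, 9, np.nan, 5],
})

dates = pd.to_datetime(df[["year", "month"]].assign(day=1))
df["MonthYear"] = dates.dt.to_period("M")

# Period arithmetic works in units of the frequency, here one month.
df["NextMonth"] = df["MonthYear"] + 1

# Format as MM-YYYY only when a display string is needed.
df["Label"] = df["MonthYear"].dt.strftime("%m-%Y")

# Sorting is chronological; missing periods go last by default.
print(df.sort_values("MonthYear"))
```

Keeping the period column and formatting late avoids the lexical sorting problem that MM-YYYY strings have, where 03-2015 sorts before 06-2014.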