In pandas, I am trying to modify the column Year in a data frame by looking at the column Age, which contains dates such as Mon Dec 28 11:19:42 CST 2007.
ID Age Year
1 Mon Dec 28 11:19:42 CST 2007 NaN
2 Tue Sep 28 12:39:41 CST 2008 NaN
I tried to do this with
df.loc[df[df.Age.str.contains("2007")], 'Year'] = 2007
but this raises the error
ValueError: cannot copy sequence with size 20 to array axis with dimension 11359
Expected result:
ID Age Year
1 Mon Dec 28 11:19:42 CST 2007 2007
2 Tue Sep 28 12:39:41 CST 2008 NaN
df[df['Age'].str.contains("2007")]['Year'] = 2007
does not work either. Can anyone help me do this correctly?
Thanks in advance!
Answer 0 (score: 1)
You can use str.endswith with loc:
df.loc[df.Age.str.endswith("2007"), 'Year'] = 2007
print (df)
ID Age Year
0 1 Mon Dec 28 11:19:42 CST 2007 2007.0
1 2 Tue Sep 28 12:39:41 CST 2008 NaN
Or with str.contains:
df.loc[df.Age.str.contains("2007"), 'Year'] = 2007
print (df)
ID Age Year
0 1 Mon Dec 28 11:19:42 CST 2007 2007.0
1 2 Tue Sep 28 12:39:41 CST 2008 NaN
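The two attempts in the question fail for different reasons. The first passes df[...], a filtered DataFrame, to loc where a boolean Series is expected. The second uses chained indexing, so the assignment lands on a temporary copy rather than on df. The following is a minimal, self-contained sketch of the working form; the two-row frame is rebuilt from the sample in the question, and the name is_2007 is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the real frame has many more rows.
df = pd.DataFrame({
    "ID": [1, 2],
    "Age": ["Mon Dec 28 11:19:42 CST 2007", "Tue Sep 28 12:39:41 CST 2008"],
    "Year": [np.nan, np.nan],
})

# A boolean Series with one True/False value per row (illustrative name).
is_2007 = df["Age"].str.contains("2007")

# Pass the mask itself to loc, not df[mask], and select the target column
# in the same call so the assignment is written back to df.
df.loc[is_2007, "Year"] = 2007

print(df)
```

Note that str.contains matches "2007" anywhere in the string, while str.endswith only matches it at the end, which is the safer choice when the year is always the last token.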
Another possible solution with mask:
df.Year = df.Year.mask(df.Age.str.endswith("2007"), 2007)
print (df)
ID Age Year
0 1 Mon Dec 28 11:19:42 CST 2007 2007.0
1 2 Tue Sep 28 12:39:41 CST 2008 NaN
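If the Age column may contain missing values (an assumption, the sample in the question shows none), the string methods return a missing value for those rows and the result is no longer a clean boolean mask. A hedged sketch of one way to handle this is to pass na=False so that such rows count as non-matching; the third row and the name ends_2007 are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question plus one hypothetical row with a missing Age.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Age": [
        "Mon Dec 28 11:19:42 CST 2007",
        "Tue Sep 28 12:39:41 CST 2008",
        None,
    ],
    "Year": [np.nan, np.nan, np.nan],
})

# na=False treats rows with a missing Age as "no match".
ends_2007 = df["Age"].str.endswith("2007", na=False)

# mask replaces values where the condition is True and keeps the rest.
df["Year"] = df["Year"].mask(ends_2007, 2007)

print(df)
```

The same na=False argument is also accepted by str.contains, so it can be combined with the loc-based variants as well.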