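This is a follow-up to the question here:
How can I modify a DataFrame using a function? Suppose I want to call `.upper()` on the values in column `a`.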
import pandas as pd

df = pd.DataFrame({'a':['london','newyork','berlin'],
                   'b':['uk','usa','germany'],
                   'c':[7,8,9]})
df1 = df[['a', 'b']]

# Return the value of column a for each row
def doSomething(x):
    return x.a

print (df1.apply(doSomething, axis=1))
0 london
1 newyork
2 berlin
dtype: object
Calling `.upper()` on the values in `a` should return:
0 LONDON
1 NEWYORK
2 BERLIN
dtype: object
Answer 0 (score: 6)
You can call the function on column `a` directly:
def doSomething(x):
    return x.upper()

print (df1.a.apply(doSomething))
0 LONDON
1 NEWYORK
2 BERLIN
Name: a, dtype: object
print (df1.a.apply(lambda x: x.upper()))
0 LONDON
1 NEWYORK
2 BERLIN
Name: a, dtype: object
Applying the function row by row also works:
def doSomething(x):
    return x.a.upper()

print (df1.apply(doSomething, axis=1))
0 LONDON
1 NEWYORK
2 BERLIN
dtype: object
But it is better to use `str.upper`, which also handles `NaN` values correctly:
print (df1.a.str.upper())
0 LONDON
1 NEWYORK
2 BERLIN
Name: a, dtype: object
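To illustrate the `NaN` point, here is a minimal sketch. The Series `s` with a missing value is made up for this example and is not part of the question's data. It suggests that `str.upper` passes missing values through, while calling `.upper()` on each element via `apply` fails, because `NaN` is a float and has no `upper` method.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the cities from the question, with one value missing.
s = pd.Series(['london', np.nan, 'berlin'], name='a')

# The vectorized string method leaves the missing value as NaN.
upper = s.str.upper()
print(upper)

# Calling the Python string method per element fails on the NaN entry.
try:
    s.apply(lambda x: x.upper())
    apply_failed = False
except AttributeError:
    apply_failed = True
print(apply_failed)
```

If you have to use `apply` on a column that may contain missing values, the function needs its own check for them (for example with `pd.isna`).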
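If you need to store the result as a column (note that `df` already has a column `c`, so this assignment overwrites it):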
df['c'] = df1.a.str.upper()
print (df)
         a        b        c
0   london       uk   LONDON
1  newyork      usa  NEWYORK
2   berlin  germany   BERLIN
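If you would rather keep the existing columns untouched, one possible variant is `DataFrame.assign`, which returns a new DataFrame with the extra column. This is only a sketch: the column name `a_upper` and the variable `result` are chosen for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({'a': ['london', 'newyork', 'berlin'],
                   'b': ['uk', 'usa', 'germany'],
                   'c': [7, 8, 9]})

# assign returns a copy with the new column; df itself is not modified.
# 'a_upper' is a hypothetical column name used only for this example.
result = df.assign(a_upper=df['a'].str.upper())
print(result)
```

Because `assign` works on a copy, the original numeric column `c` in `df` stays as it was.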