What is the most idiomatic way to generate a Pandas Series using an if-then-else statement (or similar)?
I have a messy set of data, structured as follows:
import pandas as pd

df = pd.DataFrame({
    "label": ["a", "b", "a", "b", "a", "b"],
    "name": ["normal", "normal", "normal", "special", "normal", "special"],
    "value": [1, 2, 3, 4, 5, 6]
})
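I am trying to create a new label by looking up the value of label in a dictionary, but if the value of name is "special" I want to return a special new label instead.
I was able to do this using df.apply: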
mapping = {"a": "apple", "b": "banana"}
df["new_label"] = df.apply(
    lambda x: "pear" if x['name'] == "special" else mapping[x['label']],
    axis=1
)
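However, apply is already slowing my program down when run on ~60k rows of data, and I expect there to be more. Is there a more idiomatic and vectorized way to do this kind of operation?
Answer 0 (score: 3)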
import numpy as np

df["new_label"] = np.where(df['name'] == "special", 'pear', df['label'].map(mapping))
print(df)
  label     name  value new_label
0     a   normal      1     apple
1     b   normal      2    banana
2     a   normal      3     apple
3     b  special      4      pear
4     a   normal      5     apple
5     b  special      6      pear
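
For anyone who wants to check the result before replacing apply, here is a minimal self-contained sketch that rebuilds the question's sample data and compares three ways of producing the column. The variable names (is_special, mapped, via_numpy, via_mask, via_apply) are my own and not from the original post, and the Series.mask variant is an alternative I am adding, not something the answer above proposed.

```python
import numpy as np
import pandas as pd

# Sample data and mapping copied from the question.
df = pd.DataFrame({
    "label": ["a", "b", "a", "b", "a", "b"],
    "name": ["normal", "normal", "normal", "special", "normal", "special"],
    "value": [1, 2, 3, 4, 5, 6],
})
mapping = {"a": "apple", "b": "banana"}

# Boolean mask: True where the row should get the special label.
is_special = df["name"] == "special"

# Vectorized dictionary lookup. Labels missing from `mapping` become NaN here,
# whereas mapping[...] inside apply would raise a KeyError.
mapped = df["label"].map(mapping)

# np.where version, as in the answer above.
via_numpy = pd.Series(np.where(is_special, "pear", mapped), index=df.index)

# pandas-only variant (assumption: added for comparison): overwrite the
# masked positions with "pear" and keep the mapped value elsewhere.
via_mask = mapped.mask(is_special, "pear")

# Row-wise result from the question, used only as a reference check.
via_apply = df.apply(
    lambda row: "pear" if row["name"] == "special" else mapping[row["label"]],
    axis=1,
)

assert via_numpy.tolist() == via_apply.tolist() == via_mask.tolist()
df["new_label"] = via_mask
```

One behavioral difference to keep in mind: both vectorized versions compute the dictionary lookup for every row, including the "special" ones, so a label that is absent from mapping gives NaN on a normal row instead of an error.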