I built a pandas DataFrame from the Iris dataset and want to add an extra column called SpecieID, so that Iris-setosa gets ID 0, Iris-versicolor gets 1, and Iris-virginica gets 2.
I tried this code:
def create_specie_id():
    if iris["Species"] == "Iris-setosa":
        ID = 0
    elif iris["Species"] == "Iris-versicolor":
        ID = 1
    elif iris["Species"] == "Iris-virginica":
        ID = 2
    return ID

iris = iris.assign(SpecieID = lambda x: create_specie_id())
print (iris)
But I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-58-2abd69ffef4b> in <module>()
10 return ID
11
---> 12 iris = iris.assign(SpecieID = lambda x: create_specie_id())
13
14 print (iris)
C:\Users\masc\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in assign(self, **kwargs)
2495 results = {}
2496 for k, v in kwargs.items():
-> 2497 results[k] = com._apply_if_callable(v, data)
2498
2499 # ... and then assign
C:\Users\masc\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\common.py in _apply_if_callable(maybe_callable, obj, **kwargs)
439 """
440 if callable(maybe_callable):
--> 441 return maybe_callable(obj, **kwargs)
442 return maybe_callable
443
<ipython-input-58-2abd69ffef4b> in <lambda>(x)
10 return ID
11
---> 12 iris = iris.assign(SpecieID = lambda x: create_specie_id())
13
14 print (iris)
<ipython-input-58-2abd69ffef4b> in create_specie_id()
2
3 def create_specie_id():
----> 4 if iris["Species"] == "Iris-setosa":
5 ID = 0
6 elif iris["Species"] == "Iris-versicolor":
C:\Users\masc\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
953 raise ValueError("The truth value of a {0} is ambiguous. "
954 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 955 .format(self.__class__.__name__))
956
957 __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I create the column containing SpecieID?
Answer 0 (score: 1)
You can use numpy.select:
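import numpy as np
import pandas as pd

iris = pd.DataFrame({'Species': ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica', 'another']})
# One boolean mask per species
m1 = iris["Species"] == "Iris-setosa"
m2 = iris["Species"] == "Iris-versicolor"
m3 = iris["Species"] == "Iris-virginica"
# Pick 0, 1 or 2 where the matching mask is True, -1 otherwise
iris['ID'] = np.select([m1, m2, m3], [0, 1, 2], default=-1)
print(iris)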
           Species  ID
0      Iris-setosa   0
1  Iris-versicolor   1
2   Iris-virginica   2
3          another  -1
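The masks m1, m2 and m3 above are boolean Series with one value per row, and that is also why the `if` in the question fails: Python needs a single True or False there. A minimal sketch of this, assuming the same small sample DataFrame as above (the names `mask` and `error_message` are only for illustration):

```python
import pandas as pd

# Sample data standing in for the real Iris DataFrame (assumption)
iris = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "another"]}
)

# Comparing a column to a string gives one boolean per row, not a single bool
mask = iris["Species"] == "Iris-setosa"
print(mask.tolist())

# Putting that Series into an `if` raises the ValueError from the question
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

numpy.select avoids the problem because it evaluates the conditions element by element instead of asking for one overall truth value.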
Another solution is to use map with a dict. Values that are not matched become NaN, so fillna and astype are added:
d = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}
iris['ID'] = iris['Species'].map(d).fillna(-1).astype(int)
print(iris)
           Species  ID
0      Iris-setosa   0
1  Iris-versicolor   1
2   Iris-virginica   2
3          another  -1
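To see why fillna and astype are needed in the map approach, it can help to look at the intermediate result. The sketch below assumes the same sample DataFrame and dict as above; the names `mapped` and `ids` are only for illustration:

```python
import pandas as pd

# Sample data standing in for the real Iris DataFrame (assumption)
iris = pd.DataFrame(
    {"Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "another"]}
)
d = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

# Keys missing from the dict map to NaN, which forces a float column
mapped = iris["Species"].map(d)
print(mapped.isna().tolist())
print(mapped.dtype)

# Replace NaN with a sentinel, then convert back to integers
ids = mapped.fillna(-1).astype(int)
print(ids.tolist())
```

If every species in the column is guaranteed to appear in the dict, the fillna step is not needed, although keeping it makes unexpected labels easy to spot as -1.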