I have a dataset of pet names and their owners:
Pets Owners
dog James
dog Katelyn
rat Shelly
cat Bob
I want to be able to search the "Owners" column for the name Katelyn and then print the pet names for that owner as a vector. So far I have this:
def pet_name():
    owner = input("What is the Owner name? ")
    # check to see if owner exists in pets dataset
    # if ownerID exists then print corresponding pet names
    if owner in pets['Owners']:
        print( pets[['Pets','Owners']][pets.Owners == owner])
    # if ownerID doesn't exist
    elif not age:
        print("Sorry, this Owner doesn't exist. Try again! ")
    # if no ownerID has been entered at all
    else:
        print("You didn't enter any Owner. Try again! ")
When I enter a name to search for, it automatically goes to the else part of the code. How can I fix this? Should I use iterrows()?
Answer 0 (score: 0)
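The check owner in pets['Owners'] uses the Series in a dictionary-like context, so it tests whether owner is in the index of pets['Owners'], not among its values. Instead, check whether owner in pets['Owners'].values.
That said, I would rather see pet_name written like this: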
def pet_name():
    owner = input("What is the Owner name? ")
    # build a boolean mask marking the rows that belong to this owner
    mask = pets['Owners'] == owner
    if mask.any():
        # the owner exists: print the corresponding pet names
        print(pets.loc[mask, ['Pets', 'Owners']])
    # a name was entered, but it is not in the dataset
    # (the original tested an undefined name, age, here)
    elif owner:
        print("Sorry, this Owner doesn't exist. Try again! ")
    # nothing was entered at all
    else:
        print("You didn't enter any Owner. Try again! ")
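For reference, here is a self-contained sketch of the same idea that runs without pets.csv. The DataFrame is rebuilt inline from the question's data, and find_pets is a hypothetical helper name (not part of the original code) that takes the owner as an argument instead of calling input(), which makes it easier to test:

```python
import pandas as pd

# Data copied from the question; built inline so the example is self-contained.
pets = pd.DataFrame({
    "Pets": ["dog", "dog", "rat", "cat"],
    "Owners": ["James", "Katelyn", "Shelly", "Bob"],
})


def find_pets(frame, owner):
    """Return the rows of `frame` whose Owners column equals `owner`.

    `find_pets` is a hypothetical helper, used here only for illustration.
    """
    mask = frame["Owners"] == owner  # boolean Series, one entry per row
    return frame.loc[mask, ["Pets", "Owners"]]


result = find_pets(pets, "Katelyn")
if result.empty:
    print("Sorry, this Owner doesn't exist. Try again!")
else:
    print(result)
```

Returning the matching rows and letting the caller decide what to print keeps the lookup separate from the user interaction.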
Answer 1 (score: 0)
First, let's see where the problem is, and then find a way to solve it.
In [1]: import pandas as pd
In [2]: pets = pd.read_csv('pets.csv')
In [3]: pets
Out[3]:
Pets Owners
0 dog James
1 dog Katelyn
2 rat Shelly
3 cat Bob
In [4]: type(pets["Owners"])
Out[4]: pandas.core.series.Series
We can see that pets["Owners"] is a pandas.Series object. Now, the problem is clearly in the following line of code:
if owner in pets['Owners']:
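The in operator does work with a pandas.Series, but it tests membership against the index labels of the Series, not against its values. Because pets["Owners"] has the default integer index here, a name is never found among the labels, so, as you noticed yourself, the check returns False (assuming owner holds "Katelyn"):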
In [5]: owner in pets["Owners"]
Out[5]: False
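A small sketch of what in actually tests on a Series may help here. The data is rebuilt inline from the question rather than read from pets.csv, and the variable names are only illustrative:

```python
import pandas as pd

owners = pd.Series(["James", "Katelyn", "Shelly", "Bob"], name="Owners")

# `in` on a Series looks at the index labels (0, 1, 2, 3 here), not the values.
label_hit = 1 in owners            # 1 is an index label
value_miss = "Katelyn" in owners   # "Katelyn" is a value, not a label

# With the owner names as the index, the same test succeeds.
by_owner = pd.Series(["dog", "dog", "rat", "cat"], index=owners.to_numpy())
label_hit_by_name = "Katelyn" in by_owner

print(label_hit, value_miss, label_hit_by_name)
```

The first and third checks find a label, while the second one fails because the name only appears among the values.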
Now, if you want to keep using pets["Owners"], you can do this (as @piRSquared suggested):
In [6]: owner in pets["Owners"].values
Out[6]: True
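However, if we look at the documentation for pandas.Series.values:
Warning: We recommend using Series.array or Series.to_numpy(), depending on whether you need a reference to the underlying data or a NumPy array.
So we can do this instead: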
In [7]: owner in pets["Owners"].array
Out[7]: True
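The membership checks discussed so far can be compared side by side. This is a sketch with the question's data rebuilt inline; Series.isin and a plain equality comparison are added here as alternatives that stay inside pandas, and they are an addition rather than something taken from the answers:

```python
import pandas as pd

owners = pd.Series(["James", "Katelyn", "Shelly", "Bob"], name="Owners")
owner = "Katelyn"

checks = {
    "values": bool(owner in owners.values),      # membership in the NumPy array
    "array": bool(owner in owners.array),        # membership in the backing array
    "to_numpy": bool(owner in owners.to_numpy()),
    "isin": bool(owners.isin([owner]).any()),    # vectorised, stays in pandas
    "eq": bool((owners == owner).any()),         # elementwise comparison, then any()
}
print(checks)
```

All five entries test the values rather than the index, so they agree with each other for this data.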
There is an even better way. You want to find the pets of a given owner, right? If so, you can do this:
In [8]: pet = pets.loc[pets["Owners"] == owner, "Pets"]
   ...: if pet.any():
   ...:     print(pet)
   ...: else:
   ...:     print("Sorry, this Owner doesn't exist. Try again! ")
1    dog
Name: Pets, dtype: object
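As you can see, this prints a pandas.Series object. You mentioned a "vector/list/array" format. It is not entirely clear, but I assume the situation is that an owner can have more than one pet, and that you want to check whether the owner has any pets and then print all of them in a list-like form. If so, you can use pet.array. For example, if we modify your dataset so that Katelyn has more than one pet: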
Pets Owners
dog James
dog Katelyn
rat Katelyn <-----
rat Shelly
cat Bob
Then we can see that it gives us a list-like array:
In [9]: if pet.any():
   ...:     print(pet.array)
   ...: else:
   ...:     print("Sorry, this Owner doesn't exist. Try again! ")
<PandasArray>
['dog', 'rat']
Length: 2, dtype: object
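If a plain Python list is what you are after, Series.tolist() converts the selection directly. The sketch below rebuilds the modified dataset inline; pets_of is a hypothetical helper name introduced only for this example:

```python
import pandas as pd

# Modified dataset from above, in which Katelyn owns two pets.
pets = pd.DataFrame({
    "Pets": ["dog", "dog", "rat", "rat", "cat"],
    "Owners": ["James", "Katelyn", "Katelyn", "Shelly", "Bob"],
})


def pets_of(frame, owner):
    """Return the owner's pets as a plain Python list (empty if there are none)."""
    return frame.loc[frame["Owners"] == owner, "Pets"].tolist()


print(pets_of(pets, "Katelyn"))  # ['dog', 'rat']
print(pets_of(pets, "Nobody"))   # []
```

An empty list is falsy, so the caller can still write if pets_of(pets, owner): to decide which message to show.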