Suppose I have a DataFrame like this:
In [42]: df
Out[42]:
regiment company name preTestScore postTestScore
0 Nighthawks 1st Miller 4 25
1 Nighthawks 1st Jacobson 24 94
2 Nighthawks 2nd Ali 31 57
3 Nighthawks 2nd Milner 2 62
4 Dragoons 1st Cooze 3 70
5 Dragoons 1st Jacon 4 25
6 Dragoons 2nd Ryaner 24 94
7 Dragoons 2nd Sone 31 57
8 Scouts 1st Sloan 2 62
9 Scouts 1st Piger 3 70
10 Scouts 2nd Riani 2 62
11 Scouts 2nd Ali 3 70
Here is what I tried. I built a list of tuples like this:
In [48]: s = [('Nighthawks', '1st', 'Miller'), ('Scouts', '2nd', 'Ali')]
When I run
In [40]: df.loc[s]
I get a KeyError.
I was just experimenting and got stuck here. Why can't I extract rows based on the information in the tuples?
Answer 0 (score: 1)
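The KeyError occurs because loc expects index labels as its first argument. You passed whole records (tuples of column values), which are not labels in this DataFrame's integer index, so the lookup fails.
This works: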
print(df.loc[:4])
regiment company name preTestScore postTestScore
0 Nighthawks 1st Miller 4 25
1 Nighthawks 1st Jacobson 24 94
2 Nighthawks 2nd Ali 31 57
3 Nighthawks 2nd Milner 2 62
4 Dragoons 1st Cooze 3 70
This does not:
print(df.loc[s[:4]])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-624-7f654aad4cfd> in <module>()
----> 1 df.loc[s[:4]]
Note that if you are trying to retrieve rows by integer position, you are better off using df.iloc.
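To illustrate the point about labels, here is a minimal sketch of one way to make the original df.loc[s] lookup succeed: move the three key columns into the index, so that each tuple becomes a valid label. The frame is rebuilt from the table in the question; the names indexed, picked and first_five are placeholders introduced only for this example.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "regiment": ["Nighthawks"] * 4 + ["Dragoons"] * 4 + ["Scouts"] * 4,
    "company": ["1st", "1st", "2nd", "2nd"] * 3,
    "name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze", "Jacon",
             "Ryaner", "Sone", "Sloan", "Piger", "Riani", "Ali"],
    "preTestScore": [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    "postTestScore": [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
})

s = [("Nighthawks", "1st", "Miller"), ("Scouts", "2nd", "Ali")]

# With the key columns as a MultiIndex, each tuple is a label loc can find.
indexed = df.set_index(["regiment", "company", "name"])
picked = indexed.loc[s]
print(picked)

# iloc selects by position (end-exclusive); loc[:4] selects by label (end-inclusive).
first_five = df.iloc[:5]
print(first_five)
```

Whether to keep the MultiIndex afterwards is a design choice; calling picked.reset_index() turns the keys back into ordinary columns.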
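To address your comment: unzip the tuples and call isin on each column. Note that each column is tested independently, so this matches any row whose values appear in the respective lists, not only the exact tuples. Below, ('Dragoons', '2nd', 'Cooze') is not a row in df, yet row 4 ('Dragoons', '1st', 'Cooze') is returned: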
x, y, z = zip(*[('Nighthawks', '1st', 'Miller'), ('Dragoons', '2nd', 'Cooze')])
out = df[df.regiment.isin(x) & df.company.isin(y) & df.name.isin(z)]
print(out)
regiment company name preTestScore postTestScore
0 Nighthawks 1st Miller 4 25
4 Dragoons 1st Cooze 3 70
And the inverse, using the negation operator ~:
out = df[~(df.regiment.isin(x) & df.company.isin(y) & df.name.isin(z))]
print(out)
regiment company name preTestScore postTestScore
1 Nighthawks 1st Jacobson 24 94
2 Nighthawks 2nd Ali 31 57
3 Nighthawks 2nd Milner 2 62
5 Dragoons 1st Jacon 4 25
6 Dragoons 2nd Ryaner 24 94
7 Dragoons 2nd Sone 31 57
8 Scouts 1st Sloan 2 62
9 Scouts 1st Piger 3 70
10 Scouts 2nd Riani 2 62
11 Scouts 2nd Ali 3 70
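Because the column-wise isin filter tests each column separately, it can return rows that do not correspond to any one of the given tuples. If exact tuple matching is what you want, one possible approach (a sketch, not the only option) is to build a MultiIndex from the key columns and call its isin with the list of tuples. The names keys, cols, mask, exact and rest are placeholders for this example.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "regiment": ["Nighthawks"] * 4 + ["Dragoons"] * 4 + ["Scouts"] * 4,
    "company": ["1st", "1st", "2nd", "2nd"] * 3,
    "name": ["Miller", "Jacobson", "Ali", "Milner", "Cooze", "Jacon",
             "Ryaner", "Sone", "Sloan", "Piger", "Riani", "Ali"],
    "preTestScore": [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    "postTestScore": [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
})

keys = [("Nighthawks", "1st", "Miller"), ("Dragoons", "2nd", "Cooze")]
cols = ["regiment", "company", "name"]

# MultiIndex.isin compares whole (regiment, company, name) tuples at once.
mask = df.set_index(cols).index.isin(keys)

exact = df[mask]   # rows matching one of the tuples exactly
rest = df[~mask]   # every other row
print(exact)
```

With these keys only row 0 is selected, since no row is ('Dragoons', '2nd', 'Cooze').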