How to find all rows in pandas that satisfy a condition

Time: 2020-07-13 22:49:08

Tags: python pandas dataframe

import numpy as np
import pandas as pd

df = pd.read_csv('Salaries.csv',engine='python')

print( df[ df['JobTitle'].value_counts()==1 ] )

If a job title in the JobTitle column appears only once, I want to get that row.

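However, I keep getting this error: pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).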

Here is the Salaries.csv file:

Id,EmployeeName,JobTitle,BasePay,OvertimePay,OtherPay,Benefits,TotalPay,TotalPayBenefits,Year,Notes,Agency,Status
1,NATHANIEL FORD,GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY,167411.18,0.0,400184.25,,567595.43,567595.43,2011,,San Francisco,
2,GARY JIMENEZ,CAPTAIN III (POLICE DEPARTMENT),155966.02,245131.88,137811.38,,538909.28,538909.28,2011,,San Francisco,
3,ALBERT PARDINI,CAPTAIN III (POLICE DEPARTMENT),212739.13,106088.18,16452.6,,335279.91,335279.91,2011,,San Francisco,
4,CHRISTOPHER CHONG,WIRE ROPE CABLE MAINTENANCE MECHANIC,77916.0,56120.71,198306.9,,332343.61,332343.61,2011,,San Francisco,

Sorry if that is hard to read. If so, here is a pastebin: https://pastebin.com/raw/eCfVj1Et

2 answers:

Answer 0: (score: 2)

Another solution, using transform:

df[df.groupby('JobTitle')['JobTitle'].transform('count').eq(1)]
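As a rough sketch of why this works where the original attempt fails: value_counts() returns a Series indexed by the distinct job titles, so it cannot be aligned with the DataFrame's row index, whereas transform('count') returns one count per original row. The small frame below is only a stand-in for Salaries.csv, built from the Id and JobTitle values of the four sample rows in the question; the names counts, per_row and unique_rows are illustrative.

```python
import pandas as pd

# Stand-in for Salaries.csv: Id and JobTitle of the four sample rows only
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "JobTitle": [
        "GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "WIRE ROPE CABLE MAINTENANCE MECHANIC",
    ],
})

# value_counts() is indexed by job title, not by row label,
# which is why it cannot serve directly as a boolean row mask
counts = df["JobTitle"].value_counts()

# transform('count') broadcasts each group's size back onto the original rows,
# so the result shares df's index and can be used as a mask
per_row = df.groupby("JobTitle")["JobTitle"].transform("count")
unique_rows = df[per_row.eq(1)]
print(unique_rows)
```

With this stand-in data, only the rows whose job title occurs once should remain.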

Answer 1: (score: 0)

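You can do this in a single line by combining isin with the index values of value_counts() where the count equals 1 (here 'A' stands for the column of interest, for example 'JobTitle'):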

df[df['A'].isin((df['A'].value_counts() == 1).replace({False:np.nan}).dropna().index)]

Splitting it into two lines is probably better and easier to understand:

values = df['A'].value_counts()
df[df['A'].isin(values.index[values.eq(1)])]
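Applied to the question's JobTitle column, the same idea might look like the sketch below. The frame is a stand-in built from the Id and JobTitle values of the four sample rows, not the real file, and the names singles and result are illustrative. It filters the counts first and then takes the index, which should select the same titles as values.index[values.eq(1)].

```python
import pandas as pd

# Stand-in for Salaries.csv: Id and JobTitle of the four sample rows only
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "JobTitle": [
        "GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "WIRE ROPE CABLE MAINTENANCE MECHANIC",
    ],
})

# Count how often each job title occurs (indexed by title)
values = df["JobTitle"].value_counts()

# Keep only the titles that occur exactly once
singles = values[values.eq(1)].index

# isin() produces a mask aligned with df's rows, so it is safe to index with
result = df[df["JobTitle"].isin(singles)]
print(result)
```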