I have a cars DataFrame with many columns. If I count how often each manufacturer appears with:
carsDF.manufacturer.value_counts()
the result looks something like:
VW 2228
Opel 1414
Renault 1362
Audi 895
BMW 888
Mercedes-Benz 786
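How can I drop all rows from this DataFrame for manufacturers whose total number of occurrences is below a given threshold?
Answer 0: (score: 2)
You can use map: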
# get the count for each manufacturer
counts = carsDF.manufacturer.value_counts()
# threshold
thresh = 1000
# map each row's manufacturer to its count, then keep rows at or above the threshold
carsDF[carsDF.manufacturer.map(counts).ge(thresh)]
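To make the map step easier to follow, here is a minimal sketch on toy data. The frame `cars`, its `price` column, and the threshold of 2 are assumptions made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for carsDF
cars = pd.DataFrame({
    "manufacturer": ["VW", "VW", "VW", "Opel", "Opel", "Audi"],
    "price": [10, 12, 11, 9, 8, 20],
})

# Series indexed by manufacturer, holding the number of rows for each
counts = cars["manufacturer"].value_counts()

# Look up each row's manufacturer in counts, so every row carries its group's total
per_row = cars["manufacturer"].map(counts)

# Keep only rows whose manufacturer appears at least 2 times
filtered = cars[per_row.ge(2)]
print(filtered)
```

Because `per_row` has the same index as `cars`, the comparison yields a boolean mask that can be used directly for row selection.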
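Answer 1: (score: 1)
Here is an approach that uses loc on the value_counts result
to select the manufacturers that meet a minimum count, and then filters the rows to those manufacturers.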
import pandas as pd

# Sample data.
df = pd.DataFrame(
    {'manufacturer':
        ['VW'] * 2228
        + ['Opel'] * 1414
        + ['Renault'] * 1362
        + ['Audi'] * 895
        + ['BMW'] * 888
        + ['Mercedes-Benz'] * 787}
)
Solution:
min_count = 1000
main_manufacturers = set(
    df['manufacturer'].value_counts(sort=False).loc[lambda x: x >= min_count].index)
df = df.loc[df['manufacturer'].isin(main_manufacturers)]
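For comparison, a different technique that neither answer uses is `groupby(...).transform("size")`, which computes each row's group size in one step. This is only a sketch on made-up data; the frame contents and the threshold of 3 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data: 5 VW rows, 3 Opel rows, 1 Audi row
df = pd.DataFrame({"manufacturer": ["VW"] * 5 + ["Opel"] * 3 + ["Audi"] * 1})

min_count = 3

# For every row, the number of rows that share its manufacturer
group_sizes = df.groupby("manufacturer")["manufacturer"].transform("size")

# Keep rows whose manufacturer appears at least min_count times
result = df[group_sizes >= min_count]
print(result["manufacturer"].value_counts())
```

This avoids building an intermediate set of manufacturer names, at the cost of a groupby over the full frame.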