I have two columns.
Sales Close_Date
0 04/01/12
0
33496 12/01/12
588 05/01/12
9240 10/01/12
How can I count the occurrences of 0, 9296, or any other value in the Sales column?
Answer 0 (score: 1)
If you need to count a single value, the simplest way is to sum the True values of a boolean mask:
print (df.Sales == 0)
0 True
1 True
2 False
3 False
4 False
Name: Sales, dtype: bool
a = (df.Sales == 0).sum()
print (a)
2
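For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and wraps the mask-and-sum idea in a small function. The helper name `count_value` is made up for illustration and is not part of pandas; the missing Close_Date in the second row is assumed to be a null value.

```python
import pandas as pd

# Sample data rebuilt from the question; the second row has no Close_Date.
df = pd.DataFrame({
    "Sales": [0, 0, 33496, 588, 9240],
    "Close_Date": ["04/01/12", None, "12/01/12", "05/01/12", "10/01/12"],
})


def count_value(frame, column, value):
    """Hypothetical helper: count rows where frame[column] equals value."""
    # The comparison yields a boolean Series; True counts as 1 when summed.
    return int((frame[column] == value).sum())


zero_count = count_value(df, "Sales", 0)
absent_count = count_value(df, "Sales", 9296)  # 9296 is not in the sample
print(zero_count, absent_count)
```

A value that never occurs simply gives a count of 0, so no special handling is needed for it.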
If you need counts for all values, use groupby and aggregate with size, or use value_counts:
# store the result under a new name so the original df stays available
counts = df.groupby('Sales').size()
print (counts)
Sales
0 2
588 1
9240 1
33496 1
dtype: int64
Or:
# store the result under a new name so the original df stays available
counts = df['Sales'].value_counts()
print (counts)
0 2
9240 1
588 1
33496 1
Name: Sales, dtype: int64
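The two approaches differ mainly in ordering: groupby with size returns the counts sorted by the Sales value, while value_counts sorts by count, and the order among tied counts may differ between pandas versions. A small sketch on the sample Sales values, with variable names chosen only for this example:

```python
import pandas as pd

sales = pd.Series([0, 0, 33496, 588, 9240], name="Sales")

counts = sales.value_counts()              # most frequent value first
by_value = counts.sort_index()             # reorder by the Sales value
via_groupby = sales.groupby(sales).size()  # already ordered by the Sales value

# Series.get returns the default instead of raising KeyError
# when a value never occurs in the column.
n_9296 = int(counts.get(9296, 0))

print(by_value)
print(n_9296)
```

Calling sort_index on the value_counts result makes it directly comparable with the groupby output.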
If you need to filter the rows, use query or boolean indexing:
# keep the filtered rows under a new name so df is not overwritten
zero_sales = df.query('Sales == 0')
print (zero_sales)
Sales Close_Date
0 0 04/01/12
1 0 NaN
Or:
# keep the filtered rows under a new name so df is not overwritten
zero_sales = df[df.Sales == 0]
print (zero_sales)
Sales Close_Date
0 0 04/01/12
1 0 NaN
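If the value to match is held in a variable, query can reference it with the `@` prefix. The sketch below rebuilds the sample data (assuming the empty Close_Date is a null value) and checks that both filters select the same rows; `target` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "Sales": [0, 0, 33496, 588, 9240],
    "Close_Date": ["04/01/12", None, "12/01/12", "05/01/12", "10/01/12"],
})

target = 0

# @target refers to the local Python variable inside the query string.
via_query = df.query("Sales == @target")

# Boolean indexing with the same condition.
via_mask = df[df["Sales"] == target]

same_rows = via_query.equals(via_mask)
n_rows = len(via_query)
print(same_rows, n_rows)
```

The length of the filtered frame is also the count of matching rows, which is handy when the rows themselves are needed anyway.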
Timings:
#[500000 rows x 2 columns]
df = pd.concat([df]*100000).reset_index(drop=True)
print (df)
In [37]: %timeit ((df.Sales == 0).sum())
The slowest run took 4.18 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 4.62 ms per loop
In [38]: %timeit (Counter(df.Sales)[0])
10 loops, best of 3: 82.4 ms per loop
But this can be faster if you compare against the underlying NumPy array:
a = (df.Sales.values == 0).sum()
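The idea behind the faster variant is to run the comparison on the raw NumPy array rather than on the Series, which skips some pandas overhead. A hedged sketch of two equivalent ways to write it follows; no timing is claimed here, since that depends on the data size and library versions.

```python
import numpy as np
import pandas as pd

sales = pd.Series([0, 0, 33496, 588, 9240], name="Sales")

# to_numpy() returns the underlying array, like the older .values attribute.
arr = sales.to_numpy()

count_sum = int((arr == 0).sum())           # sum of a boolean array
count_nz = int(np.count_nonzero(arr == 0))  # same result via count_nonzero

print(count_sum, count_nz)
```

Running `%timeit` on your own data is the reliable way to see whether the difference matters.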
Answer 1 (score: 1)
from collections import Counter
c = Counter(df.Sales)
c[0]
2
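A self-contained version of the Counter approach, using the Sales values from the question. A Counter returns 0 for keys it has never seen, so looking up a value that is absent from the column does not raise an error.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({"Sales": [0, 0, 33496, 588, 9240]})

# Iterating over the Series yields its values one by one.
c = Counter(df["Sales"])

zero_count = c[0]
absent_count = c[9296]         # missing keys count as 0
top_value, top_count = c.most_common(1)[0]

print(zero_count, absent_count, top_count)
```

As the timings in the other answer suggest, this iterates in Python and tends to be slower than the vectorized mask on large frames.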