I am trying to count how many times NaN appears in a DataFrame column using this code:
count = enron_df.loc['salary'].count('NaN')
But every time I run it, I get the following error:
KeyError: 'Level NaN must be same as name (None)'
I have searched online extensively for a solution, but to no avail.
Answer 0 (score: 13)
If the NaN entries are missing values:
enron_df = pd.DataFrame({'salary':[np.nan, np.nan, 1, 5, 7]})
print (enron_df)
salary
0 NaN
1 NaN
2 1.0
3 5.0
4 7.0
count = enron_df['salary'].isna().sum()
#alternative
#count = enron_df['salary'].isnull().sum()
print (count)
2
If the NaN entries are strings:
enron_df = pd.DataFrame({'salary':['NaN', 'NaN', 1, 5, 'NaN']})
print (enron_df)
salary
0 NaN
1 NaN
2 1
3 5
4 NaN
count = enron_df['salary'].eq('NaN').sum()
#alternative
#count = (enron_df['salary'] == 'NaN').sum()
print (count)
3
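If it is not known in advance which of the two cases applies, both checks can be combined. The following is only a sketch: the helper name count_nan and its placeholder parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd


def count_nan(series, placeholder="NaN"):
    """Count real missing values plus entries equal to a placeholder string.

    Hypothetical helper for illustration only.
    """
    missing = series.isna()           # True for np.nan / None
    as_text = series.eq(placeholder)  # True for the literal string 'NaN'
    return int((missing | as_text).sum())


# One real missing value and two 'NaN' strings
mixed = pd.Series([np.nan, "NaN", 1, 5, "NaN"])
print(count_nan(mixed))
```

Here the single np.nan and the two 'NaN' strings are all counted, so the call prints 3.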
Answer 1 (score: 5)
By definition, count omits NaNs while size does not.
So a simple difference should do:
count = enron_df['salary'].size - enron_df['salary'].count()
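As a quick check of the size-minus-count idea, here is a minimal sketch that reuses the sample frame from the first answer; the intermediate names total and present are only illustrative.

```python
import numpy as np
import pandas as pd

enron_df = pd.DataFrame({'salary': [np.nan, np.nan, 1, 5, 7]})

total = enron_df['salary'].size       # all rows, including NaN
present = enron_df['salary'].count()  # non-missing rows only
count = total - present

print(total, present, count)  # 5 3 2
```

Note that this only counts real missing values; the string 'NaN' is an ordinary value as far as count is concerned.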
Answer 2 (score: 3)
Try this (assuming the values are the string 'NaN'):
count = df.loc[df['salary']=='NaN'].shape[0]
Or, better:
count = df.loc[df['salary']=='NaN', 'salary'].size
And, following your original approach, you would need something like this:
count = df.loc[:, 'salary'].str.count('NaN').sum()
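The comparisons above only match the literal string 'NaN'. If the column is meant to be numeric, one alternative (a sketch, not necessarily what the answerer had in mind) is to convert it first with pd.to_numeric, so that the string placeholders become real missing values and isna() can be used.

```python
import pandas as pd

df = pd.DataFrame({'salary': ['NaN', 'NaN', 1, 5, 'NaN']})

# errors='coerce' turns anything that cannot be parsed as a number into NaN
salary = pd.to_numeric(df['salary'], errors='coerce')
count = salary.isna().sum()

print(count)  # 3
```

Be aware that errors='coerce' also turns any other unparsable text into NaN, so the result may be larger than the number of 'NaN' strings if the column contains other junk values.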
Answer 3 (score: 3)
There is also the dropna parameter of value_counts:
import numpy as np
import pandas as pd
enron_df = pd.DataFrame({'salary':[np.nan, np.nan, 1, 5, 7]})
enron_df.salary.value_counts(dropna=False)
#NaN 2
# 7.0 1
# 5.0 1
# 1.0 1
#Name: salary, dtype: int64
If you only want the number, just select np.nan from the value counts. (If the values are the string 'NaN', replace np.nan with 'NaN'.)
enron_df.salary.value_counts(dropna=False)[np.nan]
#2
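Indexing the value counts directly raises a KeyError when the column has no missing values at all. A hedged variant is to use Series.get with a default of 0; the two sample Series below are made up for illustration.

```python
import numpy as np
import pandas as pd

with_nan = pd.Series([np.nan, np.nan, 1.0, 5.0, 7.0])
without_nan = pd.Series([1.0, 5.0, 7.0])

# .get returns the default instead of raising when the label is absent
count_a = with_nan.value_counts(dropna=False).get(np.nan, 0)
count_b = without_nan.value_counts(dropna=False).get(np.nan, 0)

print(count_a, count_b)
```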