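I have a pandas DataFrame and I build a dictionary from its columns. The dictionary is almost right; the only problem is that my attempt to filter out NaN values does not work, so NaN ends up as a key in the dictionary. My code is: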
for key, row in mr.iterrows():
    # With this line I try to filter out the NaN values but it doesn't work
    if pd.notnull(row['Company nameC']) and pd.notnull(row['Company nameA']) and pd.notnull(row['NEW ID']):
        newppmr[row['NEW ID']] = row['Company nameC']
The output is:
defaultdict(<type 'list'>, {nan: '1347 PROPERTY INS HLDGS INC', 1.0: 'AFLAC INC', 2.0: 'AGCO CORP', 3.0: 'AGL RESOURCES INC', 4.0: 'INVESCO LTD', 5.0: 'AK STEEL HOLDING CORP', 6.0: 'AMN HEALTHCARE SERVICES INC', nan: 'FOREVERGREEN WORLDWIDE CORP'
So I don't know how to filter out the NaN values, or what is wrong with my code.
Edit
An example of my pandas DataFrame:
           CUSIP  Company nameA  AÑO  NEW ID     Company nameC
42020  98912M201            NaN  NaN     NaN               ZAP
42021  989063102            NaN  NaN     NaN      ZAP.COM CORP
42022  98919T100            NaN  NaN     NaN  ZAZA ENERGY CORP
42023  98876R303            NaN  NaN     NaN   ZBB ENERGY CORP
Answer 0 (score: 1)
A pasted example of how to remove 'nan' keys from a dictionary:
First, create a dict with 'nan' keys (NaN as it appears in a numeric array):
>>> a = float("nan")
>>> b = float("nan")
>>> d = {a: 1, b: 2, 'c': 3}
>>> d
{nan: 1, nan: 2, 'c': 3}
Now remove all the 'nan' keys:
>>> from math import isnan
>>> c = dict((k, v) for k, v in d.items() if not (type(k) == float and isnan(k)))
>>> c
{'c': 3}
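One caveat, offered as a sketch rather than a definitive fix: `type(k) == float` is an exact type match, so it would skip a NaN key whose type is a subclass of `float` (`numpy.float64` is one such subclass, and values read out of a pandas column may have that type). Using `isinstance` covers both cases. The helper name `drop_nan_keys` is made up for this illustration.

```python
from math import isnan


def drop_nan_keys(d):
    """Return a copy of d without keys that are float NaN.

    isinstance() also matches subclasses of float, unlike type(k) == float.
    """
    return {k: v for k, v in d.items()
            if not (isinstance(k, float) and isnan(k))}


# Two distinct NaN objects become two separate keys, because NaN != NaN.
d = {float("nan"): 1, float("nan"): 2, "c": 3, 1.0: "x"}
cleaned = drop_nan_keys(d)
print(cleaned)
```

Non-float keys such as `'c'` are never passed to `isnan`, so the check does not raise on strings.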
The following also works for me. Maybe I am missing something?
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame({'a':[1,2,3,4,np.nan],'b':[np.nan,np.nan,np.nan,5,np.nan]})
In [4]: df
Out[4]:
    a   b
0   1 NaN
1   2 NaN
2   3 NaN
3   4   5
4 NaN NaN
In [5]: for key, row in df.iterrows(): print(pd.notnull(row['a']))
True
True
True
True
False
In [6]: for key, row in df.iterrows(): print(pd.notnull(row['b']))
False
False
False
True
False
In [7]: x = {}
In [8]: for key, row in df.iterrows():
   ....:     if pd.notnull(row['b']) and pd.notnull(row['a']):
   ....:         x[row['b']] = row['a']
   ....:
In [9]: x
Out[9]: {5.0: 4.0}
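As an alternative to the row-by-row loop, here is a minimal sketch that drops the incomplete rows first with `DataFrame.dropna(subset=...)` and then builds the dictionary in one step. The small `mr` frame below is made-up sample data that only reuses the question's column names; it is not the asker's real data, and a plain dict is used instead of the `defaultdict` from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's DataFrame `mr`.
mr = pd.DataFrame({
    "Company nameA": ["AFLAC", np.nan, "AGCO"],
    "NEW ID": [1.0, np.nan, 2.0],
    "Company nameC": ["AFLAC INC", "ZAP", "AGCO CORP"],
})

# Keep only rows where all three columns are present (not NaN).
cols = ["Company nameA", "NEW ID", "Company nameC"]
clean = mr.dropna(subset=cols)

# Map NEW ID -> Company nameC without iterating over rows by hand.
newppmr = dict(zip(clean["NEW ID"], clean["Company nameC"]))
print(newppmr)
```

If NaN keys still show up after a filter like this, it may be worth checking whether the dictionary already held them from an earlier run, since filtering new insertions does not remove existing keys.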