Question

I am working with a Pandas DataFrame that contains some NaN values. When I run

df.fillna(df.mean())

I get the following error, and I cannot find the cause or a solution:

TypeError: cannot label index with a null key

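All columns are int or float. I can even extract a single column into an array, run fillna() on that array, and put it back into the DataFrame.

Any ideas or hints? Thank you very much!

My code: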

test=pd.read_csv("../input/test.csv")
test.fillna(test.mean(),inplace=True)

The files I am working with are test.csv and train.csv from Kaggle. I get the same error with both: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data

The full traceback is as follows:

TypeError                                 Traceback (most recent call last)
<ipython-input-29-ab3e419316e1> in <module>()
     14 
     15 #Also test has NaN's
---> 16 test.fillna(test.mean(),inplace=True)

/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in fillna(self, value, method, axis, inplace, limit, downcast, **kwargs)
   2752                      self).fillna(value=value, method=method, axis=axis,
   2753                                   inplace=inplace, limit=limit,
-> 2754                                   downcast=downcast, **kwargs)
   2755 
   2756     @Appender(_shared_docs['shift'] % _shared_doc_kwargs)

/opt/conda/lib/python3.6/site-packages/pandas/core/generic.py in fillna(self, value, method, axis, inplace, limit, downcast)
   3645                     if k not in result:
   3646                         continue
-> 3647                     obj = result[k]
   3648                     obj.fillna(v, limit=limit, inplace=True, downcast=downcast)
   3649                 return result

/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965 
   1966     def _getitem_column(self, key):

/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   1972 
   1973         # duplicate columns & possible reduce dimensionality
-> 1974         result = self._constructor(self._data.get(key))
   1975         if result.columns.is_unique:
   1976             result = result[key]

/opt/conda/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   3603 
   3604             if isnull(item):
-> 3605                 raise TypeError("cannot label index with a null key")
   3606 
   3607             indexer = self.items.get_indexer_for([item])

TypeError: cannot label index with a null key



Answer 1

The following example seems to work fine and produces this output:

        x_1       x_2       x_3       x_4
0  0.000000  0.000000  0.000000  0.000000
1  1.000000  1.000000  1.000000  1.000000
2  2.000000  1.166667  2.000000  2.000000
3  3.000000  3.000000  3.000000  3.000000
4  0.000000  0.000000  0.000000  0.000000
5  1.000000  1.000000  1.000000  1.333333
6  2.000000  2.000000  2.000000  2.000000
7  1.285714  1.166667  1.285714  1.333333
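
For reference, here is a minimal sketch of code that yields a table like the one above. The input values are an assumption: they were chosen so that the column means match the filled values shown (for example 9/7 ≈ 1.285714 for x_1), and they are not necessarily the data used in the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical input, picked so the column means match the table above.
df = pd.DataFrame({
    "x_1": [0, 1, 2, 3, 0, 1, 2, np.nan],
    "x_2": [0, 1, np.nan, 3, 0, 1, 2, np.nan],
    "x_3": [0, 1, 2, 3, 0, 1, 2, np.nan],
    "x_4": [0, 1, 2, 3, 0, np.nan, 2, np.nan],
})

# Each NaN is replaced by the mean of its own column.
filled = df.fillna(df.mean())
print(filled)
```

On an all-numeric frame with ordinary string column names like this one, fillna(df.mean()) should run without raising the error from the question.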

Take a closer look at your input data.
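
In the traceback in the question, the exception is raised where pandas checks isnull(item) on the key being looked up, which suggests that one of the column labels may itself be null. Here is a rough sketch of how one might check for that. The two-column frame, the variable names, and the unnamed_ prefix are all illustrative assumptions, not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one null column label, standing in for the real data.
df = pd.DataFrame([[1.0, np.nan], [3.0, 4.0]], columns=["a", np.nan])

# Flag column labels that are themselves null (NaN or None).
null_labels = pd.isnull(df.columns)
print("null column labels:", null_labels.sum())

# One possible repair: give every null label a placeholder name.
df.columns = [
    f"unnamed_{i}" if is_null else col
    for i, (col, is_null) in enumerate(zip(df.columns, null_labels))
]

# Also list columns that are not numeric, in case some are not int or float.
non_numeric = list(df.select_dtypes(exclude="number").columns)
print("non-numeric columns:", non_numeric)

# With clean labels, filling with the column means works as usual.
filled = df.fillna(df.mean())
print(filled)
```

If the check reports no null labels, the cause probably lies elsewhere, and the dtypes are the next thing worth inspecting.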

Answer 2

You can try:

df['your_column'] = df['your_column'].fillna(df['your_column'].mean())

This way, the NaN values in a column are filled with that column's own mean.
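
The same per-column idea can be applied to every numeric column in a loop. This is only a sketch: the column names and values below are made up for illustration, and select_dtypes is used to skip text columns, for which a mean is not defined.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({
    "area": [100.0, np.nan, 300.0],
    "rooms": [2, np.nan, 4],
    "zone": ["A", None, "B"],
})

# Fill each numeric column with its own mean; text columns are left untouched.
for col in df.select_dtypes(include="number").columns:
    df[col] = df[col].fillna(df[col].mean())

print(df)
```

Missing values in non-numeric columns such as zone stay as they are and would need a separate strategy.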

Pandas - fillna - TypeError: cannot label index with a null key
