KeyError: "['Date'] not found in axis"?

Time: 2019-11-30 03:26:11

Tags: python excel pandas

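I have been working on an algorithm in Python that parses data from Excel with pandas and tries to drop any data with missing values: essentially, any row that has NaN in either column.

Here is my code: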

import numpy as np
import pandas as pd 
import math as math
import shutil as shutil

from random import seed
from random import random


randNum = int(random() * 100) 

shutil.copy('unsorted/daily/fed_debt_data.csv', 'unsorted/daily/fed_debt_data' + str(randNum) + '.csv')

debt_copy = 'unsorted/daily/fed_debt_data' + str(randNum) + '.csv'

debt_copy_read = pd.read_csv(debt_copy, names = ["Date", "Debt"])
debt_copy_read.head()

for key, value in debt_copy_read.iteritems():
    debt_copy_read.drop(key, axis = 0)

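The expected result is that every row containing a NaN value in any column is removed. The actual result is that I keep getting this error when I run the code: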

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-20-3083af5a3e02> in <module>
      1 for key, value in debt_copy_read.iteritems():
----> 2     debt_copy_read.drop(key, axis = 0)

~\Anaconda3\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3938                                            index=index, columns=columns,
   3939                                            level=level, inplace=inplace,
-> 3940                                            errors=errors)
   3941 
   3942     @rewrite_axis_style_signature('mapper', [('copy', True),

~\Anaconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3778         for axis, labels in axes.items():
   3779             if labels is not None:
-> 3780                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3781 
   3782         if inplace:

~\Anaconda3\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
   3810                 new_axis = axis.drop(labels, level=level, errors=errors)
   3811             else:
-> 3812                 new_axis = axis.drop(labels, errors=errors)
   3813             result = self.reindex(**{axis_name: new_axis})
   3814 

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
   4963             if errors != 'ignore':
   4964                 raise KeyError(
-> 4965                     '{} not found in axis'.format(labels[mask]))
   4966             indexer = indexer[~mask]
   4967         return self.delete(indexer)

KeyError: "['Date'] not found in axis"

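I am trying to iterate over data on US debt, with one column holding the "Date" variable and the other holding "Debt". Any advice on what is going wrong and how to fix it would be appreciated. The data is organized as follows: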

Date,Debt
2010-02-01T14:30:00Z,12349463585067.40
2010-02-03T14:30:00Z,12354041054846.90
2010-02-05T14:30:00Z,12345510656150.00
2010-02-09T14:30:00Z,12349467132738.40
2010-02-11T14:30:00Z,12349324464284.20
2010-02-16T14:30:00Z,12384358013736.30
2010-02-17T14:30:00Z,12386495535882.20
2010-02-18T14:30:00Z,12401448666808.30

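2 Answers:

Answer 0: (score: 1)

You do not need to iterate over the rows to drop the ones with NaN values. You can call the dropna() method of pandas.DataFrame directly. For more details, see: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html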

import numpy as np
import pandas as pd 
import math as math
import shutil as shutil

from random import seed
from random import random


randNum = int(random() * 100) 

shutil.copy('unsorted/daily/fed_debt_data.csv', 'unsorted/daily/fed_debt_data' + str(randNum) + '.csv')

debt_copy = 'unsorted/daily/fed_debt_data' + str(randNum) + '.csv'

debt_copy_read = pd.read_csv(debt_copy, names = ["Date", "Debt"])
debt_copy_read.head()

# dropna() returns a new DataFrame, so assign the result to keep it
debt_copy_read = debt_copy_read.dropna()
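
To make the cause of the KeyError concrete, here is a minimal sketch on a small in-memory DataFrame. The frame `df` and its values are a hypothetical stand-in for the CSV in the question, not the asker's actual file. It shows that iterating a DataFrame yields column labels, that `drop(..., axis=0)` searches the row index for them, and that `dropna()` removes the incomplete rows without any loop.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question (values are illustrative).
df = pd.DataFrame({
    "Date": ["2010-02-01T14:30:00Z", "2010-02-03T14:30:00Z", None, "2010-02-09T14:30:00Z"],
    "Debt": [12349463585067.40, np.nan, 12345510656150.00, 12349467132738.40],
})

# Iterating a DataFrame (or calling items(); iteritems() was removed in
# pandas 2.0) walks over the column labels, not over the rows.
column_labels = list(df)

# drop(label, axis=0) looks the label up in the row index (0, 1, 2, 3),
# so passing a column label such as "Date" raises KeyError.
try:
    df.drop("Date", axis=0)
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# dropna() returns a new DataFrame without the rows that contain NaN/None.
# The original frame is left unchanged, so the result must be assigned.
cleaned = df.dropna()
print(cleaned)
```

Here `cleaned` keeps only the rows at index 0 and 3, while `df` still has all four rows.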

Answer 1: (score: 0)

You can try dropping the rows that contain NaN.

If pandas has reformatted your Debt column, you can reformat it with:

# debt_copy is the file path (a str); the DataFrame is debt_copy_read
debt_copy_read.dropna()
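
One possible reason the Debt column looks "reformatted" is the `names=` argument. The sample data in the question already starts with a `Date,Debt` header line, and passing `names=` without `header=0` makes pandas treat that line as an ordinary data row, which turns the Debt column into strings. The sketch below illustrates this under that assumption. The `csv_text` sample, including its blank Debt value, is hypothetical, and `io.StringIO` stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical sample in the layout shown in the question,
# with one Debt value left blank to produce a NaN.
csv_text = """Date,Debt
2010-02-01T14:30:00Z,12349463585067.40
2010-02-03T14:30:00Z,
2010-02-05T14:30:00Z,12345510656150.00
"""

# names= without header=0: the header line is read as the first data row,
# so the Debt column contains the string "Debt" and is no longer numeric.
with_names = pd.read_csv(io.StringIO(csv_text), names=["Date", "Debt"])
print(with_names.head())

# header=0 tells pandas the first line holds the column labels,
# so Debt is parsed as float64 and the blank field becomes NaN.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)

# Drop rows whose Debt is missing and renumber the remaining rows.
cleaned = with_header.dropna(subset=["Debt"]).reset_index(drop=True)
print(cleaned)
```

Using `subset=["Debt"]` restricts the check to one column. Calling `dropna()` with no arguments checks every column instead.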