I'm working in a Jupyter notebook and using these libraries:
from fastai.tabular import add_datepart
import pandas as pd
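df_raw is a pandas DataFrame.
I've hit a very strange problem: once I have used the second command, the first command stops working when I re-run the cell that contains it:
First: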
>>> add_datepart(df_raw, 'saledate')
Second:
>>> df_raw.saleYear.head()
This is the error I get:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'saledate'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-43-6b52dab581de> in <module>()
----> 1 add_datepart(df_raw, 'saledate')
~/anaconda3/lib/python3.6/site-packages/fastai/tabular/transform.py in add_datepart(df, field_name, prefix, drop, time)
55 def add_datepart(df:DataFrame, field_name:str, prefix:str=None, drop:bool=True, time:bool=False):
56 "Helper function that adds columns relevant to a date in the column `field_name` of `df`."
---> 57 make_date(df, field_name)
58 field = df[field_name]
59 prefix = ifnone(prefix, re.sub('[Dd]ate$', '', field_name))
~/anaconda3/lib/python3.6/site-packages/fastai/tabular/transform.py in make_date(df, date_field)
10 def make_date(df:DataFrame, date_field:str):
11 "Make sure `df[field_name]` is of the right date type."
---> 12 field_dtype = df[date_field].dtype
13 if isinstance(field_dtype, pd.core.dtypes.dtypes.DatetimeTZDtype):
14 field_dtype = np.datetime64
~/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
2925 if self.columns.nlevels > 1:
2926 return self._getitem_multilevel(key)
-> 2927 indexer = self.columns.get_loc(key)
2928 if is_integer(indexer):
2929 indexer = [indexer]
~/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'saledate'
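I've never run into a problem like this, and I don't know whether pandas, fastai, or Jupyter is causing it. Can you help?
Edit: I'm not even sure it has anything to do with using the two commands together. Now I get the error even without the second command... When I run all the cells together, everything works, but when I re-run the cell with the "first" command, it crashes.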
Answer 0 (score: 2)
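According to the docs, add_datepart removes the input column from the original DataFrame by default (note drop:bool=True in the signature shown in your traceback). The first call therefore modifies df_raw in place and drops 'saledate', so when you re-run the cell the column no longer exists and you get KeyError: 'saledate'. It seems a bit sloppy for that to happen silently, but you can apparently disable the behavior by passing drop=False.
So your call would be: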
add_datepart(df_raw, 'saledate', drop=False)
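To see the mechanism without fastai installed, here is a minimal pandas-only sketch. The helper add_year_part is hypothetical (it is not part of fastai); it only imitates the in-place "add a derived column, then drop the source column" behavior described above, under the assumption that this is what add_datepart does with drop=True. It also shows a simple guard that makes the cell safe to re-run.

```python
import pandas as pd


def add_year_part(df, field_name, drop=True):
    """Hypothetical stand-in for add_datepart: adds one derived column in place."""
    dates = pd.to_datetime(df[field_name])  # raises KeyError if the column is gone
    # Mimic the prefix rule from the traceback: strip a trailing "date"/"Date".
    prefix = field_name[:-4] if field_name.lower().endswith("date") else field_name
    df[prefix + "Year"] = dates.dt.year
    if drop:
        df.drop(columns=[field_name], inplace=True)  # the source column disappears here
    return df


df_raw = pd.DataFrame({"saledate": ["2011-03-01", "2012-07-15"]})

add_year_part(df_raw, "saledate")        # first run: works, and drops 'saledate'
try:
    add_year_part(df_raw, "saledate")    # simulates re-running the same cell
    rerun_failed = False
except KeyError:
    rerun_failed = True                  # same failure as in the question

# Guard: only transform when the source column is still present.
if "saledate" in df_raw.columns:
    add_year_part(df_raw, "saledate")

# With drop=False the source column survives, so repeated calls are harmless.
df_keep = pd.DataFrame({"saledate": ["2011-03-01", "2012-07-15"]})
add_year_part(df_keep, "saledate", drop=False)
add_year_part(df_keep, "saledate", drop=False)
```

If you would rather keep the default drop=True, another option is to reload or copy df_raw at the top of the cell so each run starts from the original data.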