KeyError when I try to use df.index.get_loc(recession_start)

Asked: 2019-09-18 06:18:13

Tags: python pandas jupyter-notebook

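I am trying to find when a recession ends by looking at the trend of consecutive values in a df column. I want to return the quarter in which GDP has risen for two consecutive quarters.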

I have already identified the start of the recession with the following function:

def get_recession_start():
    df = get_data()
    for i in range(1, len(df) - 1):
        if (df.iloc[i]['GDP'] < df.iloc[i - 1]['GDP']) and (df.iloc[i + 1]['GDP'] < df.iloc[i]['GDP']):
            return df['Yearly quarters'].iloc[i]


get_recession_start()

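In the next function, I want to find the exact index of the value returned by the function above (get_recession_start()), so that I can search df from that point onward.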

def get_recession_end():
    import pandas as pd
    import numpy as np

    df = get_data()
    recession_start = get_recession_start()
    index = df.index.get_loc(recession_start)
    for i in range(index + 2, len(df)):
        if (df.iloc[i]['GDP'] > df.iloc[i - 1]['GDP']) and (df.iloc[i - 1]['GDP'] > df.iloc[i - 2]['GDP']):
            return df['Yearly quarters'].iloc[i]

get_recession_end()

I expected this function to return the single string value '2008q3', but instead I get a traceback:


KeyError                                  Traceback (most recent call last)
//anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: '2008q3'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-16-5f3688935fbb> in <module>
     11             return df['Yearly quarters'].iloc[i]
     12 
---> 13 get_recession_end()

<ipython-input-16-5f3688935fbb> in get_recession_end()
      6     df = get_data()
      7     recession_start = get_recession_start()
----> 8     index = df.index.get_loc(recession_start)
      9     for i in range(index + 2, len(df)):
     10         if (df.iloc[i]['GDP'] > df.iloc[i-1]['GDP']) and (df.iloc[i - 1]['GDP'] > df.iloc[i - 2]['GDP']):

//anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2657                 return self._engine.get_loc(key)
   2658             except KeyError:
-> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2661         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: '2008q3'

1 Answer:

Answer 0 (score: 0)

Note that get_recession_start returns a value from the Yearly quarters column (taken from the row for the quarter in which the recession starts).

You call this function in recession_start = get_recession_start(), so recession_start contains the quarter in which the recession started (a string of the form yyyyqn).

Your code then runs index = df.index.get_loc(recession_start), so you call get_loc and pass it the quarter "name" you have just found.

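This is the source of the error, because get_loc searches the index (probably a sequence of consecutive integers) for exactly the value passed to it (here probably 2008q3).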

This index evidently does not contain the passed value, so the exception raised is simply KeyError: '2008q3'.
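The mismatch can be reproduced on a small frame. This is only a sketch: the DataFrame below is a hypothetical stand-in for get_data() with made-up GDP values, and it assumes the real frame uses the default integer index, as the Int64Engine lines in the traceback suggest.

```python
import pandas as pd

# Hypothetical stand-in for get_data(); the GDP values are invented.
df = pd.DataFrame({
    'Yearly quarters': ['2008q1', '2008q2', '2008q3', '2008q4'],
    'GDP': [100.0, 101.0, 99.0, 98.0],
})

# The default index holds the integers 0..3, not quarter labels,
# so looking up a quarter string in it raises KeyError.
try:
    df.index.get_loc('2008q3')
    found = True
except KeyError:
    found = False

# With the quarter column as the index, the same lookup succeeds
# and returns the row position of that label.
pos = df.set_index('Yearly quarters').index.get_loc('2008q3')
```

Here get_loc is asked about the index labels, never about the contents of the Yearly quarters column, which is why the lookup only works once that column becomes the index.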

To correct your program:

  • find the index key of the row whose quarter equals recession_start
  • pass that key to get_loc
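One way the two steps above might look in code is sketched below. The get_data() here is a hypothetical stand-in with invented GDP figures (the real function is not shown in the question), and the column names are taken from the question's code.

```python
import pandas as pd


def get_data():
    # Hypothetical stand-in for the asker's get_data(); values are invented.
    return pd.DataFrame({
        'Yearly quarters': ['2007q4', '2008q1', '2008q2', '2008q3', '2008q4',
                            '2009q1', '2009q2', '2009q3', '2009q4'],
        'GDP': [100, 101, 102, 101, 100, 99, 98, 99, 100],
    })


def get_recession_start():
    df = get_data()
    gdp = df['GDP']
    # First quarter that begins two consecutive declines in GDP.
    for i in range(1, len(df) - 1):
        if gdp.iloc[i] < gdp.iloc[i - 1] and gdp.iloc[i + 1] < gdp.iloc[i]:
            return df['Yearly quarters'].iloc[i]


def get_recession_end():
    df = get_data()
    recession_start = get_recession_start()
    # Step 1: find the index key of the row holding that quarter.
    key = df.index[df['Yearly quarters'] == recession_start][0]
    # Step 2: pass the key (not the quarter string) to get_loc.
    index = df.index.get_loc(key)
    gdp = df['GDP']
    # First quarter that completes two consecutive rises in GDP.
    for i in range(index + 2, len(df)):
        if gdp.iloc[i] > gdp.iloc[i - 1] and gdp.iloc[i - 1] > gdp.iloc[i - 2]:
            return df['Yearly quarters'].iloc[i]


start = get_recession_start()
end = get_recession_end()
```

With this sample data, start is '2008q3' and end is '2009q4'. An alternative, if the quarter labels are unique, would be to call df.set_index('Yearly quarters') once and use get_loc on the quarter string directly.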