Question

I created a DataFrame, df5:

df5 = pd.read_csv('C:/Users/Demonstrator/Downloads/Listeequipement.csv',delimiter=';', parse_dates=[0], infer_datetime_format = True)
df5['TIMESTAMP'] = pd.to_datetime(df5['TIMESTAMP'], format='%d/%m/%y %H:%M')
df5['date'] = df5['TIMESTAMP'].dt.date
df5['time'] = df5['TIMESTAMP'].dt.time
date_debut = pd.to_datetime('2015-08-01 23:10:00')
date_fin = pd.to_datetime('2015-10-01 00:00:00')
df5 = df5[(df5['TIMESTAMP'] >= date_debut) & (df5['TIMESTAMP'] < date_fin)]
df5.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 8645 entries, 145 to 8789
Data columns (total 9 columns):
TIMESTAMP                 8645 non-null datetime64[ns]
ACT_TIME_AERATEUR_1_F1    8645 non-null float64
ACT_TIME_AERATEUR_1_F3    8645 non-null float64
ACT_TIME_AERATEUR_1_F5    8645 non-null float64
ACT_TIME_AERATEUR_1_F6    8645 non-null float64
ACT_TIME_AERATEUR_1_F7    8645 non-null float64
ACT_TIME_AERATEUR_1_F8    8645 non-null float64
date                      8645 non-null object
time                      8645 non-null object
dtypes: datetime64[ns](1), float64(6), object(2)
memory usage: 675.4+ KB

Then I resampled by day:

df5 = df5.set_index('TIMESTAMP')
df5 = df5.resample('1d').mean()
df5.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 61 entries, 2015-08-01 to 2015-09-30
Freq: D
Data columns (total 6 columns):
ACT_TIME_AERATEUR_1_F1    61 non-null float64
ACT_TIME_AERATEUR_1_F3    61 non-null float64
ACT_TIME_AERATEUR_1_F5    61 non-null float64
ACT_TIME_AERATEUR_1_F6    61 non-null float64
ACT_TIME_AERATEUR_1_F7    61 non-null float64
ACT_TIME_AERATEUR_1_F8    61 non-null float64
dtypes: float64(6)
memory usage: 3.3 KB

After that, I tried to assign a date, a time, and a day of the week to each timestamp, as follows:

df5['date'] = df5['TIMESTAMP'].dt.date
df5['time'] = df5['TIMESTAMP'].dt.time

df5['day_of_week'] = df5['date'].dt.dayofweek

days = {0:'Mon',1:'Tues',2:'Weds',3:'Thurs',4:'Fri',5:'Sat',6:'Sun'}

df5['day_of_week'] = df5['day_of_week'].apply(lambda x: days[x])

However, because the timestamp became the DataFrame's index when I resampled, I ran into a problem:

KeyError                                  Traceback (most recent call last)
C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'TIMESTAMP'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-164-9887c2fb7404> in <module>()
----> 1 df5['date'] = df5['TIMESTAMP'].dt.date
      2 df5['time'] = df5['TIMESTAMP'].dt.time
      3 
      4 df5['day_of_week'] = df5['date'].dt.dayofweek
      5 

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1995             return self._getitem_multilevel(key)
   1996         else:
-> 1997             return self._getitem_column(key)
   1998 
   1999     def _getitem_column(self, key):

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2002         # get column
   2003         if self.columns.is_unique:
-> 2004             return self._get_item_cache(key)
   2005 
   2006         # duplicate columns & possible reduce dimensionality

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1348         res = cache.get(item)
   1349         if res is None:
-> 1350             values = self._data.get(item)
   1351             res = self._box_item_values(item, values)
   1352             cache[item] = res

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3288 
   3289             if not isnull(item):
-> 3290                 loc = self.items.get_loc(item)
   3291             else:
   3292                 indexer = np.arange(len(self.items))[isnull(self.items)]

C:\Users\Demonstrator\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1945                 return self._engine.get_loc(key)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948 
   1949         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'TIMESTAMP'

Could you please help me solve this problem? Thanks in advance.

Kind regards

Answer 1

You can keep the column in the DataFrame even when you set it as the index:

df5 = df5.set_index('TIMESTAMP', drop=False)
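
As a complementary sketch (not part of the original answer): the output in the question shows resample('1d').mean() dropping the non-numeric date and time columns, so a TIMESTAMP column kept with drop=False may not survive the resample either, depending on the pandas version. One alternative is to read the date parts directly from the DatetimeIndex, or to turn the index back into a column with reset_index(). The data below is a hypothetical stand-in for the CSV, and the names daily and daily_with_column are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: 10-minute readings starting 2015-08-01 23:10.
idx = pd.date_range('2015-08-01 23:10:00', periods=300, freq='10min')
df = pd.DataFrame({'TIMESTAMP': idx,
                   'ACT_TIME_AERATEUR_1_F1': np.arange(300, dtype=float)})

# After set_index + resample, TIMESTAMP is the index, not a column.
daily = df.set_index('TIMESTAMP').resample('1D').mean()

# Read date parts from the DatetimeIndex itself (an index has no .dt accessor).
days = {0: 'Mon', 1: 'Tues', 2: 'Weds', 3: 'Thurs', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
daily['date'] = daily.index.date
daily['day_of_week'] = [days[d] for d in daily.index.dayofweek]

# Alternatively, restore TIMESTAMP as a regular column.
daily_with_column = daily.reset_index()
```

After a daily resample every index value falls at midnight, so a separate time column would carry no information and is left out here.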
