Question

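Please help.

I tried to resample the data and failed. It produces the error in the title, and when I apply DatetimeIndex it truncates the timestamps, dropping the HH:MM:SS part. It still does not recognize the data as datetime objects. Thanks in advance.

The source file can be found here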

import pandas as pd
import numpy as np

df= pd.read_csv('20170713.csv')
df2= df.loc[:,['sen_id', 'pos_id', 'heat_val', 'sat_val', 'timestamp']] 
cols = df2.columns.tolist() 
cols = cols[-1:] + cols[:-1]
df2 = df2[cols]
#print(df2.head())

df3 = df2.set_index(['timestamp'])
df3.index = pd.DatetimeIndex(df3.index)
print(df3.head())

pd.to_datetime(df3[['year', 'month', 'day']])
df3.resample('1H').mean()
print(df3)

Answer 1

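The problem is that pd.to_datetime() is used incorrectly: you pass it df3[['year', 'month', 'day']], three columns that do not exist in df3. Instead, you only need to pass a single series (here, the index). You then want to supply the argument format='%d/%m/%Y %H:%M', which corresponds to the strptime format of your dates.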

import pandas as pd

df= pd.read_csv('20170713.csv')
df2= df.loc[:,['sen_id', 'pos_id', 'heat_val', 'sat_val', 'timestamp']] 
cols = df2.columns.tolist() 
cols = cols[-1:] + cols[:-1]
df2 = df2[cols]
#print(df2.head())

df3 = df2.set_index(['timestamp'])
#df3.index = pd.DatetimeIndex(df3.index)
#print(df3.head())

#pd.to_datetime(df3[['year', 'month', 'day']])
df3.index = pd.to_datetime(df3.index,format='%d/%m/%Y %H:%M')
df3 = df3.resample('1H').mean()
print(df3)
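
To illustrate the distinction above, here is a minimal sketch of the two input forms pd.to_datetime() accepts, plus the column selection that fails. The frames and values are made up for illustration and are not taken from the asker's CSV.

```python
import pandas as pd

# Form 1: a DataFrame whose columns are literally named year, month and day.
# to_datetime assembles one timestamp per row from those columns.
parts = pd.DataFrame({"year": [2017, 2017], "month": [7, 7], "day": [13, 14]})
assembled = pd.to_datetime(parts)

# Form 2: a Series (or Index) of strings parsed with an explicit format.
strings = pd.Series(["13/07/2017 09:30", "14/07/2017 18:45"])
parsed = pd.to_datetime(strings, format="%d/%m/%Y %H:%M")

# Selecting columns that do not exist raises KeyError
# before to_datetime is ever reached.
frame = pd.DataFrame({"heat_val": [1.0, 2.0]})
try:
    frame[["year", "month", "day"]]
except KeyError as err:
    print("KeyError:", err)
```

The question's code hits the third case: df3 has no year, month or day columns, so the selection itself fails.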

For readability, your code can also be condensed, for example:

import pandas as pd

df = pd.read_csv('20170713.csv')

#Preserve desired columns and reorder as df2
df2 = df[['timestamp', 'sen_id', 'pos_id', 'heat_val', 'sat_val']]

#set timestamp as index and convert to datetime
df2 = df2.set_index('timestamp')
df2.index = pd.to_datetime(df2.index,format='%d/%m/%Y %H:%M')

#resample
df3 = df2.resample('1H').mean()

print(df3)
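
Since the CSV is not included here, the same steps can be tried on a small in-memory frame. This is only a sketch: the rows below are hypothetical stand-ins for 20170713.csv, and '60min' is used as the hourly frequency string because it is accepted across pandas versions.

```python
import pandas as pd

# Hypothetical rows standing in for 20170713.csv; the values are made up.
df = pd.DataFrame({
    "timestamp": ["13/07/2017 00:05", "13/07/2017 00:35", "13/07/2017 01:10"],
    "heat_val": [10.0, 20.0, 30.0],
    "sat_val": [1.0, 3.0, 5.0],
})

# Parse the day-first strings with an explicit format, then index by them.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d/%m/%Y %H:%M")
df = df.set_index("timestamp")

# With a DatetimeIndex in place, resampling into hourly bins works.
hourly = df.resample("60min").mean()
print(hourly)
```

The index keeps its hour and minute components after parsing, so the two readings in the 00:00 hour are averaged together and the 01:10 reading falls into its own bin.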

Pandas time series resampling: KeyError: "['year' 'month' 'day'] not in index"
