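I'm new to pandas and want to understand why my code doesn't work. I have a data file containing login_time timestamps (a sample is at the end). I load the data from JSON into a DataFrame and then try to aggregate it into 30-minute bins. I get an error complaining about the data type.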
import json
import pandas as pd
import numpy as np
input_file = 'data.json'
df = pd.read_json(input_file)
#df['count'] = 1
df.set_index('login_time');
print df.head()
agg_30m = pd.DataFrame()
agg_30m['login_time'] = df.login_time.resample(rule='30Min', how = 'last')
agg_30m['count'] = df.login_time.resample(rule='30Min', how = 'count')
print agg_30m.head()
Sample of the data in data.json:
{"login_time":["2010-01-01 00:12:00","2010-01-01 00:21:00","2010-01-01 00:22:00","2010-01-01 00:23:00","2010-01-01 00:24:00"]}
Error:
agg_15m['login_time'] = df.login_time.resample(rule='15Min', how = 'last')
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 4212, in resample
base=base, key=on, level=level)
File "/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.py", line 944, in resample
return tg._get_resampler(obj, kind=kind)
File "/usr/local/lib/python2.7/dist-packages/pandas/tseries/resample.py", line 1057, in _get_resampler
"but got an instance of %r" % type(ax).__name__)
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'
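Judging from the traceback, the likely cause is that `df.set_index('login_time')` returns a new DataFrame instead of modifying `df`, so `df` keeps its default integer index and `resample` rejects it. The sketch below is one possible fix, not the only one. It assumes a recent pandas version, where the `how=` argument is no longer available and the aggregation is called as a method on the resampler. The sample JSON is inlined through `io.StringIO` so the snippet runs on its own, and the column name `last_login` is chosen purely for illustration.

```python
import io
import pandas as pd

# Sample copied from the question's data.json, inlined so the sketch is self-contained.
raw = ('{"login_time":["2010-01-01 00:12:00","2010-01-01 00:21:00",'
       '"2010-01-01 00:22:00","2010-01-01 00:23:00","2010-01-01 00:24:00"]}')

df = pd.read_json(io.StringIO(raw))

# Make sure the column holds datetimes, then keep the result of set_index:
# set_index returns a new DataFrame unless inplace=True is passed.
df['login_time'] = pd.to_datetime(df['login_time'])
df = df.set_index('login_time')

# The index is now a DatetimeIndex, so resample accepts it.
# to_series() gives a Series whose values are the timestamps themselves.
logins = df.index.to_series()

agg_30m = pd.DataFrame({
    'last_login': logins.resample('30min').last(),   # latest login in each bin
    'count': logins.resample('30min').count(),       # number of logins in each bin
})

print(agg_30m.head())
```

With the five sample timestamps, all logins fall into the single bin starting at 2010-01-01 00:00:00. If keeping `login_time` as a regular column is preferable, `df.resample('30min', on='login_time')` is an alternative that avoids changing the index.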