I'm new to Python, and I have a dataset like this:
Date Y
12/16/2013 7:00 104052
12/16/2013 15:00 103213
12/16/2013 23:00 104724
12/17/2013 7:00 104257
12/17/2013 15:00 105565
12/17/2013 23:00 103970
12/18/2013 7:00 104026
12/18/2013 15:00 103532
12/18/2013 23:00 101313
12/19/2013 7:00 105233
12/19/2013 15:00 105864
12/19/2013 23:00 105621
12/20/2013 7:00 108011
12/20/2013 15:00 108263
12/20/2013 23:00 107320
12/21/2013 7:00 106211
12/21/2013 15:00 106315
12/21/2013 23:00 104821
12/22/2013 7:00 106312
12/22/2013 15:00 107649
12/22/2013 23:00 107690
12/23/2013 7:00 107274
12/23/2013 15:00 107298
12/23/2013 23:00 107059
How can I find the trend over date and time using the data above?
I tried this code:
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller


def test_stationarity(timeseries):
    # Rolling statistics over a 12-observation window
    rolmean = timeseries.rolling(window=12).mean()
    rolstd = timeseries.rolling(window=12).std()

    # Plot the original series together with its rolling mean and std
    plt.plot(timeseries, color='blue', label='Original')
    plt.plot(rolmean, color='red', label='Rolling Mean')
    plt.plot(rolstd, color='black', label='Rolling std')
    plt.legend(loc='best')
    plt.title('Rolling Mean & Standard deviation')
    plt.show()

    # Augmented Dickey-Fuller test for stationarity
    print('Results of Dickey-Fuller Test:')
    dftest = adfuller(timeseries, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    print(dfoutput)


def trend(csvfile):
    # Read the CSV and parse the 'Date' column into a DatetimeIndex
    data = pd.read_csv(csvfile, index_col='Date')
    data.index = pd.to_datetime(data.index, format='%m/%d/%Y %H:%M')
    ts = data['Y']
    test_stationarity(ts)


trend("Data_Analysis_Sample.csv")
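For reference, here is a minimal sketch of one way the trend could be looked at. It assumes "trend" means the overall direction of Y over time, and it takes the 8-hour spacing between readings from the sample above. The names `values`, `daily_mean`, and `slope` are only illustrative, and daily averaging plus a least-squares line is just one possible choice, not necessarily the right model for this data.

```python
import numpy as np
import pandas as pd

# The 24 readings from the sample, taken every 8 hours starting 12/16/2013 7:00.
values = [
    104052, 103213, 104724, 104257, 105565, 103970,
    104026, 103532, 101313, 105233, 105864, 105621,
    108011, 108263, 107320, 106211, 106315, 104821,
    106312, 107649, 107690, 107274, 107298, 107059,
]
index = pd.date_range("2013-12-16 07:00", periods=len(values),
                      freq=pd.Timedelta(hours=8))
ts = pd.Series(values, index=index, name="Y")

# Daily means average the three readings of each day,
# which smooths out the within-day variation.
daily_mean = ts.resample("D").mean()

# Least-squares line through the raw readings.
# x is the time in days since the first reading.
days = (ts.index - ts.index[0]) / pd.Timedelta(days=1)
slope, intercept = np.polyfit(np.asarray(days, dtype=float),
                              ts.to_numpy(dtype=float), 1)

print(daily_mean)
print("Estimated slope: %.1f units of Y per day" % slope)
```

A positive slope would suggest that Y is rising over the period, and the daily means show where that rise happens. With only 24 observations, any such estimate should be read with caution.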