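How can I find the trend of a dataset using Python? Is the trend increasing or decreasing over date and time?

Time: 2016-11-15 07:07:05

Tags: python pandas numpy trend

I am new to Python, and I have a dataset like this: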

Date                Y
12/16/2013 7:00     104052
12/16/2013 15:00    103213
12/16/2013 23:00    104724
12/17/2013 7:00     104257
12/17/2013 15:00    105565
12/17/2013 23:00    103970
12/18/2013 7:00     104026
12/18/2013 15:00    103532
12/18/2013 23:00    101313
12/19/2013 7:00     105233
12/19/2013 15:00    105864
12/19/2013 23:00    105621
12/20/2013 7:00     108011
12/20/2013 15:00    108263
12/20/2013 23:00    107320
12/21/2013 7:00     106211
12/21/2013 15:00    106315
12/21/2013 23:00    104821
12/22/2013 7:00     106312
12/22/2013 15:00    107649
12/22/2013 23:00    107690
12/23/2013 7:00     107274
12/23/2013 15:00    107298
12/23/2013 23:00    107059

How can I use the data above to find the trend over date and time?

I tried this code:

import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.tsa.stattools import adfuller


def trend(csvfile):
    # Read the CSV once, then parse 'Date' values such as '12/16/2013 7:00'
    # into a DatetimeIndex.
    data = pd.read_csv(csvfile, index_col='Date')
    data.index = pd.to_datetime(data.index, format='%m/%d/%Y %H:%M')

    ts = data['Y']

    def test_stationarity(timeseries):
        # Rolling statistics over 12 observations (pandas >= 0.18 API).
        rolmean = timeseries.rolling(window=12).mean()
        rolstd = timeseries.rolling(window=12).std()

        plt.plot(timeseries, color='blue', label='Original')
        plt.plot(rolmean, color='red', label='Rolling Mean')
        plt.plot(rolstd, color='black', label='Rolling std')
        plt.legend(loc='best')
        plt.title('Rolling Mean & Standard deviation')
        plt.show()

        # Augmented Dickey-Fuller test; the null hypothesis is non-stationarity.
        print('Results of Dickey-Fuller Test:')
        dftest = adfuller(timeseries, autolag='AIC')
        dfoutput = pd.Series(dftest[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
        for key, value in dftest[4].items():
            dfoutput['Critical Value (%s)' % key] = value
        print(dfoutput)

    test_stationarity(ts)


trend('Data_Analysis_Sample.csv')
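
The stationarity test above indicates whether the series drifts, but not in which direction. One simple way to read the direction, offered here only as a minimal sketch, is the sign of a least-squares line fitted to Y against elapsed time. The helper name `trend_slope`, the inline `SAMPLE` string, and the assumption that the file is comma-separated are all illustrative and not part of the original code; the 24 readings are the ones listed in the question.

```python
import io

import numpy as np
import pandas as pd

# The 24 readings from the question, written as comma-separated text
# (assumption: the real file uses commas as the delimiter).
SAMPLE = """Date,Y
12/16/2013 7:00,104052
12/16/2013 15:00,103213
12/16/2013 23:00,104724
12/17/2013 7:00,104257
12/17/2013 15:00,105565
12/17/2013 23:00,103970
12/18/2013 7:00,104026
12/18/2013 15:00,103532
12/18/2013 23:00,101313
12/19/2013 7:00,105233
12/19/2013 15:00,105864
12/19/2013 23:00,105621
12/20/2013 7:00,108011
12/20/2013 15:00,108263
12/20/2013 23:00,107320
12/21/2013 7:00,106211
12/21/2013 15:00,106315
12/21/2013 23:00,104821
12/22/2013 7:00,106312
12/22/2013 15:00,107649
12/22/2013 23:00,107690
12/23/2013 7:00,107274
12/23/2013 15:00,107298
12/23/2013 23:00,107059
"""


def trend_slope(series):
    """Hypothetical helper: least-squares slope of `series` in Y units per day."""
    # Elapsed time in days since the first observation.
    elapsed = (series.index - series.index[0]).total_seconds() / 86400.0
    x = np.asarray(elapsed, dtype=float)
    y = series.to_numpy(dtype=float)
    # Degree-1 polynomial fit returns (slope, intercept).
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


data = pd.read_csv(io.StringIO(SAMPLE), index_col="Date")
data.index = pd.to_datetime(data.index, format="%m/%d/%Y %H:%M")

slope = trend_slope(data["Y"])
if slope > 0:
    direction = "increasing"
elif slope < 0:
    direction = "decreasing"
else:
    direction = "flat"
print(direction)
```

The sign of the slope only describes the overall direction of the fitted line. With 24 points it says nothing about statistical significance, so it is best read alongside the rolling-mean plot rather than on its own.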

0 answers:

No answers