I have a DataFrame that looks like this:
Station A B C
Date
2013-01-31 1340381 1568766 910785
2013-02-28 1261806 1447467 843956
2013-03-31 1399123 1579597 926968
2013-04-30 1395016 1618159 950947
2013-05-31 1340408 1654265 988293
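It holds the total number of passengers per station for each month. How can I work out which station grew the fastest (largest increase in passengers) in 2013?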
Answer 0 (score: 0)
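You can fit a linear regression to each station and use the slope to estimate the growth over one year. Note that because you do not have a full year of data, the estimate will be biased by seasonal variation.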
import numpy as np
import pandas as pd
from scipy import stats
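# Synthetic example data: three lines growing linearly at different rates
df = pd.DataFrame()
df['A'] = np.linspace(1000, 2000, 6)
df['B'] = np.linspace(1000, 3000, 6)
df['C'] = np.linspace(1000, 4000, 6)
# Month-end dates; the offset object avoids the version-dependent 'M'/'ME' alias
df.index = pd.date_range('2015-01-01', periods=6, freq=pd.offsets.MonthEnd(), name='Date')
print(df)
# Days elapsed since 2015-01-01, used as the regression x-axis
t = (df.index - pd.Timestamp('2015-01-01')).days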
# Fit a line per column; the slope is in units per day, so scale by 365 for a year
for line in ['A', 'B', 'C']:
    slope, intercept, r_value, p_value, std_err = stats.linregress(t, df[line])
    print('\nEstimated growth in one year of line ' + line + ': ' + str(slope*365))
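The same idea can be applied directly to the five months shown in the question. The following is only a sketch: it swaps `scipy.stats.linregress` for `np.polyfit` (an equivalent least-squares line fit), and the names `days`, `yearly_growth`, `relative_growth`, `fastest_absolute` and `fastest_relative` are illustrative, not part of the original answer. It also assumes that "fastest growth" could mean either the largest absolute trend or the largest trend relative to a station's average volume, so it computes both.

```python
import numpy as np
import pandas as pd

# Monthly passenger totals copied from the question.
df = pd.DataFrame(
    {
        "A": [1340381, 1261806, 1399123, 1395016, 1340408],
        "B": [1568766, 1447467, 1579597, 1618159, 1654265],
        "C": [910785, 843956, 926968, 950947, 988293],
    },
    index=pd.to_datetime(
        ["2013-01-31", "2013-02-28", "2013-03-31", "2013-04-30", "2013-05-31"]
    ),
)
df.index.name = "Date"

# Days elapsed since the first observation, used as the regression x-axis.
days = np.asarray((df.index - df.index[0]).days, dtype=float)

# Least-squares slope (passengers per day) for each station, scaled to a year.
yearly_growth = pd.Series(
    {
        col: np.polyfit(days, df[col].to_numpy(dtype=float), 1)[0] * 365
        for col in df.columns
    }
)

# Assumption: relative growth = yearly trend divided by the mean monthly total.
relative_growth = yearly_growth / df.mean()

fastest_absolute = yearly_growth.idxmax()
fastest_relative = relative_growth.idxmax()

print(yearly_growth)
print(relative_growth)
print("Largest absolute trend:", fastest_absolute)
print("Largest relative trend:", fastest_relative)
```

The absolute and relative rankings need not agree, since a smaller station can grow faster in proportion to its size. With only five months of data, both figures remain subject to the seasonal bias mentioned above.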