Time series forecasting with SVR in scikit-learn

Time: 2016-12-24 06:43:54

Tags: python pandas scikit-learn time-series svm

I have a daily temperature dataset indexed by date, and I need to predict future temperatures using [SVR] [1] in scikit-learn.

I'm stuck on choosing the X and Y for the training set and the X for the test set. For example, if I want to predict Y at time t, then the training set needs to contain X and Y at t-1, t-2, ..., t-N, where N is the number of previous days used to predict Y at t.

How can I do that?

Here is my code:

from sklearn.svm import SVR

df = daily_temp1

# define function to create N lags
def create_lags(df, N):
    for i in range(N):
        df['datetime' + str(i+1)] = df.datetime.shift(i+1)
        df['dewpoint' + str(i+1)] = df.dewpoint.shift(i+1)
        df['humidity' + str(i+1)] = df.humidity.shift(i+1)
        df['pressure' + str(i+1)] = df.pressure.shift(i+1)
        df['temperature' + str(i+1)] = df.temperature.shift(i+1)
        df['vism' + str(i+1)] = df.vism.shift(i+1)
        df['wind_direcd' + str(i+1)] = df.wind_direcd.shift(i+1)
        df['wind_speed' + str(i+1)] = df.wind_speed.shift(i+1)
        df['wind_direct' + str(i+1)] = df.wind_direct.shift(i+1)
    return df

# create 10 lags
df = create_lags(df, 10)

# the first 10 days will have missing values. can't use them.
df = df.dropna()

# create X and y
y = df['temperature']
X = df.iloc[:, 9:]

# Train on 70% of the data
train_idx = int(len(df) * .7)

# create train and test data
X_train, y_train, X_test, y_test = X[:train_idx], y[:train_idx], X[train_idx:], y[train_idx:]

# fit and predict
clf = SVR()
clf.fit(X_train, y_train)
clf.predict(X_test)

1 answer:

Answer 0 (score: 1)

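Here is a solution that builds the feature matrix X from simple lags Lag1 through LagN, where Lag1 is the previous day's temperature and LagN is the temperature N days earlier.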

import numpy as np
import pandas as pd
from sklearn.svm import SVR

# create fake temperature data
df = pd.DataFrame({'temp': np.random.rand(500)})

# define a function that adds N lag columns
def create_lags(df, N):
    for i in range(N):
        df['Lag' + str(i+1)] = df.temp.shift(i+1)
    return df

# create 10 lags
df = create_lags(df,10)

# the first 10 days will have missing values. can't use them.
df = df.dropna()

# create X and y
y = df.temp.values
X = df.iloc[:, 1:].values

# Train on 70% of the data
train_idx = int(len(df) * .7)

# create train and test data (chronological split; y_test covers the same rows as X_test)
X_train, y_train, X_test, y_test = X[:train_idx], y[:train_idx], X[train_idx:], y[train_idx:]

# fit and predict
clf = SVR()
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
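
The lag approach above can also be extended to score the predictions and forecast one day ahead. The following is only a minimal sketch, not part of the original answer. It assumes a synthetic sinusoidal series in place of real temperatures, and the helper name `make_lag_frame`, the `StandardScaler` pipeline, and the choice of mean absolute error as the metric are all illustrative assumptions. Scaling is included because SVR is sensitive to the scale of its input features.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_lag_frame(series, n_lags):
    """Hypothetical helper: target column 'temp' plus columns Lag1..LagN."""
    frame = pd.DataFrame({"temp": series})
    for i in range(1, n_lags + 1):
        frame[f"Lag{i}"] = frame["temp"].shift(i)
    # The first n_lags rows have no complete history, so drop them.
    return frame.dropna()


# Synthetic stand-in for daily temperatures (an assumption, not real data).
rng = np.random.default_rng(0)
days = np.arange(500)
temps = pd.Series(15 + 10 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 1, 500))

n_lags = 10
frame = make_lag_frame(temps, n_lags)
X = frame.drop(columns="temp").to_numpy()
y = frame["temp"].to_numpy()

# Chronological split: train on the first 70%, test on the rest, no shuffling.
split = int(len(frame) * 0.7)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardize the lag features, then fit SVR with its default parameters.
model = make_pipeline(StandardScaler(), SVR())
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: {mae:.2f}")

# One-step-ahead forecast: the newest observation goes first so the
# feature order matches Lag1..LagN.
latest = temps.to_numpy()[-n_lags:][::-1].reshape(1, -1)
next_day = model.predict(latest)[0]
print(f"Forecast for the next day: {next_day:.2f}")
```

Forecasting further than one day ahead with this setup would mean feeding each prediction back in as the new Lag1, so errors can accumulate with every step.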