The dataset contains:
timeslot Weather Location
2014-10-26 00:00 35 1
2014-10-26 06:00 36 1
2014-10-26 12:00 34 1
2014-10-26 18:00 34 1
2014-10-27 00:00 35 1
2014-10-27 06:00 36 1
2014-10-27 12:00 36 1
2014-10-27 18:00 32 1
2014-10-28 00:00 35 1
2014-10-28 06:00 33 1
2014-10-28 12:00 35 1
2014-10-28 18:00 33 1
2014-10-26 00:00 45 2
2014-10-26 06:00 46 2
2014-10-26 12:00 41 2
2014-10-26 18:00 39 2
2014-10-27 00:00 46 2
2014-10-27 06:00 44 2
2014-10-27 12:00 45 2
2014-10-27 18:00 42 2
2014-10-28 00:00 41 2
2014-10-28 06:00 40 2
2014-10-28 12:00 42 2
2014-10-28 18:00 41 2
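The dataset contains one year of weather reports for 300 locations. I need to predict the next 5 days of weather for each location. To do this, I tried training on the data in a for loop: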
from keras.models import Sequential
from keras.layers import LSTM, Dense

location_list = data['Location'].unique()
for location in location_list:
    # Select the rows for this location; copy to avoid modifying a view of data
    location_data = data[data.Location == location].copy()
    # Target is the reading in the next timeslot
    location_data['next_weather'] = location_data['Weather'].shift(-1)
    location_data = location_data.ffill()
    encoded = onehotencoding(location_data['Weather'])
    encoded1 = onehotencoding(location_data['next_weather'])
    X_train = encoded[:-5, :]
    y_train = encoded1[:-5, :]
    # Reshape to (samples, time_steps, features) as the LSTM layer expects
    X_train = X_train.reshape(X_train.shape[0], 1, X_train.shape[1])
    time_steps = 1
    data_dim = X_train.shape[2]
    # LSTM
    model = Sequential()
    model.add(LSTM(data_dim, input_shape=(time_steps, data_dim), activation='relu'))
    model.add(Dense(data_dim))
    model.compile(loss='mse', optimizer='adam')
    model.fit(X_train, y_train, epochs=100, batch_size=96)
    model.summary()
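As a side illustration of what "the next 5 days" means for this data: the sample has four readings per day, so 5 days would correspond to 20 timeslots, assuming that spacing holds for the whole year. The following is only a minimal sketch in plain Python of how a series could be split into input and target windows for multi-step prediction; `make_windows`, `READINGS_PER_DAY` and `HORIZON` are hypothetical names, not part of the code above.

```python
# Assumption: 4 readings per day (00:00, 06:00, 12:00, 18:00), as in the sample.
READINGS_PER_DAY = 4
HORIZON = 5 * READINGS_PER_DAY  # 5 days -> 20 timeslots


def make_windows(series, n_in, n_out):
    """Split a 1-D sequence into (input window, target window) pairs."""
    X, y = [], []
    for start in range(len(series) - n_in - n_out + 1):
        X.append(series[start:start + n_in])
        y.append(series[start + n_in:start + n_in + n_out])
    return X, y


# Location 1 readings from the sample table.
location_1 = [35, 36, 34, 34, 35, 36, 36, 32, 35, 33, 35, 33]

# The sample has only 12 readings, so a short horizon is used here for
# illustration; with a full year per location, n_out=HORIZON would apply.
X, y = make_windows(location_1, n_in=4, n_out=4)
print(len(X))      # number of (input, target) pairs
print(X[0], y[0])  # first input window and its target window
```

With windows like these, each training sample would pair several past readings with several future ones, rather than a single timeslot with the one that follows it.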
For this data a min-max scaler could be used, but I want to use one-hot encoding to predict the weather for the next 5 days.
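from numpy import array
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

def onehotencoding(data):
    values = array(data)
    # Map each distinct value to an integer label
    label_encoder = LabelEncoder()
    integer_encoded = label_encoder.fit_transform(values)
    # On scikit-learn >= 1.2 this argument is named sparse_output instead of sparse
    onehot_encoder = OneHotEncoder(sparse=False)
    integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
    onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
    return onehot_encoded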
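To make the difference between the two encodings concrete, here is a minimal sketch in plain Python (not scikit-learn) applied to a few readings from the sample. `min_max_scale` and `one_hot` are hypothetical helpers written only for this illustration. One point it is meant to show: if inputs and targets are encoded against a single shared category list, both end up with the same number of columns, which may not hold when an encoder is fitted separately on each column.

```python
def min_max_scale(values):
    """Scale values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values, categories):
    """One-hot encode values against a fixed, shared category list."""
    index = {c: i for i, c in enumerate(categories)}
    width = len(categories)
    return [[1 if index[v] == i else 0 for i in range(width)] for v in values]


# First eight readings for location 1 from the sample table.
weather = [35, 36, 34, 34, 35, 36, 36, 32]
# Shift by one step and forward-fill the last slot, mirroring shift(-1) + ffill.
next_weather = weather[1:] + weather[-1:]

# A single category list shared by inputs and targets.
categories = sorted(set(weather) | set(next_weather))
x_encoded = one_hot(weather, categories)
y_encoded = one_hot(next_weather, categories)
scaled = min_max_scale(weather)

print(categories)
print(x_encoded[0], y_encoded[0])
print(scaled[0])
```

Min-max scaling keeps the ordering of the readings in a single numeric feature, while one-hot encoding treats each distinct reading as an unrelated class.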
Is there any other way to predict the weather for the next few days?