I have written something like Q-learning on a DataFrame of transitions. The data contains the following columns:
date, begin_time, begin_grid, end_time, end_grid, reward, [feature_columns]
I am using the following code for mini-batch training:
for i in range(num_iter):
    idx = np.random.choice(N, batch_size)
    now_data = train_data.loc[idx]
    predicted_value = []
    for index, row in now_data.iterrows():
        action_state = row[["date", "end_time", "end_grid"]]
        # Can the next line be quicker?
        end_state_frame = train_data[
            (train_data["date"] == action_state["date"])
            & (train_data["begin_grid"] == action_state["end_grid"])
            & (train_data["begin_time"] == action_state["end_time"])
        ]
        if len(end_state_frame) == 0:
            predicted_value.append(0.0)
        else:
            end_pred_values = model.predict(end_state_frame[feature_columns]).flatten()
            predicted_value.append(np.mean(end_pred_values))
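In this code I repeatedly look up rows by "date", "begin_time", and "begin_grid". It is currently too slow to actually train the model. Is there anything I can do to speed this up, perhaps by setting an index or by grouping?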
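For reference, here is a rough sketch of the grouping idea I have in mind. I have not profiled it. It assumes the batch indices are row positions (not labels), that the features are kept in a NumPy array, and that `predict` behaves like `model.predict`. The helper names `build_lookup` and `batch_next_state_values` are made up for illustration. The grouping is done once up front, so each successor lookup becomes a dictionary access, and the model is called once per batch instead of once per row.

```python
import numpy as np
import pandas as pd

KEYS = ["date", "begin_time", "begin_grid"]


def build_lookup(train_data):
    """Map each (date, begin_time, begin_grid) key to the integer row
    positions that share it. Built once, before the training loop."""
    return train_data.groupby(KEYS).indices


def batch_next_state_values(train_data, lookup, features, batch_idx, predict):
    """For each sampled transition, return the mean predicted value over
    its successor rows, or 0.0 when no successor row exists.

    features  -- 2-D NumPy array aligned with train_data's row order (assumption)
    batch_idx -- integer row positions of the sampled transitions (assumption)
    predict   -- callable standing in for model.predict (assumption)
    """
    batch = train_data.iloc[batch_idx]
    # A transition's successor state is keyed by (date, end_time, end_grid).
    keys = zip(batch["date"], batch["end_time"], batch["end_grid"])
    empty = np.empty(0, dtype=np.intp)
    hits = [lookup.get(k, empty) for k in keys]

    counts = np.array([len(h) for h in hits])
    values = np.zeros(len(hits))
    if counts.sum() == 0:
        return values

    # One predict call for all successor rows of the whole batch.
    positions = np.concatenate(hits)
    preds = np.asarray(predict(features[positions])).ravel()

    # owner[j] is the batch row that prediction j belongs to.
    owner = np.repeat(np.arange(len(hits)), counts)
    sums = np.bincount(owner, weights=preds, minlength=len(hits))
    has_next = counts > 0
    values[has_next] = sums[has_next] / counts[has_next]
    return values
```

Inside the training loop this would be called roughly as `batch_next_state_values(train_data, lookup, features, idx, model.predict)`, with `lookup = build_lookup(train_data)` and `features = train_data[feature_columns].to_numpy()` computed once beforehand.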
Thanks!