How can I efficiently select multiple columns from a DataFrame?

Asked: 2017-09-21 09:01:05

Tags: pandas

I am writing something like Q-Learning on top of a DataFrame of transitions. The data contains the following columns:

date, begin_time, begin_grid, end_time, end_grid, reward, [feature_columns]

I am using the following code for mini-batch training:

for i in xrange(num_iter):

    idx = np.random.choice(N, batch_size)
    now_data = train_data.loc[idx]

    predicted_value = []
    for index, row in now_data.iterrows():
        action_state = row[["date", "end_time", "end_grid"]]
        #Can the next line be quicker?
        end_state_frame = train_data[(train_data["date"] == action_state["date"]) & (train_data["begin_grid"] == action_state["end_grid"]) & (train_data["begin_time"] == action_state["end_time"])]

        if len(end_state_frame) == 0:
            predicted_value.append(0.0)
        else:
            end_pred_values = model.predict(end_state_frame[feature_columns]).flatten()
            predicted_value.append(np.mean(end_pred_values))

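In this code, I repeatedly filter the whole DataFrame on "date", "begin_time" and "begin_grid", once for every row of every mini-batch. The code is currently too slow to actually train the model. Is there anything I can do to speed this up (perhaps by setting an index or by grouping)?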
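To make the grouping idea concrete, here is a minimal sketch of one possible direction, not a tested solution for the real data. It assumes the model can be wrapped in a callable (the hypothetical `predict` argument) that takes the feature columns and returns one value per row. The helper name `next_state_values` and the column names `pred` and `next_value` are made up for illustration. Instead of filtering per row, it predicts once, averages the predictions per (date, begin_time, begin_grid) with groupby, and then looks up every batch row with a single left merge.

```python
import numpy as np
import pandas as pd


def next_state_values(batch, train_data, feature_columns, predict):
    """Mean predicted value of the successor rows for each row of `batch`.

    `predict` is a hypothetical callable standing in for model.predict.
    Batch rows without any successor row get 0.0, as in the loop above.
    """
    # One prediction call over all candidate successor rows,
    # instead of one call per batch row.
    preds = np.asarray(predict(train_data[feature_columns])).ravel()

    # Average the predictions per starting state.
    start_keys = ["date", "begin_time", "begin_grid"]
    mean_by_state = (
        train_data[start_keys]
        .assign(pred=preds)
        .groupby(start_keys)["pred"]
        .mean()
        .reset_index()
    )

    # Rename the start keys so they line up with the batch's end keys.
    lookup = mean_by_state.rename(
        columns={
            "begin_time": "end_time",
            "begin_grid": "end_grid",
            "pred": "next_value",
        }
    )

    # A left merge keeps one output row per batch row, in batch order.
    end_keys = ["date", "end_time", "end_grid"]
    merged = batch[end_keys].merge(lookup, on=end_keys, how="left")

    # No matching successor rows -> 0.0.
    return merged["next_value"].fillna(0.0).to_numpy()
```

Inside the training loop this would replace the inner `iterrows` loop with something like `predicted_value = next_state_values(now_data, train_data, feature_columns, model.predict)`. Predicting over all of `train_data` on every iteration may itself be costly when N is large; in that case, restricting the prediction to rows whose start keys actually occur among the batch's end keys might be a better trade-off.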

Thanks!

0 answers:

No answers