Question

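I have implemented an alpha-beta search with a transposition table.

Do I have the right idea about storing cutoff values in the table?

Specifically, is my scheme for returning cutoffs on a table hit correct? (And likewise for storing them.) My implementation seems to conflict with this one, but it intuitively looks right to me.

Also, my algorithm never stores an entry with the at_most flag. When should I store such entries?

Here is my (simplified) code, showing the main ideas: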

int ab(board *b, int alpha, int beta, int ply) {
    evaluation *stored = tt_get(b);
    if (entryExists(stored) && stored->depth >= ply) {
        if (stored->type == at_least) { // lower-bound
            if (stored->score >= beta) return beta;
        } else if (stored->type == at_most) { // upper bound
            if (stored->score <= alpha) return alpha;
        } else { // exact
            if (stored->score >= beta) return beta; // respect fail-hard cutoff
            if (stored->score < alpha) return alpha; // alpha cutoff
            return stored->score;
        }
    }   

    if (ply == 0) return quiesce(b, alpha, beta, ply);

    int num_children = 0;
    move chosen_move = no_move;
    move *moves = board_moves(b, &num_children);

    int localbest = NEG_INFINITY;
    for (int i = 0; i < num_children; i++) {
        apply(b, moves[i]);
        int score = -ab(b, -beta, -alpha, ply - 1);
        unapply(b, moves[i]);
        if (score >= beta) {
            tt_put(b, (evaluation){moves[i], score, at_least, ply});
            return beta; // fail-hard
        }
        if (score >= localbest) {
            localbest = score;
            chosen_move = moves[i];
            if (score > alpha) alpha = score;
        }
    }
    tt_put(b, (evaluation){chosen_move, alpha, exact, ply});
    return alpha;
}

Answer 1

"My implementation seems to conflict with this one"

The transposition table lookup code looks correct to me. It is roughly the same as the code on Wikipedia.

// Code on Wikipedia rewritten using your notation / variable names
if (entryExists(stored) && stored->depth >= ply)
{
  if (stored->type == at_least)
    alpha = max(alpha, stored->score);
  else if (stored->type == at_most)
    beta = min(beta, stored->score);
  else if (stored->type == exact)
    return stored->score;

  if (alpha >= beta)
    return stored->score;
}

This is equivalent to the following, where the check if (alpha >= beta) has been moved into each node type:

if (entryExists(stored) && stored->depth >= ply)
{
  if (stored->type == at_least)
  {
    alpha = max(alpha, stored->score);
    if (alpha >= beta)  return stored->score;
  }
  else if (stored->type == at_most)
  {
    beta = min(beta, stored->score);
    if (alpha >= beta)  return stored->score;
  }
  else if (stored->type == exact)
    return stored->score;
}

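Since alpha < beta on entry, max(alpha, stored->score) >= beta holds exactly when stored->score >= beta, and min(beta, stored->score) <= alpha holds exactly when stored->score <= alpha. So this can in turn be simplified to: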

if (entryExists(stored) && stored->depth >= ply)
{
  if (stored->type == at_least)
  {
    // if (max(alpha, stored->score) >= beta) ...
    if (stored->score >= beta)  return stored->score;
  }
  else if (stored->type == at_most)
  {
    // if (min(beta, stored->score) <= alpha) ...
    if (stored->score <= alpha)  return stored->score;
  }
  else if (stored->type == exact)
    return stored->score;
}
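The simplified lookup can be packaged as a small probe function. The following is only a sketch: the tt_entry struct, the bound_type enum, and the tt_probe name are assumptions made for illustration and are not part of the question's code. It returns the stored score (fail-soft style), as the snippet above does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; the question's real `evaluation`
   struct also carries a move. */
typedef enum { exact, at_least, at_most } bound_type;

typedef struct {
    int score;
    bound_type type;
    int depth;
} tt_entry;

/* Returns 1 and writes *out when the stored entry alone decides the node,
   otherwise returns 0 and the caller has to search normally. */
static int tt_probe(const tt_entry *e, int ply, int alpha, int beta, int *out) {
    assert(alpha < beta);                       /* the simplification relies on this */
    if (e == NULL || e->depth < ply) return 0;  /* missing or too shallow */
    switch (e->type) {
    case exact:
        *out = e->score;
        return 1;
    case at_least:                              /* lower bound: usable only for a beta cutoff */
        if (e->score >= beta) { *out = e->score; return 1; }
        return 0;
    case at_most:                               /* upper bound: usable only for an alpha cutoff */
        if (e->score <= alpha) { *out = e->score; return 1; }
        return 0;
    }
    return 0;
}
```

A fail-hard caller would clamp the value written to *out into the range [alpha, beta] before returning it, which is what the question's code does.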

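The remaining difference is that Wikipedia uses the fail-soft optimization, while your code is classic alpha-beta pruning (fail-hard). Fail-soft is a small improvement, because the returned value can be a tighter bound than alpha or beta, but it does not change the core of the algorithm.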
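To make the fail-hard / fail-soft difference concrete, here is a minimal sketch on a hypothetical four-leaf tree without a transposition table. The leaves array and the function names are assumptions made for illustration. Leaf scores are from the point of view of the side to move at the leaf.

```c
#include <assert.h>

#define NEG_INFINITY (-1000000)

/* Hypothetical complete binary tree of depth 2, described only by its leaves. */
static const int leaves[4] = {3, 5, -2, 9};

/* Fail-hard: the result is always clamped to [alpha, beta]. */
static int fail_hard(int first, int depth, int alpha, int beta) {
    assert(alpha < beta);
    if (depth == 0) return leaves[first];
    int half = 1 << (depth - 1);            /* leaves under each child */
    for (int i = 0; i < 2; i++) {
        int score = -fail_hard(first + i * half, depth - 1, -beta, -alpha);
        if (score >= beta) return beta;     /* cutoff reports beta itself */
        if (score > alpha) alpha = score;
    }
    return alpha;
}

/* Fail-soft: the result may fall outside [alpha, beta]. */
static int fail_soft(int first, int depth, int alpha, int beta) {
    assert(alpha < beta);
    if (depth == 0) return leaves[first];
    int half = 1 << (depth - 1);
    int best = NEG_INFINITY;
    for (int i = 0; i < 2; i++) {
        int score = -fail_soft(first + i * half, depth - 1, -beta, -alpha);
        if (score > best) best = score;
        if (best >= beta) return best;      /* cutoff reports the actual score */
        if (best > alpha) alpha = best;
    }
    return best;
}
```

With a full window both functions return the true value 3 for the root (first = 0, depth = 2). With the narrow window (4, 6) fail_hard returns 4 while fail_soft returns 3, and with (0, 2) fail_hard returns 2 while fail_soft returns 3. In both narrow cases the fail-soft result is the tighter bound.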

"My algorithm never stores an entry with the at_most flag. When should I store such entries?"

There is a bug in the way you store exact / at_most node types. Here you assume that the node is always of type exact:

tt_put(b, (evaluation){chosen_move, alpha, exact, ply});

In fact it can be an at_most node:

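// initial_alpha is the value of alpha on entry to ab(), saved before the move loop.
if (alpha <= initial_alpha)
{
  // No move raised alpha, so there is no best move and the score is only an upper bound.
  tt_put(b, (evaluation){no_move, initial_alpha, at_most, ply});
}
else
  tt_put(b, (evaluation){chosen_move, alpha, exact, ply});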
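Putting the lookup and the corrected store together, the sketch below runs a fail-hard search with all three flags on a toy game. Everything game-specific here is an assumption made for illustration: players alternately remove 1 to 3 stones and whoever takes the last stone wins, the table is a plain array indexed by the number of stones, and the depth field is omitted because the search always runs to the end of the game.

```c
#include <assert.h>

#define NEG_INFINITY (-1000)
#define POS_INFINITY 1000
#define MAX_STONES 64

typedef enum { empty, exact, at_least, at_most } bound_type;
typedef struct { int score; bound_type type; } tt_entry;

static tt_entry tt[MAX_STONES + 1];   /* zero-initialized: every slot starts empty */

static int ab(int stones, int alpha, int beta) {
    assert(alpha < beta && stones >= 0 && stones <= MAX_STONES);
    if (stones == 0) return -1;       /* the side to move has already lost */

    /* Lookup, fail-hard: bounds only cut off, exact scores are clamped. */
    const tt_entry *e = &tt[stones];
    if (e->type == at_least && e->score >= beta) return beta;
    if (e->type == at_most && e->score <= alpha) return alpha;
    if (e->type == exact) {
        if (e->score >= beta) return beta;
        if (e->score <= alpha) return alpha;
        return e->score;
    }

    int initial_alpha = alpha;        /* remembered to classify the result */
    for (int take = 1; take <= 3 && take <= stones; take++) {
        int score = -ab(stones - take, -beta, -alpha);
        if (score >= beta) {
            tt[stones] = (tt_entry){score, at_least};   /* lower bound */
            return beta;
        }
        if (score > alpha) alpha = score;
    }
    if (alpha <= initial_alpha)
        tt[stones] = (tt_entry){initial_alpha, at_most}; /* no move raised alpha */
    else
        tt[stones] = (tt_entry){alpha, exact};
    return alpha;
}
```

On a fresh table, ab(4, 0, 2) fails low: it returns 0 and stores an at_most entry with score 0 for four stones, which is a valid upper bound on the true value -1. A later full-window call such as ab(4, NEG_INFINITY, POS_INFINITY) cannot use that bound for a cutoff, searches again, and replaces it with the exact value.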
