Question

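The game I'm talking about is similar to Gomoku, or to a "big", simplified version of tic-tac-toe. Basically, you have an 8x8 board, and the winner is the first player to get 4 in a line in a row or a column (no diagonals).

I have set up minimax with alpha-beta pruning. My problem is that I'm not sure how the returned value tells you which move to make, or rather, how to connect the value to a move.

Currently, I have considered returning a GameStateNode instead. The GameStateNode has these fields: char[][] (the current state of the board) and evaluationVal (the value of the current state when it is not a terminal node).

But I still can't think of a way to use the returned node to decide the best move.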

    // Alpha-Beta Pruning Search
    private static Node alphaBeta(Node initial, int depth) {
        Node bestMove = max(initial, depth, NEGATIVE_INFINITY, POSITIVE_INFINITY);
        return bestMove;
    }

    private static Node max(Node n, int depth, int alpha, int beta) {
        int value = NEGATIVE_INFINITY;
        Node currentBestMove = null;
        Node temp = null;

        // Terminal state
        if(n.fourInALine() != 0) {
            return n;
        }
        // Depth limit reached
        if(depth == 0) {
            return n;
        }

        ArrayList<Node> successors = n.generateSuccessors('X');
        // Iterate through all the successors, starting with best evaluationValues
        for(Node s : successors) {
            temp = min(s, depth - 1, alpha, beta);
            if(temp.evaluationVal > value) {
                value = temp.evaluationVal;
                currentBestMove = temp;
            }
            alpha = Math.max(alpha, value);
            if(alpha >= beta) {
                break;
            }
        }
        return currentBestMove;
    }

    // I have similar min method just with the correct comparison

Answer 1

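You cannot get the move from the returned bestMove, because that node represents the board position after depth moves have been played. If you compare bestMove's position with initial's, you will find several differences, and you won't be able to tell in which order the moves were made.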
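To make the point concrete, here is a small sketch that counts how many cells differ between two boards. The class `BoardDiffDemo` and its helpers are hypothetical and not part of the asker's `Node` class; the boards are plain `char[][]` arrays with `'-'` assumed to mean an empty cell.

```java
import java.util.Arrays;

final class BoardDiffDemo {
    // Counts the cells in which two equally sized boards differ.
    static int countDifferences(char[][] a, char[][] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                if (a[i][j] != b[i][j]) {
                    count++;
                }
            }
        }
        return count;
    }

    // Returns a copy of the board with the given piece placed at (row, col).
    static char[][] place(char[][] board, int row, int col, char piece) {
        char[][] copy = new char[board.length][];
        for (int i = 0; i < board.length; i++) {
            copy[i] = board[i].clone();
        }
        copy[row][col] = piece;
        return copy;
    }

    // Builds a size x size board filled with '-' (assumed empty marker).
    static char[][] emptyBoard(int size) {
        char[][] board = new char[size][size];
        for (char[] row : board) {
            Arrays.fill(row, '-');
        }
        return board;
    }

    public static void main(String[] args) {
        char[][] initial = emptyBoard(8);
        char[][] afterOne = place(initial, 3, 3, 'X');
        char[][] afterThree = place(place(afterOne, 3, 4, 'O'), 4, 3, 'X');
        System.out.println(countDifferences(initial, afterOne));   // prints 1
        System.out.println(countDifferences(initial, afterThree)); // prints 3
    }
}
```

After one move there is exactly one differing cell, so the move can be read off. After three moves there are three differing cells, two of them holding an 'X', and nothing in the board says which 'X' was played first.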

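To get your search code working, do the following:

1. Add a boolean isRoot parameter to max(), to tell the method whether it was called directly from alphaBeta(), in which case n is the root node of the search tree.
2. In max(), if isRoot is true, do not keep track of the best temp (the node returned from min()) in currentBestMove; keep track of the best s (the node from n.generateSuccessors()) instead.
3. In alphaBeta(), take bestMove (the node returned from max()) and compare its state array with that of initial. Find the coordinates of the slot where bestMove has an 'X' and initial does not.

That is the move to play.

Code: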

private static int[] alphaBeta(Node initial, int depth) {
    Node bestMove = max(initial, depth, NEGATIVE_INFINITY, POSITIVE_INFINITY, true);
    // The root's best successor differs from initial in exactly one slot.
    for(int i = 0; i < bestMove.state.length; i++) {
        for(int j = 0; j < bestMove.state[i].length; j++) {
            if(bestMove.state[i][j] != initial.state[i][j]) {
                return new int[] { i, j };
            }
        }
    }
    // No difference: max() returned initial itself (terminal state or depth 0).
    return null;
}

private static Node max(Node n, int depth, int alpha, int beta, boolean isRoot) {
    int value = NEGATIVE_INFINITY;
    Node currentBestMove = null;
    Node temp = null;

    // Terminal state
    if(n.fourInALine() != 0) {
        return n;
    }

    // Depth limit reached
    if(depth == 0) {
        return n;
    }

    ArrayList<Node> successors = n.generateSuccessors('X');
    // Iterate through all the successors, starting with best evaluationValues
    for(Node s : successors) {
        temp = min(s, depth - 1, alpha, beta);
        if(temp.evaluationVal > value) {
            value = temp.evaluationVal;
            currentBestMove = isRoot ? s : temp;
        }
        alpha = Math.max(alpha, value);
        if(alpha >= beta) {
            break;
        }
    }
    return currentBestMove;
}

// I have a similar min() method with the opposite comparison,
// and without an isRoot argument.

Note that none of this has been tested.
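For comparison, a common alternative to diffing nodes is to let the recursive search return only a score and to remember the move in the root loop. The following is a minimal, self-contained sketch of that idea, not the code above: the class `RootMoveSearch`, the methods `bestMoveForX`, `search`, `evaluate`, and `winner`, the `'-'` empty marker, and the win-only evaluation are all assumptions made for illustration.

```java
import java.util.Arrays;

final class RootMoveSearch {
    static final int SIZE = 8, LINE = 4, WIN = 1000;
    static final char EMPTY = '-';

    static char[][] emptyBoard() {
        char[][] b = new char[SIZE][SIZE];
        for (char[] row : b) Arrays.fill(row, EMPTY);
        return b;
    }

    // Returns 'X' or 'O' if that player has LINE in a row or column, else EMPTY.
    static char winner(char[][] b) {
        for (int i = 0; i < SIZE; i++) {
            for (int j = 0; j + LINE <= SIZE; j++) {
                if (b[i][j] != EMPTY && sameRun(b, i, j, 0, 1)) return b[i][j]; // row i
                if (b[j][i] != EMPTY && sameRun(b, j, i, 1, 0)) return b[j][i]; // column i
            }
        }
        return EMPTY;
    }

    static boolean sameRun(char[][] b, int r, int c, int dr, int dc) {
        for (int k = 1; k < LINE; k++) {
            if (b[r + k * dr][c + k * dc] != b[r][c]) return false;
        }
        return true;
    }

    // Score from X's point of view; a placeholder for a real heuristic.
    static int evaluate(char[][] b) {
        char w = winner(b);
        return w == 'X' ? WIN : w == 'O' ? -WIN : 0;
    }

    // Alpha-beta that returns only a score, never a node or a move.
    static int search(char[][] b, int depth, int alpha, int beta, boolean xToMove) {
        if (depth == 0 || winner(b) != EMPTY) return evaluate(b);
        int best = xToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        boolean moved = false;
        for (int i = 0; i < SIZE; i++) {
            for (int j = 0; j < SIZE; j++) {
                if (b[i][j] != EMPTY) continue;
                moved = true;
                b[i][j] = xToMove ? 'X' : 'O';   // make the move
                int score = search(b, depth - 1, alpha, beta, !xToMove);
                b[i][j] = EMPTY;                 // undo it
                if (xToMove) {
                    best = Math.max(best, score);
                    alpha = Math.max(alpha, best);
                } else {
                    best = Math.min(best, score);
                    beta = Math.min(beta, best);
                }
                if (alpha >= beta) return best;  // prune the remaining moves
            }
        }
        return moved ? best : evaluate(b);       // full board: score it as is
    }

    // Root loop for X: the only place that remembers which move gave the best score.
    static int[] bestMoveForX(char[][] b, int depth) {
        int[] bestMove = null;
        int bestScore = Integer.MIN_VALUE;
        int alpha = Integer.MIN_VALUE;
        for (int i = 0; i < SIZE; i++) {
            for (int j = 0; j < SIZE; j++) {
                if (b[i][j] != EMPTY) continue;
                b[i][j] = 'X';
                int score = search(b, depth - 1, alpha, Integer.MAX_VALUE, false);
                b[i][j] = EMPTY;
                if (bestMove == null || score > bestScore) {
                    bestScore = score;
                    bestMove = new int[] { i, j };
                }
                alpha = Math.max(alpha, bestScore);
            }
        }
        return bestMove; // null only when the board has no empty cell
    }

    public static void main(String[] args) {
        char[][] board = emptyBoard();
        board[2][1] = 'X'; board[2][2] = 'X'; board[2][3] = 'X';
        System.out.println(Arrays.toString(bestMoveForX(board, 2))); // prints [2, 0]
    }
}
```

Because the move is recorded where it is made, no board comparison is needed afterwards, and the recursive part stays a plain score-returning alpha-beta.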

How do I make a move on the game board based on the value returned by minimax with alpha-beta pruning?
