Question

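I'm currently working on a checkers game against an AI in C#. I tried to implement the AI using the minimax algorithm. My function runs, but the moves it chooses are not logical at all. I tested it with many games, and the algorithm simply picks bad moves when much better options are available. I don't think this is caused by the horizon problem, because the moves it makes have immediate consequences, such as losing a piece without capturing any of the opponent's pieces. Some notes about the code: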

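My function takes an 8x8 2D array of the enum Pieces to represent the board.
BlackPlayer is a bool value in the same class as the function.
The MyPiece(currentPiece) function checks whether currentPiece has the same color as the AI.
Since captures are mandatory in checkers, the function first checks whether gameState contains any capture moves. If not, it checks for normal moves.
I used alpha-beta pruning to make it more efficient.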

I used the CloneGameState(gameState) function to copy the 2D array so that the original array representing the game never changes.

public int Minimax (Pieces[,] gameState, int depth, bool is_maximizing, int alpha, int beta)
{
    //Base Case - Return the board value 
    if (depth == 3)
        return HeuristicEvaluation(gameState);

    Move[] possibleMoves;
    int bestValue;
    bool currentSide;

    if (is_maximizing)
    {
        bestValue = int.MinValue;
        currentSide = BlackPlayer;
    }
    else
    {
        bestValue = int.MaxValue;
        currentSide = !BlackPlayer;
    }

    // check forced moves
    int moveCount = rules.GetCaptureMoves(gameState,out possibleMoves, currentSide);
    // if no forced moves get normal moves 
    if (moveCount < 1)
        moveCount = rules.GetNormalMoves(gameState,out possibleMoves, currentSide);

    // traverse moves
    for (int i = 0; i < moveCount; i++)
    {
        Pieces[,] newGameState = ApplyMove(CloneGameState(gameState), possibleMoves[i]);
        int newStateValue = Minimax(newGameState, depth + 1, !is_maximizing,alpha, beta);

        if (is_maximizing)
        {
            if (newStateValue > bestValue)
            {
                bestValue = newStateValue;
                if (depth == 0)
                    bestMove = possibleMoves[i];
                if (newStateValue > alpha)
                    alpha = newStateValue;
                if (alpha >= beta)
                    return bestValue;
            }
        }
        //Evaluation for min
        else
        {
            if (newStateValue < bestValue)
            {
                bestValue = newStateValue;
                if (newStateValue < beta)
                    beta = newStateValue;
                if (alpha >= beta)
                    return bestValue;
            }
        }
    }
    return bestValue;
}

The heuristic function:

public int HeuristicEvaluation(Pieces[,] gameState)
{
    int stateValue = 0;

    //use loops to check each piece 
    for (int j = 0; j < 8; j++)
    {
        int i = 0;
        if (j % 2 == 1)
            i++;

        for (; i < 8; i += 2)
        {
            Pieces currentPiece = gameState[i, j];

            if (currentPiece != Pieces.empty)
            {

                // if the current piece is mine
                if (MyPiece(currentPiece))
                {
                    // check if my piece is a king
                    if (currentPiece == Pieces.whiteKing || currentPiece == Pieces.blackKing)
                        stateValue += 80;
                    // my piece is a man
                    else
                    {
                        stateValue += 30;
                        // row values, closer to king zone higher the value 
                        if (currentPiece == Pieces.blackMan)
                        {
                            // black goes in reverse direction
                             int y = 7-j;
                             stateValue += y;
                        }
                        else
                             stateValue += j; 
                    }
                    // pieces on the edge are safe from capture
                    if (i == 0 ||i == 7 || j== 0 ||j ==7)
                    {
                        stateValue += 10;
                    }

                }

                // point reduction for enemy pieces
                else
                {
                    if (currentPiece == Pieces.whiteKing || currentPiece == Pieces.blackKing)
                        stateValue -= 80;
                    else
                    {
                        stateValue -= 20;

                        // row values, closer to king zone higher the value 
                        if (currentPiece == Pieces.blackMan )
                        {
                            // black goes in reverse direction
                            int y = 7-j;
                            stateValue -= y;
                        }
                        else
                            stateValue -= j;
                    }
                    // pieces on the edge cant be captured
                    if (i == 0 || i == 7 || j == 0 || j == 7)
                    {
                        stateValue -= 10;
                    }
                }
            }
        }
    }
    return stateValue;
}

Answer 1

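First, I'd like to point out that your functions Maximizer and Minimizer can be combined into a single function Minimax(Pieces[,] gameState, int depth, bool is_maximizing), because their logic is almost identical apart from a few lines of code. So instead of calling Maximizer, call Minimax with is_maximizing set to true, and instead of calling Minimizer, call Minimax with is_maximizing set to false. This avoids duplication and makes your code more readable.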

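This first point leads us to a mistake in the algorithm. In the Minimizer function you make a recursive call to Minimizer itself, whereas you should call the Maximizer function.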

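Another point is the way you handle all valid moves in a given position. You don't have to separate the handling of capture moves from the handling of non-capture moves. The reason, again, is that the logic for processing both types of moves is the same. I suggest creating two functions: GenerateValidMoves() and SortValidMoves(). GenerateValidMoves() generates a list of all valid moves in the given position. Once the move list has been generated, call SortValidMoves() to sort it so that capture moves come first, followed by non-capture moves.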
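The ordering step described above could be sketched roughly as follows. This is only an illustration: the question does not show its Move type, so the Move class and its IsCapture flag below are assumptions, and only SortValidMoves() is covered. If captures are mandatory, as the question notes, the move generator would additionally have to drop non-capture moves whenever a capture exists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical move type: the question's Move class is not shown, so the
// Name and IsCapture members are assumptions made for illustration.
public sealed class Move
{
    public string Name { get; }
    public bool IsCapture { get; }

    public Move(string name, bool isCapture)
    {
        Name = name;
        IsCapture = isCapture;
    }
}

public static class MoveOrdering
{
    // Returns a new list with capture moves first, then non-capture moves.
    // OrderByDescending is a stable sort, so moves keep their relative
    // order inside each group.
    public static List<Move> SortValidMoves(IEnumerable<Move> moves)
    {
        return moves.OrderByDescending(m => m.IsCapture).ToList();
    }
}
```

Searching captures first also tends to help alpha-beta pruning, since strong moves examined early allow more cutoffs.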

Here is simplified pseudocode for minimax:

Minimax(color, board, depth, is_max):
    # stop at the depth limit or when the game is over
    if depth == DEPTH_CUTOFF or IsTerminalNode(board):
        return EvalBoard(board)
    best_score = -infinity if is_max else +infinity
    valid_moves = GenerateValidMoves(board, color)
    for curr_move in valid_moves:
        clone_board = board.clone()
        clone_board.make_move(curr_move)
        # the opponent moves next, so both color and is_max flip
        curr_score = Minimax(opposite_color, clone_board, depth + 1, not is_max)
        if is_max:
            if curr_score > best_score:
                best_score = curr_score
                if depth == 0:    # record the move at the root only
                    best_move = curr_move
        else:
            if curr_score < best_score:
                best_score = curr_score
                if depth == 0:    # record the move at the root only
                    best_move = curr_move
    return best_score
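
For comparison, here is a minimal runnable C# sketch of the same idea with alpha-beta pruning, run on a small hand-built game tree rather than a checkers board. The Node and Search types and the BestRootMove helper are hypothetical names introduced only for this example; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical game-tree node used only for this sketch: a leaf carries a
// heuristic value, an inner node carries its child positions.
public sealed class Node
{
    public int Value { get; }
    public List<Node> Children { get; }

    public Node(int value) { Value = value; Children = new List<Node>(); }
    public Node(params Node[] children) { Value = 0; Children = new List<Node>(children); }
}

public static class Search
{
    // Minimax with alpha-beta pruning. Scores are always expressed from
    // the maximizing player's point of view.
    public static int Minimax(Node node, bool isMax, int alpha, int beta)
    {
        if (node.Children.Count == 0)
            return node.Value;

        int best = isMax ? int.MinValue : int.MaxValue;
        foreach (Node child in node.Children)
        {
            // The side to move alternates on every ply.
            int score = Minimax(child, !isMax, alpha, beta);
            if (isMax)
            {
                best = Math.Max(best, score);
                alpha = Math.Max(alpha, best);
            }
            else
            {
                best = Math.Min(best, score);
                beta = Math.Min(beta, best);
            }
            if (alpha >= beta)
                break; // remaining siblings cannot change the result
        }
        return best;
    }

    // Chooses the best child index at the root only, so deeper recursive
    // calls never overwrite the selected move.
    public static int BestRootMove(Node root)
    {
        int bestIndex = -1;
        int bestScore = int.MinValue;
        int alpha = int.MinValue;
        for (int i = 0; i < root.Children.Count; i++)
        {
            int score = Minimax(root.Children[i], false, alpha, int.MaxValue);
            if (bestIndex < 0 || score > bestScore)
            {
                bestScore = score;
                bestIndex = i;
            }
            alpha = Math.Max(alpha, bestScore);
        }
        return bestIndex;
    }
}
```

For a root whose three replies lead to the leaf pairs (3, 5), (2, 9) and (6, 7), the minimizing opponent yields 3, 2 and 6 respectively, so Search.BestRootMove returns index 2 and Search.Minimax on the root returns 6.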

Minimax algorithm not working as expected
