Question

I understand that for tic-tac-toe with the minimax algorithm, the terminal cases are a win, a draw, or a loss, scored 10, 0, and -10 respectively.

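When someone wins, loses, or draws, the program terminates. Since Connect Four does not have three terminal states, but rather a scoreboard after the game ends, how do I determine its terminal cases from the evaluation function? How do I implement the evaluation function?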

#include <algorithm>  // std::max, std::min
using std::max;
using std::min;

// Assumed to be defined elsewhere in the program:
extern char player, opponent;        // the maximizer's and minimizer's marks
int evaluate(char board[3][3]);      // 10 if player won, -10 if opponent won, else 0
bool isMovesLeft(char board[3][3]);  // true while some cell is still '_'

int minimax(char board[3][3], int depth, bool isMax)
{
    int score = evaluate(board);

    // If Maximizer has won the game return his/her
    // evaluated score
    if (score == 10)
        return score;

    // If Minimizer has won the game return his/her
    // evaluated score
    if (score == -10)
        return score;

    // If there are no more moves and no winner then
    // it is a tie
    if (!isMovesLeft(board))
        return 0;

    // If this is the maximizer's move
    if (isMax)
    {
        int best = -1000;

        // Traverse all cells
        for (int i = 0; i < 3; i++)
        {
            for (int j = 0; j < 3; j++)
            {
                // Check if cell is empty
                if (board[i][j] == '_')
                {
                    // Make the move
                    board[i][j] = player;

                    // Call minimax recursively and choose
                    // the maximum value
                    best = max(best,
                               minimax(board, depth + 1, !isMax));

                    // Undo the move
                    board[i][j] = '_';
                }
            }
        }
        return best;
    }

    // If this is the minimizer's move
    else
    {
        int best = 1000;

        // Traverse all cells
        for (int i = 0; i < 3; i++)
        {
            for (int j = 0; j < 3; j++)
            {
                // Check if cell is empty
                if (board[i][j] == '_')
                {
                    // Make the move
                    board[i][j] = opponent;

                    // Call minimax recursively and choose
                    // the minimum value
                    best = min(best,
                               minimax(board, depth + 1, !isMax));

                    // Undo the move
                    board[i][j] = '_';
                }
            }
        }
        return best;
    }
}

But for Connect Four, how do I compute the evaluation function, and how do I define the terminal cases (other than the board being full)?

Answer 1

"Since Connect Four does not have three terminal states, but rather a scoreboard after the game ends,"

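First of all, your claim that Connect Four does not have three terminal states is wrong. It, too, can only end in a win, a loss, or a draw.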

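The problem is that Connect Four is complex enough that it is infeasible to search the tree all the way down to those terminal states. That is why, in games more complex than the most basic ones (such as tic-tac-toe), the search is given a predetermined depth, and all nodes at that depth are treated as terminal. This depth is usually determined by a time constraint within an iterative deepening framework.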
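A minimal sketch of the depth cutoff described above, assuming a 6x7 board stored as a 2D array with '_' for empty cells, 'x' as the maximizer and 'o' as the minimizer. The names hasFour, freeRow, isFull, heuristic and chooseMove, and the WIN constant, are illustrative choices, not part of any library, and heuristic is only a placeholder that returns 0. The terminal cases are still win, loss and draw; the one addition is that a node reached at depth 0 is scored as if it were terminal.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <chrono>

constexpr int ROWS = 6, COLS = 7;
constexpr int WIN = 1000000;  // assumed stand-in for "arbitrarily high"
using Board = std::array<std::array<char, COLS>, ROWS>;  // '_' = empty, row 0 = bottom

// True if piece p occupies four consecutive cells in any direction.
bool hasFour(const Board& b, char p)
{
    const int dr[] = {0, 1, 1, 1}, dc[] = {1, 0, 1, -1};
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            for (int d = 0; d < 4; ++d) {
                int k = 0;
                for (; k < 4; ++k) {
                    int rr = r + k * dr[d], cc = c + k * dc[d];
                    if (rr >= ROWS || cc < 0 || cc >= COLS || b[rr][cc] != p) break;
                }
                if (k == 4) return true;
            }
    return false;
}

// Lowest empty row in column c, or -1 if the column is full.
int freeRow(const Board& b, int c)
{
    assert(c >= 0 && c < COLS);
    for (int r = 0; r < ROWS; ++r)
        if (b[r][c] == '_') return r;
    return -1;
}

// True if no column can take another piece.
bool isFull(const Board& b)
{
    for (int c = 0; c < COLS; ++c)
        if (freeRow(b, c) >= 0) return false;
    return true;
}

// Placeholder for a real heuristic; scores every cutoff node as even.
int heuristic(const Board&) { return 0; }

// Depth-limited minimax: 'x' maximizes, 'o' minimizes.
int minimax(Board& b, int depth, bool isMax)
{
    if (hasFour(b, 'x')) return WIN;       // terminal: maximizer won
    if (hasFour(b, 'o')) return -WIN;      // terminal: minimizer won
    if (isFull(b)) return 0;               // terminal: draw
    if (depth == 0) return heuristic(b);   // cutoff: treated as terminal

    int best = isMax ? -WIN - 1 : WIN + 1;
    for (int c = 0; c < COLS; ++c) {
        int r = freeRow(b, c);
        if (r < 0) continue;
        b[r][c] = isMax ? 'x' : 'o';               // make the move
        int v = minimax(b, depth - 1, !isMax);
        b[r][c] = '_';                             // undo the move
        best = isMax ? std::max(best, v) : std::min(best, v);
    }
    return best;
}

// Iterative deepening for 'x': search to depth 1, 2, ... and keep the move
// from the deepest completed search. Returns -1 if no column is playable.
int chooseMove(Board& b, std::chrono::milliseconds budget, int maxDepth)
{
    const auto start = std::chrono::steady_clock::now();
    int bestCol = -1;
    for (int depth = 1; depth <= maxDepth; ++depth) {
        int bestVal = -WIN - 1, col = -1;
        for (int c = 0; c < COLS; ++c) {
            int r = freeRow(b, c);
            if (r < 0) continue;
            b[r][c] = 'x';
            int v = minimax(b, depth - 1, false);
            b[r][c] = '_';
            if (v > bestVal) { bestVal = v; col = c; }
        }
        bestCol = col;
        if (std::chrono::steady_clock::now() - start >= budget) break;
    }
    return bestCol;
}
```

Note that this sketch checks the clock only between iterations, so a deep iteration can overrun the budget; a stricter version would also poll the clock inside minimax and discard the unfinished iteration.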

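Since these nodes are not actually terminal, we can no longer use 0, 1, and -1 to evaluate them. Instead, we widen the range: a win is scored as an arbitrarily high number, a loss as an arbitrarily low one, and a heuristic evaluation function determines the values in between. One possible heuristic is the number of three-in-a-rows the player has. For more sophisticated Connect Four heuristics, read Victor Allis's paper on the subject.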
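One way the three-in-a-row heuristic could look, as a rough sketch rather than a tuned evaluation. It assumes a 6x7 board with '_' for empty cells, and it scans every window of four consecutive cells. Counting the opponent's threes as negative, and counting any window with three pieces plus one empty cell (including gapped ones such as x x _ x), are assumptions made here, not something the answer specifies. The name evaluate and the WIN constant are illustrative.

```cpp
#include <array>
#include <cassert>

constexpr int ROWS = 6, COLS = 7;
constexpr int WIN = 1000000;  // assumed stand-in for "arbitrarily high"
using Board = std::array<std::array<char, COLS>, ROWS>;  // '_' = empty, row 0 = bottom

// Scores the board from the point of view of `me`:
//   +WIN / -WIN for a completed four, otherwise
//   (my windows with three pieces and one empty cell) minus (the opponent's).
int evaluate(const Board& b, char me, char opp)
{
    assert(me != opp);
    // Directions: right, up, up-right, up-left.
    const int dr[] = {0, 1, 1, 1}, dc[] = {1, 0, 1, -1};
    int score = 0;
    for (int r = 0; r < ROWS; ++r)
        for (int c = 0; c < COLS; ++c)
            for (int d = 0; d < 4; ++d) {
                int endR = r + 3 * dr[d], endC = c + 3 * dc[d];
                if (endR >= ROWS || endC < 0 || endC >= COLS)
                    continue;  // window would leave the board
                int mine = 0, theirs = 0;
                for (int k = 0; k < 4; ++k) {
                    char cell = b[r + k * dr[d]][c + k * dc[d]];
                    if (cell == me) ++mine;
                    else if (cell == opp) ++theirs;
                }
                if (mine == 4) return WIN;
                if (theirs == 4) return -WIN;
                if (mine == 3 && theirs == 0) ++score;
                if (theirs == 3 && mine == 0) --score;
            }
    return score;
}
```

A function like this would be called at the cutoff depth of the search in place of a fixed score; the weights (1 per open three) are arbitrary and would normally be tuned.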
