Question

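Can somebody help me with this? I am trying to program an AI for my Tic-Tac-Toe game, and every search on the topic has led me to the minimax algorithm. From everything I have read and watched, I have a basic understanding of the theory behind the algorithm. What I am having trouble with is applying it in my game. I understand that the algorithm should essentially play out every move and return a score based on the state of the board. How do I make it play a different combination each time? How do I make sure I get every combination? After it finds a winning state, how do I return the correct move from that? Should I store each state in an array? Sorry for all the questions; I am just trying to solidify my understanding and make sure I can actually put into practice what I am reading. I am providing my JavaScript code for the game, and hopefully someone can point me in the right direction. Thanks.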

$(document).ready(function() {

    var x = 'X';
    var o = 'O';
    var newgame = function() {
        turn = x;
        sqrData = '';
        xMoves = [false,false,false,false,false,false,false,false,false,false,false,false];
        oMoves = [false,false,false,false,false,false,false,false,false,false,false,false];
        squareFree = [false,true,true,true,true,true,true,true,true,true];
        moveCount = 0;
        compPlayer = false;
        playboard = [false,[false,true,true,true],[false,true,true,true],[false,true,true,true]]
        $('div').html('');
        $('#reset').html('Reset Game');

     };
    newgame();

    $('#fir').click(function() {
       turnchange(1,1,1,$(this));
    });
    $('#sec').click(function() {
        turnchange(2,1,2,$(this));
    });
    $('#thir').click(function() {
       turnchange(3,1,3,$(this));
    });
    $('#four').click(function() {
        turnchange(4,2,1,$(this));
    });
    $('#fiv').click(function() {
       turnchange(5,2,2,$(this));
    });
    $('#six').click(function() {
        turnchange(6,2,3,$(this));
    });
    $('#sev').click(function() {
        turnchange(7,3,1,$(this));
    });
    $('#eight').click(function() {
      turnchange(8,3,2,$(this));
    });
    $('#nine').click(function() {
       turnchange(9,3,3,$(this));
    });
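    // Handles a click on a square: records the move, swaps turns, checks for a win.
    var turnchange = function(playerSquare,playRow,playCol,sqrData) {
        if (squareFree[playerSquare] == true) {
            // Update the board only after confirming the square is free,
            // so a click on an occupied square cannot overwrite it.
            playboard[playRow][playCol] = turn;
            console.log(playboard);
            sqrData.html(turn);
            if (turn == x) {
                xMoves[playerSquare] = true;
                turn = o;
            }
            else if (turn == o) {
                oMoves[playerSquare] = true;
                turn = x;
            }

            squareFree[playerSquare] = false;
            moveCount++;
            checkwin();
        }
    };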

    var checkwin = function() {
          if ((xMoves[1] && xMoves[2] && xMoves[3]) || (xMoves[1] && xMoves[4] && xMoves[7]) ||
            (xMoves[1] && xMoves[5] && xMoves[9]) || (xMoves[2] && xMoves[5] && xMoves[8]) ||
            (xMoves[3] && xMoves[6] && xMoves[9]) || (xMoves[4] && xMoves[5] && xMoves[6]) || (xMoves[7] && xMoves[8] && xMoves[9]) ||
            (xMoves[3] && xMoves[5] && xMoves[7])) {
            $('#game').html('Game Over - X Wins');
            deactivateSquares();
        }
        else if ((oMoves[1] && oMoves[2] && oMoves[3]) || (oMoves[1] && oMoves[4] && oMoves[7]) ||
            (oMoves[1] && oMoves[5] && oMoves[9]) || (oMoves[2] && oMoves[5] && oMoves[8]) ||
            (oMoves[3] && oMoves[6] && oMoves[9]) || (oMoves[4] && oMoves[5] && oMoves[6]) || (oMoves[7] && oMoves[8] && oMoves[9]) ||
            (oMoves[3] && oMoves[5] && oMoves[7])) {
            $('#game').html('Game Over - O Wins');
            deactivateSquares();
        }
        else if (moveCount == 9) {
            $('#game').html('Its a Draw');
        }

    };
    var deactivateSquares = function() {
        for (var e in squareFree) {
            squareFree[e]= false;
        }

    };


    $('#reset').click(function(){
        newgame();
        });

});

Answer 1

First, you need a function score :: Configuration -> N. A configuration is the current state of the board.
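As a rough illustration of such a score function for Tic-Tac-Toe, here is a minimal sketch. It assumes the board is a flat array of 9 cells holding 'X', 'O', or null, which is not the layout used in the question's code. The names LINES, winner and the extra `me` parameter are made up for this example.

```javascript
// Assumption: the board is a flat array of 9 cells holding 'X', 'O', or null.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns 'X' or 'O' if that player has three in a row, otherwise null.
function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Score from the point of view of `me` (the MAX player):
// +1 for a win, -1 for a loss, 0 for a draw or an unfinished game.
function score(board, me) {
  const w = winner(board);
  if (w === null) return 0;
  return w === me ? 1 : -1;
}
```

Any scale works as long as a better position for MAX gets a larger number; +1/0/-1 is just the simplest choice for a game this small.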

We can draw a tree of all possible configurations. The leaves contain the score of the board. MAX is you, MIN is your opponent:

Configuration      Player
        A           MAX
    /       \
   B         C      MIN
 /   \     /   \
D,1  E,3  F,2  G,1  MAX

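minMax is a recursive algorithm that traverses this tree. It computes the best choice (based on your score function) for a given configuration and player. Note that MAX's goal is to maximize the score and MIN's goal is to minimize it.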

minMax(c, player)
  if c is leaf:
    return score(c)

  if player == MAX:
    bestScore = -inf
    moves = generateAllMoves(c)
    for each move m in moves:
      c = makeMove(c, m)
      currScore = minMax(c, MIN)
      if currScore > bestScore
        bestScore = currScore
      c = undoMove(c, m)
    return bestScore

  if player == MIN:
    bestScore = +inf
    moves = generateAllMoves(c)
    for each move m in moves:
      c = makeMove(c, m)
      currScore = minMax(c, MAX)
      if currScore < bestScore
        bestScore = currScore
      c = undoMove(c, m)
    return bestScore

getBestMove(c):
  bestScore = -inf
  bestMove = null
  for each move m in generateAllMoves(c):
    c = makeMove(c, m)
    currScore = minMax(c, MIN)
    if currScore > bestScore
      bestScore = currScore
      bestMove = m
    c = undoMove(c, m)
  return bestMove

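minMax(c, MAX) returns the highest score you can secure when the MIN player does everything to hold you down; that is, whatever strategy your opponent plays, you can always reach a score of at least minMax(c, MAX).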
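To make the pseudocode concrete, here is one possible JavaScript sketch for Tic-Tac-Toe. It is not tied to the question's code: it assumes a flat 9-cell board of 'X', 'O', or null, and the helper names (winner, emptyCells) and the `me` parameter are invented for the example. The MAX and MIN branches are folded into one loop.

```javascript
// Assumption: the board is a flat array of 9 cells holding 'X', 'O', or null.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns 'X' or 'O' if that player has three in a row, otherwise null.
function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Indices of all free squares, i.e. the legal moves.
function emptyCells(board) {
  const cells = [];
  for (let i = 0; i < board.length; i++) {
    if (board[i] === null) cells.push(i);
  }
  return cells;
}

// Value of `board` with `player` to move, seen from `me` (the MAX player).
function minimax(board, player, me) {
  const w = winner(board);
  if (w !== null) return w === me ? 1 : -1; // leaf: somebody won
  const moves = emptyCells(board);
  if (moves.length === 0) return 0;          // leaf: draw
  const next = player === 'X' ? 'O' : 'X';
  let best = player === me ? -Infinity : Infinity;
  for (const m of moves) {
    board[m] = player;                       // make the move
    const s = minimax(board, next, me);
    board[m] = null;                         // undo the move
    best = player === me ? Math.max(best, s) : Math.min(best, s);
  }
  return best;
}

// Returns the index of the best square for `me`, or null if the board is full.
function getBestMove(board, me) {
  const opponent = me === 'X' ? 'O' : 'X';
  let bestScore = -Infinity;
  let bestMove = null;
  for (const m of emptyCells(board)) {
    board[m] = me;
    const s = minimax(board, opponent, me);
    board[m] = null;
    if (s > bestScore) {
      bestScore = s;
      bestMove = m;
    }
  }
  return bestMove;
}
```

For example, with X on squares 0 and 1 and O on squares 3 and 4, getBestMove(board, 'X') picks square 2, which completes the top row.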

How do I make it play a different combination each time?

Your score function can be randomized. For example: score(c) = deterministic_score(c) + rand() * 0.0001.
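A small sketch of that idea in JavaScript follows. The wrapper name makeNoisyScore and its rng and epsilon parameters are hypothetical, as is the placeholder baseScore; the only real API used is Math.random. The noise has to stay smaller than the gap between two distinct deterministic scores, so that it only breaks ties and never reorders a win and a draw.

```javascript
// Wraps a deterministic score function so that equal scores are
// separated by a tiny random amount in [0, epsilon).
// `rng` defaults to Math.random and can be replaced for testing.
function makeNoisyScore(deterministicScore, rng = Math.random, epsilon = 0.0001) {
  return function noisyScore(board) {
    return deterministicScore(board) + rng() * epsilon;
  };
}

// Usage with a placeholder score that rates every board as a draw (0).
const baseScore = () => 0;
const noisy = makeNoisyScore(baseScore);
const sample = noisy([]); // some value in [0, 0.0001)
```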

How do I make sure I get every combination?

You have to implement the move-generation algorithm correctly.
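For Tic-Tac-Toe, move generation can be as simple as listing the free squares. The sketch below assumes a flat 9-cell board of 'X', 'O', or null; winner and countGames are helpers invented for the example. One way to sanity-check the generator is to count every finished game reachable from the empty board, which should come to the well-known total of 255168.

```javascript
// Assumption: the board is a flat array of 9 cells holding 'X', 'O', or null.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns 'X' or 'O' if that player has three in a row, otherwise null.
function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Every legal move is the index of a free square.
function generateAllMoves(board) {
  const moves = [];
  for (let i = 0; i < board.length; i++) {
    if (board[i] === null) moves.push(i);
  }
  return moves;
}

// Counts the distinct games (move sequences) that end in a win or a draw.
function countGames(board, player) {
  if (winner(board) !== null) return 1;
  const moves = generateAllMoves(board);
  if (moves.length === 0) return 1;
  let total = 0;
  for (const m of moves) {
    board[m] = player;
    total += countGames(board, player === 'X' ? 'O' : 'X');
    board[m] = null;
  }
  return total;
}
```

If the generator skipped or duplicated a move, the count would differ from that total.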

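After finding a winning state, how do I return the correct move from it?

If your score function returns +inf for a winning state and you always pick the move returned by getBestMove, you will always end up in a winning state (provided your opponent has no counter-strategy against it and the depth of the search is unlimited).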

Should I store each state in an array?

You can generate all moves and modify the board on the fly.
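One way to do that without keeping copies of every state is to apply a move, look at the result, and undo the move again. Below is a minimal sketch, assuming a flat 9-cell board of 'X', 'O', or null; the helper name withMove is made up for the example.

```javascript
// Applies `player`'s move at `index`, runs `fn` on the modified board,
// then restores the square, even if `fn` throws.
function withMove(board, index, player, fn) {
  if (board[index] !== null) {
    throw new Error('Square ' + index + ' is already taken');
  }
  board[index] = player;
  try {
    return fn(board);
  } finally {
    board[index] = null; // undo the move
  }
}

// Usage: count the occupied squares while X sits in the center.
const demoBoard = Array(9).fill(null);
const occupied = withMove(demoBoard, 4, 'X', (b) => b.filter((c) => c !== null).length);
```

After the call, demoBoard is empty again, so a single board object can be reused for the whole search.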

Implementing minimax
