Question

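I am trying to create an AI agent that plays checkers using a minimax search. However, it does not move the correct pieces. It seems to move pieces at random, even ones that cannot legally move.

I have rewritten the minimax and undo functions several times, because I believe the problem is that the state is not being undone correctly each time, but I still get the same problem.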

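Here is how it is called. Because any piece can move, the caller loops over every position on the board and, if the position holds a white piece (w), calls minimax with that position. It then checks whether the move returned by minimax is valid; if it is and it has the best utility score so far, that move is selected. After both loops finish, the best move is played.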
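
As a rough illustration of that calling loop (not the original code, which is not shown), the sketch below scans the board and keeps the best valid result. The names `choose_best_move`, `search`, and `is_valid` are hypothetical; `search` stands in for the minimax call and is assumed to return `((new_row, new_col), score)`, and higher scores are assumed to be better for white. If the AI is the minimizing side, the comparison would need to be flipped.

```python
import math

WHITE_PIECES = ('w', 'W')  # assumption: 'W' marks a white king


def choose_best_move(board, search, is_valid, depth=3):
    """Return (row, col, new_row, new_col) for the best valid white move, or None.

    `search(board, row, col, depth)` is a hypothetical stand-in for the
    minimax call; it is assumed to return ((new_row, new_col), score).
    """
    best_score = -math.inf
    best_move = None
    for row in range(len(board)):
        for col in range(len(board[row])):
            if board[row][col] not in WHITE_PIECES:
                continue  # only white pieces are searched
            target, score = search(board, row, col, depth)
            if target is None or target[0] is None:
                continue  # the search found no move for this piece
            new_row, new_col = target
            if is_valid(board, row, col, new_row, new_col) and score > best_score:
                best_score = score
                best_move = (row, col, new_row, new_col)
    return best_move
```

Returning the origin square together with the destination makes it explicit which piece is supposed to move, which helps when checking whether the wrong piece is being played.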

    def undo(self, state, oldRow, oldCol, newRow, newCol):
        if oldRow + 1 == newRow:
            if state[oldRow][oldCol] == 'b' and state[newRow][newCol] == 'B':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'b'
                state[newRow][newCol] = temp
            else:
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = state[newRow][newCol]
                state[newRow][newCol] = temp
        elif oldRow - 1 == newRow:
            if state[oldRow][oldCol] == 'w' and state[newRow][newCol] == 'W':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'w'
                state[newRow][newCol] = temp
            else:
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = state[newRow][newCol]
                state[newRow][newCol] = temp
        elif oldRow + 2 == newRow:
            if state[oldRow][oldCol] == 'b' and state[newRow][newCol] == 'B':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'b'
                state[newRow][newCol] = temp
                state[oldRow + 1][int((oldCol + newCol) / 2)] = 'w'
            else:
                if state[newRow][newCol] == 'b' or state[newRow][newCol] == 'B':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow + 1][int((oldCol + newCol) / 2)] = 'w'
                elif state[newRow][newCol] == 'W':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow + 1][int((oldCol + newCol) / 2)] = 'b'
        elif oldRow - 2 == newRow:
            if state[oldRow][oldCol] == 'w' and state[newRow][newCol] == 'W':
                temp = state[oldRow][oldCol]
                state[oldRow][oldCol] = 'w'
                state[newRow][newCol] = temp
                state[oldRow - 1][int((oldCol + newCol) / 2)] = 'b'
            else:
                if state[newRow][newCol] == 'w' or state[newRow][newCol] == 'W':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow - 1][int((oldCol + newCol) / 2)] = 'b'
                elif state[newRow][newCol] == 'B':
                    temp = state[oldRow][oldCol]
                    state[oldRow][oldCol] = state[newRow][newCol]
                    state[newRow][newCol] = temp
                    state[oldRow - 1][int((oldCol + newCol) / 2)] = 'w'
        return state

    def minimaxAB(self, state, row, col, player, depth, alpha, beta):
        if depth == 0 or self.terminal_test(state):
            return None, self.utility(state)

        if player == HUMAN_PLAYER:  # maximizing player
            best = -math.inf
            bestRow = None
            bestCol = None
            for move in self.actions(state, row, col, player):
                newRow = move[0]
                newCol = move[1]
                _, val = self.minimaxAB(state, newRow, newCol, self.getEnemyPlayer(HUMAN_PLAYER), depth - 1, alpha, beta)

                # undo the move

                state = self.undo(state, row, col, newRow, newCol)
                if val > best:
                    bestRow, bestCol, best = newRow, newCol, val
                alpha = max(alpha, val)
                if alpha >= beta:
                    break
            next = bestRow, bestCol
            return next, best

        else:  # minimizing player
            best = math.inf
            bestRow = None
            bestCol = None
            for move in self.actions(state, row, col, player):
                newRow = move[0]
                newCol = move[1]
                _, val = self.minimaxAB(state, newRow, newCol, self.getEnemyPlayer(AI_PLAYER), depth - 1, alpha, beta)

                # undo the move
                state = self.undo(state, row, col, newRow, newCol)
                if val < best:
                    bestRow, bestCol, best = newRow, newCol, val
                beta = min(beta, val)
                if alpha >= beta:
                    break
            next = bestRow, bestCol
            return next, best
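
The posted `minimaxAB` undoes a move after each recursive call, but the excerpt does not show where that move is applied to `state`. For comparison, here is a minimal sketch of an alpha-beta loop in which every recursive call is bracketed by an apply step and a matching undo step. The `game` object and its methods (`moves`, `apply`, `undo`, `evaluate`) are hypothetical placeholders, not part of the code above.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, game):
    """Alpha-beta search that applies each move before recursing and reverts it after.

    `game` is a hypothetical object providing moves(state, maximizing),
    apply(state, move) -> record, undo(state, record) and evaluate(state).
    """
    moves = game.moves(state, maximizing)
    if depth == 0 or not moves:
        return None, game.evaluate(state)

    best_move = None
    best = -math.inf if maximizing else math.inf
    for move in moves:
        record = game.apply(state, move)   # mutate the board in place
        _, val = alphabeta(state, depth - 1, alpha, beta, not maximizing, game)
        game.undo(state, record)           # restore it before doing anything else
        if maximizing:
            if val > best:
                best, best_move = val, move
            alpha = max(alpha, best)
        else:
            if val < best:
                best, best_move = val, move
            beta = min(beta, best)
        if alpha >= beta:
            break                          # remaining siblings cannot change the result
    return best_move, best
```

With this shape, a quick sanity check is to copy the board before the top-level call and compare it afterwards: if apply and undo are exact inverses, the two should be equal.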

It should make sensible, valid moves, but instead it makes two invalid moves at the same time.
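
One way to check whether undo is the culprit is to stop inferring what happened from the board contents and instead have the move step return a record of exactly what it changed. The sketch below is one possible version under stated assumptions: `'-'` as the empty-square marker, single jumps only, white kinging on row 0 and black on the last row. The names `EMPTY`, `apply_move`, and `undo_move` are hypothetical and do not come from the code above.

```python
EMPTY = '-'  # assumption: the marker for an empty square


def apply_move(state, old_row, old_col, new_row, new_col):
    """Apply a step or single jump in place; return a record that can undo it."""
    piece = state[old_row][old_col]
    record = {
        'from': (old_row, old_col),
        'to': (new_row, new_col),
        'piece': piece,      # the piece as it was before any promotion
        'captured': None,    # (row, col, piece) of the jumped piece, if any
    }
    if abs(new_row - old_row) == 2:
        mid_row = (old_row + new_row) // 2
        mid_col = (old_col + new_col) // 2
        record['captured'] = (mid_row, mid_col, state[mid_row][mid_col])
        state[mid_row][mid_col] = EMPTY
    state[old_row][old_col] = EMPTY
    # Assumed promotion rule: 'w' kings on row 0, 'b' kings on the last row.
    if piece == 'w' and new_row == 0:
        piece = 'W'
    elif piece == 'b' and new_row == len(state) - 1:
        piece = 'B'
    state[new_row][new_col] = piece
    return record


def undo_move(state, record):
    """Restore exactly the squares that apply_move changed."""
    old_row, old_col = record['from']
    new_row, new_col = record['to']
    state[new_row][new_col] = EMPTY
    state[old_row][old_col] = record['piece']
    if record['captured'] is not None:
        row, col, captured_piece = record['captured']
        state[row][col] = captured_piece
```

Because the record stores the original piece and the captured piece, undo does not have to guess whether a promotion happened or which colour was jumped.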

Minimax function for checkers moves two pieces, possibly because of the undo function

0 answers: