Question

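I'm trying to build a 3D Tic Tac Toe game using Minimax with Alpha-beta pruning. However, the algorithm appears to choose sub-optimal paths.

For example, you can win just by going straight through the middle of the cube or across a single board. The AI seems to pick the cells that would be optimal on the next turn rather than the current one.

I've tried rewriting the heuristic I return to the algorithm and experimenting with it, but I haven't made much progress. Regardless of ply, it seems to have the same problem.

The code is here.

The relevant parts are computers_move and think_ahead (and the '2' variants, which are just me experimenting with a slightly different approach).

I expect it's something simple that I've overlooked, but as far as I can tell I can't work out what the problem is. If anyone could shed some light on this, I'd greatly appreciate it.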

def computers_move2(self):
    best_score = -1000
    best_move = None
    h = None
    win = False

    for move in self.allowed_moves:
        self.move(move, self.ai)
        if self.complete:
            win = True
            break
        else:
            h = self.think_ahead2(self.human, -1000, 1000)
        self.depth_count = 0
        if h >= best_score:
            best_score = h
            best_move = move
            self.undo_move(move)
        else:
            self.undo_move(move)

    if not win:
        self.move(best_move, self.ai)
    self.human_turn = True

def think_ahead2(self, player, a, b):
    if self.depth_count <= self.difficulty:
        self.depth_count += 1
        if player == self.ai:
            h = None
            for move in self.allowed_moves:
                self.move(move, player)
                if self.complete:
                    self.undo_move(move)
                    return 1000
                else:
                    h = self.think_ahead2(self.human, a, b)
                    if h > a:
                        a = h
                        self.undo_move(move)
                    else:
                        self.undo_move(move)
                if a >= b:
                    break
            return a
        else:
            h = None
            for move in self.allowed_moves:
                self.move(move, player)
                if self.complete:
                    self.undo_move(move)
                    return -1000
                else:
                    h = self.think_ahead2(self.ai, a, b)
                    if h < b:
                        b = h
                        self.undo_move(move)
                    else:
                        self.undo_move(move)
                if a >= b:
                    break
            return b
    else:
        diff = self.check_available(self.ai) - self.check_available(self.human)
        return diff

Answer 1

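It turns out my algorithm appears to work fine. The problem was caused by my helper functions move and undo_move; the root cause was my collection of allowed moves.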

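I noticed that while the tree was being explored, the number of moves in the outermost loop of computer_plays shrank substantially. During the first sweep, the number of allowed moves for each pair of turns by the computer and the human player dropped from the full total to 27, then to 10, and finally to 5.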
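One way to see this kind of shrinkage is to log which moves a loop actually visits. The sketch below is not my original code: visited_moves and its restore_order flag are hypothetical, and it uses a plain list rather than a set so the behaviour is deterministic. It shows that when the undo step puts a move back in a different position, a loop iterating over that same list skips candidates.

```python
def visited_moves(allowed_moves, restore_order):
    """Return the moves a for-loop sees while each move is tried and undone.

    Hypothetical helper for illustration only. The list is mutated in place
    during iteration, the way move()/undo_move() change allowed_moves.
    """
    moves = list(allowed_moves)   # private copy so the caller's list is untouched
    seen = []
    for move in moves:            # iterating the very list that is being mutated
        seen.append(move)
        moves.remove(move)        # "move": the cell is no longer available
        moves.append(move)        # "undo": the cell is available again...
        if restore_order:
            moves.sort()          # ...and back in its original position
    return seen


print(visited_moves([0, 1, 2, 3], restore_order=False))
print(visited_moves([0, 1, 2, 3], restore_order=True))
```

Without restoring the order, the loop visits 0, 2, 0, 0 and never looks at moves 1 and 3. With the sort, it visits every move exactly once.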

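It turned out that the moves being tested temporarily were not being put back. So I swapped the set for a standard list and sorted the list after every move/undo, which completely solved my problem.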
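A minimal sketch of that idea, assuming a stripped-down board: the Board class and count_leaves function here are hypothetical stand-ins rather than the code from the question, the board size of 8 cells is arbitrary, and win detection is left out. allowed_moves is kept as a sorted list (using bisect.insort instead of a full sort, which is my substitution), move and undo_move are exact inverses, and each loop iterates over a snapshot so in-place changes cannot disturb it.

```python
import bisect


class Board:
    """Hypothetical stand-in for the game class; move bookkeeping only."""

    def __init__(self, cells):
        self.allowed_moves = list(range(cells))  # kept sorted at all times
        self.played = {}                         # cell -> player

    def move(self, cell, player):
        self.allowed_moves.remove(cell)
        self.played[cell] = player

    def undo_move(self, cell):
        del self.played[cell]
        bisect.insort(self.allowed_moves, cell)  # reinsert in sorted position


def count_leaves(board, player, depth):
    """Count positions reached after `depth` plies, checking the invariant."""
    if depth == 0:
        return 1
    before = list(board.allowed_moves)  # snapshot of the legal moves
    total = 0
    for cell in before:                 # iterate the snapshot, not the live list
        board.move(cell, player)
        total += count_leaves(board, 1 - player, depth - 1)
        board.undo_move(cell)
        # After undoing, the legal moves must be exactly what they were.
        assert board.allowed_moves == before
    return total


board = Board(8)
print(count_leaves(board, 0, 2))  # 8 first moves times 7 replies
```

The assertion inside the loop is the useful part: if undo_move ever fails to restore allowed_moves, the search stops at the first offending move instead of quietly exploring a smaller tree.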

3D Tic Tac Toe with Minimax & Alpha-beta pruning chooses sub-optimal moves
