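I'm currently trying to build a good AI for Othello, and I have it working with the Minimax algorithm. However, when I tried to search deeper using alpha-beta pruning, the algorithm seemed to play terribly. I checked it against other sources such as Wikipedia and Berkeley.edu, and I believe I've implemented it correctly, but I still can't find the problem.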

    def alphabeta(board, player, a, b, lev):
        h = heur(board, player)
        if lev == 0:
            return h, None
        poss = get_legal_moves(board, player)
        if len(poss) == 0:
            return h, None
        move = 0
        for x in poss:
            cpboard = board[:]
            cpboard[x] = player
            bracket(cpboard, player, x)
            a1, q = alphabeta(cpboard, opponent_color(player), a, b, lev-1)
            if player is me:
                if a1 > a:
                    a, move = a1, x
            else:
                if a1 < b:
                    b, move = a1, x
            if b <= a:
                break
        if player is me:
            return a, move
        else:
            return b, move

Answer 0 (score: 2)
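Your alpha-beta code is probably wrong. Pay attention to what happens when a player has to pass the turn (i.e. has no available moves); I had a tricky bug in my own code because of this.
Did you call the recursion with the alpha and beta values swapped? Mine works like this (Java code):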

    private float minimax(OthelloBoard board, OthelloMove best, float alpha, float beta, int depth)
    {
        float bestResult = -Float.MAX_VALUE;
        OthelloMove garbage = new OthelloMove();
        int state = board.getState();
        int currentPlayer = board.getCurrentPlayer();
        if (state == OthelloBoard.STATE_DRAW)
            return 0.0f;
        if ((state == OthelloBoard.STATE_BLACK_WINS) && (currentPlayer == OthelloBoard.BLACK))
            return INFINITY;
        if ((state == OthelloBoard.STATE_WHITE_WINS) && (currentPlayer == OthelloBoard.WHITE))
            return INFINITY;
        if ((state == OthelloBoard.STATE_BLACK_WINS) && (currentPlayer == OthelloBoard.WHITE))
            return -INFINITY;
        if ((state == OthelloBoard.STATE_WHITE_WINS) && (currentPlayer == OthelloBoard.BLACK))
            return -INFINITY;
        if (depth == maxDepth)
            return OthelloHeuristics.eval(currentPlayer, board);
        ArrayList<OthelloMove> moves = board.getAllMoves(currentPlayer);
        for (OthelloMove mv : moves)
        {
            board.makeMove(mv);
            alpha = - minimax(board, garbage, -beta, -alpha, depth + 1);
            board.undoMove(mv);
            if (beta <= alpha)
                return alpha;
            if (alpha > bestResult)
            {
                best.setFlipSquares(mv.getFlipSquares());
                best.setIdx(mv.getIdx());
                best.setPlayer(mv.getPlayer());
                bestResult = alpha;
            }
        }
        return bestResult;
    }

The call looks like this:

    OthelloMove bestFound = new OthelloMove();
    int maxDepth = 8;
    // depth counts up from 0; the search stops when it reaches maxDepth
    minimax(board, bestFound, -Float.MAX_VALUE, Float.MAX_VALUE, 0);
    //Wait for Thread to finish
    board.makeMove(bestFound);

Edit: if a player has no available moves, getAllMoves() returns a 'dummy move' that does not change the board at all and simply passes the turn.
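For readers working in Python like the question, here is a minimal sketch of the same two ideas: the recursive call with the window negated and swapped, and a pass that keeps the search going instead of stopping at the heuristic. It is not the code from this answer. It handles the pass explicitly rather than through a dummy move, and `ToyGame` with its `legal_moves`, `apply_move` and `evaluate` methods is a hypothetical stand-in for a real Othello implementation.

```python
import math


class ToyGame:
    """Hypothetical stand-in for Othello, used only to exercise the pass branch.

    State is (pile_plus, pile_minus, diff). On its turn a player moves 1 or 2
    counters from its own pile to its score; diff is the score of player +1
    minus the score of player -1. A player whose pile is empty must pass.
    """

    def legal_moves(self, state, player):
        pile = state[0] if player == 1 else state[1]
        return [k for k in (1, 2) if k <= pile]

    def apply_move(self, state, player, k):
        plus, minus, diff = state
        if player == 1:
            return (plus - k, minus, diff + k)
        return (plus, minus - k, diff - k)

    def evaluate(self, state, player):
        # Score from the point of view of the side to move (negamax convention).
        return player * state[2]


def negamax(game, state, player, alpha, beta, depth, passed=False):
    """Return (score, move) for `player`, handling passes explicitly."""
    if depth == 0:
        return game.evaluate(state, player), None
    moves = game.legal_moves(state, player)
    if not moves:
        if passed:
            # Neither side can move: the game is over.
            return game.evaluate(state, player), None
        # Pass: same position, other side to move, window negated and swapped.
        score, _ = negamax(game, state, -player, -beta, -alpha, depth - 1, passed=True)
        return -score, None
    best_score, best_move = -math.inf, None
    for move in moves:
        child = game.apply_move(state, player, move)
        score, _ = negamax(game, child, -player, -beta, -alpha, depth - 1)
        score = -score
        if score > best_score:
            best_score, best_move = score, move
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will avoid this line anyway
    return best_score, best_move


score, move = negamax(ToyGame(), (3, 1, 0), 1, -math.inf, math.inf, 10)
```

In this toy game player -1 runs out of counters first and has to pass while player +1 keeps moving. With the pass branch in place the search reaches the true end of the game and reports a score of 2 for player +1; returning the heuristic as soon as one side has no moves would cut the line short.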
Hope it helps!
Answer 1 (score: 1)
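Your alphabeta implementation looks sound to me. Since minimax and alphabeta produce the same result when implemented correctly, you should be able to use your old minimax code as a check on alphabeta, at least for modest search depths. If their results differ when searching the same game tree, you know you have done something wrong.
Most likely, though, the poor play is the result of your "heur" evaluation function.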
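The cross-check suggested above can be sketched on small random game trees before trying it on real Othello positions. This is only an illustration under stated assumptions: trees are nested lists with numeric leaves, and `minimax`, `alphabeta`, `random_tree` and `cross_check` are hypothetical names, not the asker's functions.

```python
import math
import random


def minimax(node, maximizing):
    """Plain minimax over a nested-list tree; leaves are numbers."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Alpha-beta over the same kind of tree; must agree with minimax at the root."""
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizer above will never allow this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if beta <= alpha:
            break  # the maximizer above will never allow this branch
    return value


def random_tree(depth, branching, rng):
    """Build a uniform random tree with integer leaf values."""
    if depth == 0:
        return rng.randint(-10, 10)
    return [random_tree(depth - 1, branching, rng) for _ in range(branching)]


def cross_check(trials=200, depth=4, branching=3, seed=0):
    """Return the first tree on which the two searches disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        tree = random_tree(depth, branching, rng)
        if minimax(tree, True) != alphabeta(tree, True):
            return tree
    return None
```

The same pattern applies to the real engine: run the old minimax and the new alphabeta on the same board at the same depth and compare the returned values. If the values agree and the play is still poor, the search is probably fine and the evaluation function is the place to look.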