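Can someone help me understand how to use the alpha-beta pruning algorithm? I'm making a game similar to Connect Four. The only differences are that there are no diagonal wins and that a player can mark any square at any time (unless it is already occupied, of course). I think I understand how to code the algorithm, but I suspect I'm using it wrong. What I've been doing is running a for loop that looks like this: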
for(i=0; i<size; i++)
    for(j=0; j<size; j++)
    {
        val = alphabeta();
        if(val > max)
        {
            max = val;
            move = set(i,j);
        }
    }
setBoard(move); // sets the board to the returned value from alphabeta()
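The problem I'm having is that the first run of alphabeta returns the max value, so none of the later values is ever greater, and the move is always placed at board[0][0]. Does anyone know what I'm doing wrong?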
public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{
    Placement p = null;
    if(depth==0 || board.isWinner())
    {
        return evaluate(place, maxPlayer);
    }
    if(maxPlayer)
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j, opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = opponent;
                    p = new Placement(i, j);
                    alpha = Math.max(alpha, alphabeta(p, depth-1, alpha, beta, false));
                    board.board[i][j] = 0;
                }
                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return alpha;
    }
    else
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j,opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = player;
                    p = new Placement(i, j);
                    beta = Math.min(beta, alphabeta(p, depth-1, alpha, beta, true));
                    System.out.println(board);
                    board.board[i][j] = 0;
                }
                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return beta;
    }
}
Here is the function that makes the move:
public void makeMove()
{
    int max = -1;
    Placement p = null;
    int val = -1;
    for(int i=0; i<size; i++)
        for(int j=0; j<size; j++)
        {
            if(board.validMove(i, j))
            {
                if(board.canGetFour(i, j, opponent)||(board.canGetFour(i,j,player)&&board.canGetFour(i,j,opponent)))
                {
                    board.board[i][j] = player;
                    val = alphabeta(new Placement(i,j), 5, -5000, 5000, true);
                    board.board[i][j] = 0;
                    if(val > max)
                    {
                        max = val;
                        p = new Placement(i, j);
                    }
                }
            }
        }
    board.board[p.row][p.col] = player;
    board.moves++;
}
So, here is my updated code. It still doesn't work:
public Placement alphabeta(Placement p)
{
    int v = max(p,6,-500000, 500000);
    return successors(v);
}
public int max(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0 || board.isWinner())
    {
        return evaluateMax(p,player);
    }
    int v = -500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = player;
        v = Math.max(v, min(place, depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v>= beta)
            return v;
        alpha = Math.max(alpha, v);
    }
    return v;
}
public int min(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0||board.isWinner())
    {
        return evaluateMax(p,opponent);
    }
    int v = 500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = opponent;
        v = Math.min(v, max(place,depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v<= alpha)
            return v;
        beta = Math.min(alpha, v);
    }
    return v;
}
public void makeMove()
{
    Placement p = null;
    for(int i=0; i<successors.size(); i++)
    {
        Placement temp = successors.get(i);
        //board.board[temp.row][temp.col] = player;
        p = alphabeta(temp);
        //board.board[temp.row][temp.col] = 0;
    }
    System.out.println("My move is "+p.row + p.col);
    board.board[p.row][p.col] = player;
    successors.remove(p);
}
I changed the algorithm slightly so I can see clearly what is happening in min and max, but it still doesn't play correctly.
Answer 0 (score: 0)
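OK, it took some time, but I think I have it.
In your evaluation function, you should return how good the state is for the actual player. If the placement is a canGetFour for the "otherPlayer", it is a bad state (the worst state), so you return a small number. However, if the placement is a canGetFour for the "actualPlayer", you return a large number (it is a good state).
Then, in your makeMove, you just check whether the state is the best state available. Note that scanning a 2D array like this is about the least efficient way to store the "child nodes". It makes more sense to have a placement.getPossibleMoves() that returns an array of all the empty squares (both real and temporary) and to iterate over that. Otherwise every node of the search rescans the whole board, and that cost multiplies across a search tree that already grows exponentially with depth.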
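To make the evaluation convention concrete, here is a minimal sketch. It assumes the board is an int[][] with 0 for an empty square, and it assumes canGetFour means "placing a stone here completes four in a row horizontally or vertically", which may not match the question's own canGetFour. The class name EvaluationSketch, the helper countRun, and the scores 1000 and -1000 are made up for illustration.

```java
/** Sketch of an evaluation that always scores from the actual player's point of view. */
final class EvaluationSketch {
    static final int EMPTY = 0; // assumption: 0 marks an empty square

    /** Counts consecutive squares owned by `who`, starting next to (row, col) and walking in direction (dr, dc). */
    private static int countRun(int[][] board, int row, int col, int dr, int dc, int who) {
        int n = 0;
        int r = row + dr, c = col + dc;
        while (r >= 0 && r < board.length && c >= 0 && c < board[r].length && board[r][c] == who) {
            n++;
            r += dr;
            c += dc;
        }
        return n;
    }

    /** Assumed meaning: placing `who` on the empty square (row, col) completes four in a row (no diagonals). */
    static boolean canGetFour(int[][] board, int row, int col, int who) {
        if (board[row][col] != EMPTY) return false;
        int horizontal = 1 + countRun(board, row, col, 0, -1, who) + countRun(board, row, col, 0, 1, who);
        int vertical = 1 + countRun(board, row, col, -1, 0, who) + countRun(board, row, col, 1, 0, who);
        return horizontal >= 4 || vertical >= 4;
    }

    /** Small number when the other player can win at (row, col), large number when the actual player can. */
    static int evaluate(int[][] board, int row, int col, int actualPlayer, int otherPlayer) {
        if (canGetFour(board, row, col, otherPlayer)) return -1000;
        if (canGetFour(board, row, col, actualPlayer)) return 1000;
        return 0;
    }

    public static void main(String[] args) {
        int[][] board = new int[5][5];
        board[0][0] = 1; board[0][1] = 1; board[0][2] = 1; // player 1 has three in a row
        System.out.println(evaluate(board, 0, 3, 1, 2)); // 1000: good for player 1
        System.out.println(evaluate(board, 0, 3, 2, 1)); // -1000: bad for player 2
    }
}
```

The same square scores 1000 or -1000 depending on whose perspective is fixed as the actual player, which is what lets the root loop tell moves apart instead of seeing the same value everywhere.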
private Placement bestNext;
private List<Placement> tempMoves = new ArrayList<>();
// alpha and beta are parameters rather than fields, so each recursive call works on its own window.
public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{
    if(depth == maxDepth){ /* maxDepth: the search limit, e.g. the number of unassigned squares on the actual board */
        return evaluate(place, maxPlayer);
    }
    for(int i=0; i<board.size; i++)
    {
        for(int j=0; j<board.size; j++)
        {
            if(board.validMove(i,j)){ // assumed to also reject squares already in tempMoves
                Placement p = new Placement(i, j);
                tempMoves.add(p);
                // depth + 1 leaves this frame's depth unchanged; !maxPlayer hands the turn to the other side
                int tmp = alphabeta(p, depth + 1, alpha, beta, !maxPlayer);
                if(maxPlayer){
                    alpha = Math.max(alpha, tmp);
                }
                else{
                    beta = Math.min(beta, tmp);
                }
                tempMoves.remove(p);
            }
            if(beta<=alpha)
                break;
        }
        if(beta<=alpha)
            break;
    }
    return maxPlayer ? alpha : beta;
}
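Putting the pieces together, one possible shape for the whole search is sketched below. This is not the asker's code: the class AlphaBetaSketch, the constants ME, OPP and WIN, and the methods possibleMoves, hasFour and bestMove are hypothetical names, and the evaluation only recognizes finished wins. It illustrates three points: the root loop starts from the lowest possible value rather than -1, the call made right after your own move is a minimizing one, and moves come from a list of empty squares.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of alpha-beta on an n x n board where four in a row (horizontal or vertical) wins. */
final class AlphaBetaSketch {
    static final int EMPTY = 0, ME = 1, OPP = 2; // hypothetical cell values
    static final int WIN = 1000;                 // hypothetical score for a finished win

    final int size;
    final int[][] board;

    AlphaBetaSketch(int size) {
        this.size = size;
        this.board = new int[size][size];
    }

    /** All empty squares as {row, col}; the search iterates this list. */
    List<int[]> possibleMoves() {
        List<int[]> moves = new ArrayList<>();
        for (int r = 0; r < size; r++)
            for (int c = 0; c < size; c++)
                if (board[r][c] == EMPTY) moves.add(new int[] {r, c});
        return moves;
    }

    /** True if the four squares starting at (r, c) in direction (dr, dc) all belong to `who`. */
    private boolean run(int r, int c, int dr, int dc, int who) {
        for (int k = 0; k < 4; k++)
            if (board[r + k * dr][c + k * dc] != who) return false;
        return true;
    }

    /** True if `who` has four in a row horizontally or vertically (no diagonals). */
    boolean hasFour(int who) {
        for (int r = 0; r < size; r++)
            for (int c = 0; c < size; c++) {
                if (c + 3 < size && run(r, c, 0, 1, who)) return true;
                if (r + 3 < size && run(r, c, 1, 0, who)) return true;
            }
        return false;
    }

    /** Score from ME's point of view; `depth` is the plies left, so faster wins score higher. */
    int evaluate(int depth) {
        if (hasFour(ME)) return WIN + depth;
        if (hasFour(OPP)) return -WIN - depth;
        return 0;
    }

    int alphabeta(int depth, int alpha, int beta, boolean maxPlayer) {
        List<int[]> moves = possibleMoves();
        if (depth == 0 || moves.isEmpty() || hasFour(ME) || hasFour(OPP)) return evaluate(depth);
        for (int[] m : moves) {
            board[m[0]][m[1]] = maxPlayer ? ME : OPP;                  // make the move
            int value = alphabeta(depth - 1, alpha, beta, !maxPlayer);
            board[m[0]][m[1]] = EMPTY;                                 // undo it
            if (maxPlayer) alpha = Math.max(alpha, value);
            else beta = Math.min(beta, value);
            if (beta <= alpha) break;                                  // prune the remaining moves
        }
        return maxPlayer ? alpha : beta;
    }

    /** Root search: try each move for ME, then let the opponent (the minimizer) reply. */
    int[] bestMove(int depth) {
        int[] best = null;
        int bestValue = Integer.MIN_VALUE; // lower than any score, so the first move is always accepted
        for (int[] m : possibleMoves()) {
            board[m[0]][m[1]] = ME;
            int value = alphabeta(depth - 1, -2 * WIN, 2 * WIN, false);
            board[m[0]][m[1]] = EMPTY;
            if (value > bestValue) {
                bestValue = value;
                best = m;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        AlphaBetaSketch game = new AlphaBetaSketch(5);
        game.board[2][0] = ME; game.board[2][1] = ME; game.board[2][2] = ME;
        int[] move = game.bestMove(3);
        System.out.println("Best move: " + move[0] + "," + move[1]);
    }
}
```

With three of ME's stones at (2,0), (2,1) and (2,2) on a 5x5 board, bestMove(3) picks (2,3), because that square completes the row and every other move scores lower. If the evaluation returned the same number for every position, all root moves would tie and the first square scanned would win, which matches the board[0][0] symptom in the question.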