When I try to implement quiescence search, I keep getting strange behavior in my negamax-based AI. I based it on the pseudocode from here:
    int Quiesce( int alpha, int beta ) {
        int stand_pat = Evaluate();
        if( stand_pat >= beta )
            return beta;
        if( alpha < stand_pat )
            alpha = stand_pat;
        until( every_capture_has_been_examined ) {
            MakeCapture();
            score = -Quiesce( -beta, -alpha );
            TakeBackMove();
            if( score >= beta )
                return beta;
            if( score > alpha )
                alpha = score;
        }
        return alpha;
    }
Here is my code:
    private double QuiescenceSearch(GameBoard gameBoard, double alpha, double beta, int color)
    {
        double standPat = color * CalculateBoardScore(gameBoard);
        if (standPat >= beta)
        {
            return beta;
        }
        else if (alpha < standPat)
        {
            alpha = standPat;
        }
        foreach (Move move in GetNoisyMoves(gameBoard))
        {
            gameBoard.TrustedPlay(move);
            double score = -1.0 * QuiescenceSearch(gameBoard, -beta, -alpha, -color);
            gameBoard.UndoLastMove();
            if (score >= beta)
            {
                return beta;
            }
            else if (score > alpha)
            {
                alpha = score;
            }
        }
        return alpha;
    }
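Specifically, the AI seems to behave as if making the absolute worst move (getting itself killed) were perfectly fine.
CalculateBoardScore always returns the score from the perspective of the color == 1 side, hence the multiplication by color.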
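To make the sign convention concrete, here is a minimal sketch of it. `NegamaxSign`, `AbsoluteScore` and `RelativeScore` are hypothetical names used only for illustration, and the material-difference evaluation is an assumed stand-in for CalculateBoardScore, not my actual evaluation function.

```csharp
using System;

public static class NegamaxSign
{
    // Hypothetical stand-in for CalculateBoardScore: the result is always
    // expressed from the point of view of the color == 1 side.
    public static double AbsoluteScore(int materialColor1, int materialColorMinus1)
    {
        return materialColor1 - materialColorMinus1;
    }

    // Negamax needs the score from the side to move's point of view,
    // so the absolute score is multiplied by color (+1 or -1).
    public static double RelativeScore(double absoluteScore, int color)
    {
        return color * absoluteScore;
    }
}
```

With this convention, a position that is +3 for the color == 1 side is reported as +3 when color == 1 is to move and as -3 when color == -1 is to move.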
Answer 0 (score: 1)
I refactored my code, and it now works correctly:
    private double QuiescenceSearch(GameBoard gameBoard, double alpha, double beta, int color)
    {
        double bestValue = color * CalculateBoardScore(gameBoard);
        alpha = Math.Max(alpha, bestValue);
        if (alpha >= beta)
        {
            return bestValue;
        }
        foreach (Move move in GetNoisyMoves(gameBoard))
        {
            gameBoard.TrustedPlay(move);
            double value = -1 * QuiescenceSearch(gameBoard, -beta, -alpha, -color);
            gameBoard.UndoLastMove();
            bestValue = Math.Max(bestValue, value);
            alpha = Math.Max(alpha, bestValue);
            if (alpha >= beta)
            {
                break;
            }
        }
        return bestValue;
    }
The problem with the pseudocode is that, when stand_pat or score is at least beta, it should return that value (stand_pat / score) rather than beta:
    int Quiesce( int alpha, int beta ) {
        int stand_pat = Evaluate();
        if( stand_pat >= beta )
            return stand_pat;
        if( alpha < stand_pat )
            alpha = stand_pat;
        until( every_capture_has_been_examined ) {
            MakeCapture();
            score = -Quiesce( -beta, -alpha );
            TakeBackMove();
            if( score >= beta )
                return score;
            if( score > alpha )
                alpha = score;
        }
        return alpha;
    }
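Returning beta on a cutoff is usually called "fail-hard", and returning the actual value is called "fail-soft". The sketch below runs both variants on a tiny hand-built tree so the difference is visible. `Node`, `QuiescenceDemo` and the `failSoft` flag are hypothetical names made up for this illustration; they are not part of the code above, and `Node.Eval` is assumed to already be from the side to move's perspective.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    // Static evaluation, assumed to be from the side to move's perspective.
    public double Eval;
    // Positions reachable by one capture (the "noisy" moves).
    public List<Node> Captures = new List<Node>();
}

public static class QuiescenceDemo
{
    public static double Quiesce(Node node, double alpha, double beta, bool failSoft)
    {
        double best = node.Eval; // stand pat
        if (best >= beta)
        {
            // Fail-hard clamps to beta; fail-soft reports the real value.
            return failSoft ? best : beta;
        }
        alpha = Math.Max(alpha, best);
        foreach (Node child in node.Captures)
        {
            double score = -Quiesce(child, -beta, -alpha, failSoft);
            if (score >= beta)
            {
                return failSoft ? score : beta;
            }
            best = Math.Max(best, score);
            alpha = Math.Max(alpha, score);
        }
        return failSoft ? best : alpha;
    }
}
```

For a leaf with `Eval = 5` searched with the window (-1, 1), the fail-hard variant returns 1 and the fail-soft variant returns 5. With a full window (`double.NegativeInfinity`, `double.PositiveInfinity`) no cutoff can occur, so both variants return the same value. The two only differ in what they report when a cutoff happens.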