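I have implemented the MiniMax algorithm (with alpha-beta pruning), but it behaves in an interesting way. My player builds up a huge lead, but when it is time to make the final, winning move, it does not take it and just keeps dragging the game out.
Here is my minimax function: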
    // Game states are represented by Node objects (each holds the move and the board in that state).
    // ValueStep is just a pair holding the minimax value and a game move (step).
    private ValueStep minimax(Node gameState, int depth, int alpha, int beta) {
        // Node.MAXDEPTH is a constant
        if (depth == Node.MAXDEPTH || gameOver(gameState.board)) {
            return new ValueStep(gameState.heuristicValue(), gameState.step);
        }
        // This method definitely works. Child nodes are created with a move, an
        // updated board, and a MAX value that determines whether they are the
        // maximizing or the minimizing player's game states.
        gameState.children = gameState.findPossibleStates();
        if (gameState.MAX) { // maximizing player
            ValueStep best = null;
            for (Node child : gameState.children) {
                ValueStep vs = new ValueStep(minimax(child, depth + 1, alpha, beta).value, child.move);
                // values updated here if needed
                if (best == null || vs.value > best.value) best = vs;
                if (vs.value > alpha) alpha = vs.value;
                if (alpha >= beta) break;
            }
            return best;
        } else { // minimizing player
            ValueStep best = null;
            for (Node child : gameState.children) {
                ValueStep vs = new ValueStep(minimax(child, depth + 1, alpha, beta).value, child.move);
                if (best == null || vs.value < best.value) best = vs;
                if (vs.value < beta) beta = vs.value;
                if (alpha >= beta) break;
            }
            return best;
        }
    }
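At first I thought the problem was in my evaluation function, but if it is, I cannot find it. In this game both players have a score, and my function simply computes the heuristic value from the score difference. Here it is: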
    public int heuristicValue() {
        // I calculate the score difference in this state and save it in
        // the variable scoreDiff. scoreDiff will be positive if I am winning
        // here, negative if I am losing.
        // "this" is a Node object here. If the game is over, special
        // heuristic values are returned, depending on who wins (or whether
        // it is a draw).
        if (gameOver(this.board)) {
            if (scoreDiff > 0) {
                return Integer.MAX_VALUE;
            } else if (scoreDiff == 0) {
                return 0;
            } else {
                return Integer.MIN_VALUE;
            }
        }
        int value = 0;
        // Calculate the heuristic value from the score difference: the higher
        // the difference, the higher the value.
        value += 100 * scoreDiff;
        return value;
    }
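I have "translated" my code into English, so there may be typos. I am fairly sure the problem is somewhere in here, but if you need more code, I will update the question. Again, my player can build an advantage, but for some reason it does not make the final winning move. Thanks for your help!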
Answer 0 (score: 3)
Suppose your Minimax player is in a position where it can prove that it can guarantee a win. There will often be many different ways in which it can still guarantee that eventual win. Some moves may be instant wins, others may drag the game out unnecessarily. As long as a move is not a blunder that suddenly lets the opponent win (or draw), they are all wins, and they all have the same game-theoretic value (Integer.MAX_VALUE in your code).
Your Minimax algorithm does not distinguish between these moves; it simply plays whichever one happens to come first in your gameState.children list. That may be a fast, shallow win, or it may be a slow, very deep win.
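To make the tie concrete, here is a minimal sketch of the same selection rule as the maximizing branch in the question (keep a child only if its value is strictly greater than the best so far). The Candidate class, its fields, and the move labels are hypothetical stand-ins for illustration, not part of the original code.

```java
import java.util.Arrays;
import java.util.List;

final class TieBreakDemo {
    // Hypothetical stand-in for a child node: a move label, its minimax
    // value, and how many plies away the win is (not used by the selection).
    static final class Candidate {
        final String move;
        final int value;
        final int pliesToWin;

        Candidate(String move, int value, int pliesToWin) {
            this.move = move;
            this.value = value;
            this.pliesToWin = pliesToWin;
        }
    }

    // Same rule as the question's loop: a strict '>' comparison, so among
    // equally valued children the first one in the list is kept.
    static Candidate pickBest(List<Candidate> children) {
        Candidate best = null;
        for (Candidate c : children) {
            if (best == null || c.value > best.value) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        List<Candidate> children = Arrays.asList(
            new Candidate("slow win", Integer.MAX_VALUE, 9),
            new Candidate("instant win", Integer.MAX_VALUE, 1));
        System.out.println(pickBest(children).move); // prints "slow win"
    }
}
```

Both candidates carry the same value, so the search has no reason to prefer the instant win; list order alone decides.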
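There are two simple ways to make your Minimax algorithm prefer fast wins over slow wins. One of them is to modify your heuristicValue() function to incorporate the search depth. For example, in winning positions you can return a value that gets slightly smaller the deeper the win is found. In effect, this gives faster wins a larger evaluation.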
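A minimal sketch of a depth-aware terminal score follows. It rests on several assumptions: the current search depth is passed in (the heuristicValue() in the question takes no arguments), the WIN and LOSS constants and the terminalValue helper are hypothetical names, and the non-terminal heuristic (100 * scoreDiff) always stays smaller in magnitude than WIN minus the maximum depth, so that any win still outranks any ordinary position.

```java
final class DepthAwareScore {
    // Assumed constants, chosen only for illustration. They sit far from the
    // int limits, so adjusting them by the depth cannot overflow.
    static final int WIN = 1_000_000;
    static final int LOSS = -WIN;

    // Hypothetical helper: value of a finished game from the maximizing
    // player's point of view, given the depth at which the game ended.
    static int terminalValue(int scoreDiff, int depth) {
        if (scoreDiff > 0) return WIN - depth;  // a win found sooner scores higher
        if (scoreDiff < 0) return LOSS + depth; // a loss postponed longer scores higher
        return 0;                               // draw
    }

    public static void main(String[] args) {
        // A win after 1 ply beats a win after 9 plies.
        System.out.println(terminalValue(5, 1) > terminalValue(5, 9));   // prints "true"
        // A loss after 9 plies is preferred to a loss after 1 ply.
        System.out.println(terminalValue(-5, 9) > terminalValue(-5, 1)); // prints "true"
    }
}
```

With scores like these, two winning children no longer tie: the maximizing player picks the shallower win, and when it is losing it picks the line that holds out longest.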