Minimax algorithm for TicTacToe not working

Time: 2016-07-27 15:18:51

Tags: c artificial-intelligence tic-tac-toe minimax

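I have already posted a similar question on this forum, but since the old post got rather long and I have rewritten my algorithm, I am starting this new post. The old post can be found here.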

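I am trying to implement a minimax algorithm for my TicTacToe game, but it is turning out to be quite difficult: even after several days of searching for the error, I cannot find it. You can find my code below. First, I have some defines, typedefs and declarations: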

#include <stdio.h>

typedef signed char s8;
typedef unsigned char u8;
typedef s8 score;

#define STATE_00    getBoardState(0, 0)
#define STATE_01    getBoardState(0, 1)
#define STATE_02    getBoardState(0, 2)
#define STATE_10    getBoardState(1, 0)
#define STATE_11    getBoardState(1, 1)
#define STATE_12    getBoardState(1, 2)
#define STATE_20    getBoardState(2, 0)
#define STATE_21    getBoardState(2, 1)
#define STATE_22    getBoardState(2, 2)

typedef enum {
    EPlayerX = 1,
    EPlayerO
} EPlayer;

typedef enum {
    EMinimizing = 0,
    EMaximizing
} EMinMax;

static u8 g_boardState[3][3] = {
    {0, 0, 0,},
    {0, 0, 0,},
    {0, 0, 0,},
};

These are followed by some helper functions:

u8 getBoardState(u8 row, u8 column);

EPlayer isWon(void)
{
    EPlayer winningBoards[8][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
        {STATE_00, STATE_10, STATE_20},
        {STATE_01, STATE_11, STATE_21},
        {STATE_02, STATE_12, STATE_22},
        {STATE_00, STATE_11, STATE_22},
        {STATE_20, STATE_11, STATE_02},
    };
    u8 i;
    for(i=0; i<8; i++){
        if( (winningBoards[i][0] != 0) &&
            (winningBoards[i][0] == winningBoards[i][1]) &&
            (winningBoards[i][0] == winningBoards[i][2])){
                return winningBoards[i][0];
        }
    }
    return 0;
}

u8 getBoardState(u8 row, u8 column)
{
    return g_boardState[row][column];
}

void setBoardState(u8 row, u8 column, u8 state)
{
    g_boardState[row][column] = state;
}

u8 isDraw(void)
{
    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                return 0;
            }
        }
    }
    return 1;
}

void dumpTable(score table[3][3])
{
    int i, j;
    for(i=0; i<3; i++) {
        printf("\n");
        for(j=0; j<3; j++){
            printf("%6i ", table[i][j]);
        }
    }
    printf("\n");
}

EPlayer playerSwitch(EPlayer player)
{
    if(player == EPlayerO) return EPlayerX;
    if(player == EPlayerX) return EPlayerO;
    return 0;
}

EMinMax modeSwitch(EMinMax mode)
{
    if(mode == EMaximizing) return EMinimizing;
    if(mode == EMinimizing) return EMaximizing;
    return 0;
}

Then there is the actual minimax algorithm, called scoring:

score scoring(EMinMax mode, EPlayer player, u8 depth)
{
    score thisScore, tempScore;
    if(mode == EMaximizing){
        thisScore = -20;
        if(isWon()) return 15-depth;
    }
    if(mode == EMinimizing){
        thisScore = 20;
        if(isWon()) return depth-15;
    }
    if(isDraw()){
        return 0;
    }

    u8 i, j;
    for(i=0; i<3; i++){
        for(j=0; j<3; j++){
            if(getBoardState(i, j) == 0){
                setBoardState(i, j, player);
                tempScore = scoring(modeSwitch(mode), playerSwitch(player), depth+1);
                if((mode == EMaximizing) && (tempScore > thisScore)){
                    thisScore = tempScore;
                }
                if((mode == EMinimizing) && (tempScore < thisScore)){
                    thisScore = tempScore;
                }
                setBoardState(i, j, 0);
            }
        }
    }

    return thisScore;
}

Finally, a function that prints the scores in a table, along with main:

void printSocredBoards(EPlayer player)
{   
    score thisScore[3][3] = {
        {STATE_00, STATE_01, STATE_02},
        {STATE_10, STATE_11, STATE_12},
        {STATE_20, STATE_21, STATE_22},
    };
    int i, j;
    if((isWon() == 0) && (isDraw() == 0)){
        for(i=0; i<3; i++){
            for(j=0; j<3; j++){
                if(getBoardState(i, j) == 0){
                    setBoardState(i, j, player);
                    thisScore[i][j] = scoring(EMaximizing, playerSwitch(player), 0);
                    setBoardState(i, j, 0);
                }
            }
        }
    }
    dumpTable(thisScore);
}

int main(int argc, char **argv)
{

    printSocredBoards(EPlayerO);

    return 0;
}

As far as I can tell, this algorithm should work fine, but it gives me a nonsensical output:

 7      7      7 
 7      0      7 
 7      7      7 

What am I missing? Thanks in advance for your help.

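1 Answer:

Answer 0: (score: 1)

I believe the problem is in this chunk of code in scoring, where your two cases are flipped relative to the correct return values: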

if(mode == EMaximizing){
    thisScore = -20;
    if(isWon()) return 15-depth;
}
if(mode == EMinimizing){
    thisScore = 20;
    if(isWon()) return depth-15;
}

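Intuitively, the problem is that when scoring reaches this point in the code, the call to isWon is evaluating the result of the previous piece placement, which was made under the other choice of mode.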

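For example, when scoring is called with EMaximizing and the board state is already won, the EMinimizing player is the one who has won in this state, and the returned score should reflect that (i.e. it should be negative). Since depth reaches a maximum value of 8, your return value of 15-depth when mode == EMaximizing is always positive, which is not what you want.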

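When the cases are swapped, your program outputs all zeros, which seems more reasonable, since perfect players should always draw. I also tested the code with the following line added at the top of printSocredBoards, to hardcode the first move in the top-left corner: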

setBoardState(0, 0, playerSwitch(player));

This produces the following result:

 0     10     10 
10      0     10 
10     10     10 

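This correctly identifies that the second player cannot choose the top-left corner and will lose if their opening move is anything other than the center.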