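I am trying to learn about artificial intelligence and how to implement it in a program. The easiest place to start is probably with simple games (in this case Tic-Tac-Toe) and game search trees (recursive calls, not an actual data structure). I found this video of a lecture on the topic very useful.
The problem I am having is that the first call to the algorithm takes a very long time (about 15 seconds) to execute. I have placed debugging log output throughout the code, and it seems that it is calling parts of the algorithm an excessive number of times.
Here is the method for choosing the best move for the computer: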
public Best chooseMove(boolean side, int prevScore, int alpha, int beta){
    Best myBest = new Best();
    Best reply;
    if (prevScore == COMPUTER_WIN || prevScore == HUMAN_WIN || prevScore == DRAW){
        myBest.score = prevScore;
        return myBest;
    }
    if (side == COMPUTER){
        myBest.score = alpha;
    }else{
        myBest.score = beta;
    }
    Log.d(TAG, "Alpha: " + alpha + " Beta: " + beta + " prevScore: " + prevScore);
    Move[] moveList = myBest.move.getAllLegalMoves(board);
    for (Move m : moveList){
        String choice;
        if (side == HUMAN){
            choice = playerChoice;
        }else if (side == COMPUTER && playerChoice.equals("X")){
            choice = "O";
        }else{
            choice = "X";
        }
        Log.d(TAG, "Current Move: column- " + m.getColumn() + " row- " + m.getRow());
        int p = makeMove(m, choice, side);
        reply = chooseMove(!side, p, alpha, beta);
        undoMove(m);
        if ((side == COMPUTER) && (reply.score > myBest.score)){
            myBest.move = m;
            myBest.score = reply.score;
            alpha = reply.score;
        }else if((side == HUMAN) && (reply.score < myBest.score)){
            myBest.move = m;
            myBest.score = reply.score;
            beta = reply.score;
        }//end of if-else statement
        if (alpha >= beta) return myBest;
    }//end of for loop
    return myBest;
}
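The makeMove method makes the move if the spot is empty and returns a value (-1: human win, 0: draw, 1: computer win, -2 or 2: otherwise). However, I believe the error may be in the getAllLegalMoves method: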
public Move[] getAllLegalMoves(String[][] grid){
    //I'm unsure whether this method really belongs in this class or in the grid class, though, either way it shouldn't matter.
    items = 0;
    moveList = null;
    Move move = new Move();
    for (int i = 0; i < 3; i++){
        for(int j = 0; j < 3; j++){
            Log.d(TAG, "At Column: " + i + " At Row: " + j);
            if(grid[i][j] == null || grid[i][j].equals("")){
                Log.d(TAG, "Is Empty");
                items++;
                if(moveList == null || moveList.length < items){
                    resize();
                }//end of second if statement
                move.setRow(j);
                move.setColumn(i);
                moveList[items - 1] = move;
            }//end of first if statement
        }//end of second loop
    }//end of first loop
    for (int k = 0; k < moveList.length; k++){
        Log.d(TAG, "Count: " + k + " Column: " + moveList[k].getColumn() + " Row: " + moveList[k].getRow());
    }
    return moveList;
}
private void resize(){
    Move[] b = new Move[items];
    for (int i = 0; i < items - 1; i++){
        b[i] = moveList[i];
    }
    moveList = b;
}
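To sum it all up: what is it about my call to choose the best move that takes so long? What am I missing? Is there an easier way to implement this algorithm? Any help or suggestions would be greatly appreciated, thanks!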
Answer 0 (score: 7)
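A minimax tree with alpha-beta pruning should be visualized as a tree: each node is a possible position some number of turns into the future, and its children are all the moves that can be made from it.
To be as fast as possible, and to guarantee that you only need space linear in the number of moves you look ahead, you do a depth-first search and "sweep" from one side to the other. That is, if you imagine the whole tree being constructed, your program only ever builds a single strand from the root to a leaf at a time, and discards any part of it that it is done with.
I'm just going to copy the Wikipedia pseudocode at this point, because it is really succinct and clear: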
function alphabeta(node, depth, α, β, Player)
    if depth = 0 or node is a terminal node
        return score
    if Player = MaxPlayer
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break (* Beta cut-off *)
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break (* Alpha cut-off *)
        return β
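Notes:
- "for each child of node": rather than editing the state of the current board, create an entirely new board that is the result of applying the move. By using immutable objects, your code will generally be less prone to bugs and easier to reason about.
- To use this method, call it for every possible move you can make from the current state, giving it depth - 1, -Infinity for alpha and +Infinity for beta. In each of these calls it should start with the non-moving player's turn. The call that returns the highest value is the best move to take.
It is conceptually very simple. If you code it right, you never instantiate more than (depth) boards at once, you never consider pointless branches, and so on.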
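As an illustration of the two notes above, here is a minimal Java sketch of the pseudocode applied to Tic-Tac-Toe. The class names `ImmutableBoard` and `AlphaBetaSketch`, the 9-character string encoding of the board, and the choice of X as the maximizing player are all assumptions made for this example, not part of the question's code. The depth parameter is left out because a Tic-Tac-Toe game lasts at most nine moves, so the search can simply run to the terminal positions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical immutable position: nine cells holding 'X', 'O' or ' '. */
final class ImmutableBoard {
    private static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}               // diagonals
    };
    private final char[] cells;

    ImmutableBoard(String nineCells) { this.cells = nineCells.toCharArray(); }

    /** Returns a new board with the move applied; this board is left unchanged. */
    ImmutableBoard with(int index, char mark) {
        char[] copy = cells.clone();
        copy[index] = mark;
        return new ImmutableBoard(new String(copy));
    }

    List<Integer> emptyCells() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            if (cells[i] == ' ') result.add(i);
        }
        return result;
    }

    /** 'X' or 'O' if that mark has three in a row, otherwise ' '. */
    char winner() {
        for (int[] l : LINES) {
            if (cells[l[0]] != ' ' && cells[l[0]] == cells[l[1]] && cells[l[1]] == cells[l[2]]) {
                return cells[l[0]];
            }
        }
        return ' ';
    }
}

final class AlphaBetaSketch {
    /** Score from X's point of view: +1 X wins, 0 draw, -1 O wins. */
    static int alphabeta(ImmutableBoard node, int alpha, int beta, boolean maxPlayer) {
        char w = node.winner();
        if (w == 'X') return 1;
        if (w == 'O') return -1;
        List<Integer> moves = node.emptyCells();
        if (moves.isEmpty()) return 0; // full board and nobody won: draw
        for (int m : moves) {
            ImmutableBoard child = node.with(m, maxPlayer ? 'X' : 'O');
            int score = alphabeta(child, alpha, beta, !maxPlayer);
            if (maxPlayer) alpha = Math.max(alpha, score);
            else beta = Math.min(beta, score);
            if (beta <= alpha) break; // cut-off: the other side will avoid this branch
        }
        return maxPlayer ? alpha : beta;
    }

    /** Root call with X to move: tries every move and keeps the highest-scoring one. */
    static int bestMoveForX(ImmutableBoard board) {
        int best = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int m : board.emptyCells()) {
            // After X moves it is O's turn, so the child call starts as the minimizing player.
            int score = alphabeta(board.with(m, 'X'), Integer.MIN_VALUE, Integer.MAX_VALUE, false);
            if (score > bestScore) {
                bestScore = score;
                best = m;
            }
        }
        return best; // -1 if the board has no empty cell
    }

    public static void main(String[] args) {
        // X holds cells 0 and 1, O holds cells 3 and 4; X can win at cell 2.
        System.out.println(bestMoveForX(new ImmutableBoard("XX OO    "))); // prints 2
    }
}
```

Because each recursive call holds only its own child board, at most one board per level of the current path is alive at any time, which matches the "never more than (depth) boards" remark.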
Answer 1 (score: 0)
I'm not going to profile your code for you, but since this is such a nice coding kata, I wrote a small Tic-Tac-Toe AI:
import java.math.BigDecimal;

public class Board {
    /**
     * -1: opponent
     * 0: empty
     * 1: player
     */
    int[][] cells = new int[3][3];

    /**
     * the best move calculated by eval(), or -1 if no more moves are possible
     */
    int bestX, bestY;

    int winner() {
        // row
        for (int y = 0; y < 3; y++) {
            if (cells[0][y] == cells[1][y] && cells[1][y] == cells[2][y]) {
                if (cells[0][y] != 0) {
                    return cells[0][y];
                }
            }
        }
        // column
        for (int x = 0; x < 3; x++) {
            if (cells[x][0] == cells[x][1] && cells[x][1] == cells[x][2]) {
                if (cells[x][0] != 0) {
                    return cells[x][0];
                }
            }
        }
        // 1st diagonal
        if (cells[0][0] == cells[1][1] && cells[1][1] == cells[2][2]) {
            if (cells[0][0] != 0) {
                return cells[0][0];
            }
        }
        // 2nd diagonal
        if (cells[2][0] == cells[1][1] && cells[1][1] == cells[0][2]) {
            if (cells[2][0] != 0) {
                return cells[2][0];
            }
        }
        return 0; // nobody has won
    }

    /**
     * @return 1 if side wins, 0 for a draw, -1 if opponent wins
     */
    int eval(int side) {
        int winner = winner();
        if (winner != 0) {
            return side * winner;
        } else {
            int bestX = -1;
            int bestY = -1;
            int bestValue = Integer.MIN_VALUE;
            loop:
            for (int y = 0; y < 3; y++) {
                for (int x = 0; x < 3; x++) {
                    if (cells[x][y] == 0) {
                        cells[x][y] = side;
                        int value = -eval(-side);
                        cells[x][y] = 0;
                        if (value > bestValue) {
                            bestValue = value;
                            bestX = x;
                            bestY = y;
                            if (bestValue == 1) {
                                // it won't get any better, we might as well stop thinking
                                break loop;
                            }
                        }
                    }
                }
            }
            this.bestX = bestX;
            this.bestY = bestY;
            if (bestValue == Integer.MIN_VALUE) {
                // there were no moves left, it must be a draw!
                return 0;
            } else {
                return bestValue;
            }
        }
    }

    void move(int side) {
        eval(side);
        if (bestX == -1) {
            return;
        }
        cells[bestX][bestY] = side;
        System.out.println(this);
        int w = winner();
        if (w != 0) {
            System.out.println("Game over!");
        } else {
            move(-side);
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        char[] c = {'O', ' ', 'X'};
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                sb.append(c[cells[x][y] + 1]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Board b = new Board();
        b.move(1);
        long end = System.nanoTime();
        System.out.println(new BigDecimal(end - start).movePointLeft(9));
    }
}
The astute reader will have noticed that I don't use alpha/beta cut-off. Still, on my somewhat dated notebook, this plays through a game in 0.015 seconds...
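For comparison, a cut-off can be added to a negamax-style search like this one by passing a window and negating it on each recursive call. The following is only a sketch under the same -1/0/1 cell encoding. The class `NegamaxSketch`, the `prune` switch and the `nodes` counter are hypothetical names introduced here so the two variants can be compared, and the window bounds -2 and 2 are chosen simply because they lie outside the possible scores and can be negated safely.

```java
final class NegamaxSketch {
    final int[][] cells = new int[3][3]; // -1 opponent, 0 empty, 1 player
    long nodes;                          // positions visited, for comparison only

    int winner() {
        for (int i = 0; i < 3; i++) {
            if (cells[i][0] != 0 && cells[i][0] == cells[i][1] && cells[i][1] == cells[i][2]) {
                return cells[i][0];
            }
            if (cells[0][i] != 0 && cells[0][i] == cells[1][i] && cells[1][i] == cells[2][i]) {
                return cells[0][i];
            }
        }
        boolean diag1 = cells[0][0] == cells[1][1] && cells[1][1] == cells[2][2];
        boolean diag2 = cells[2][0] == cells[1][1] && cells[1][1] == cells[0][2];
        if (cells[1][1] != 0 && (diag1 || diag2)) {
            return cells[1][1];
        }
        return 0; // nobody has won
    }

    /**
     * Value of the position for side (1 or -1): 1 win, 0 draw, -1 loss.
     * With prune == false the window is ignored and every move is searched.
     */
    int negamax(int side, int alpha, int beta, boolean prune) {
        nodes++;
        int w = winner();
        if (w != 0) {
            return side * w;
        }
        int best = Integer.MIN_VALUE;
        for (int x = 0; x < 3; x++) {
            for (int y = 0; y < 3; y++) {
                if (cells[x][y] != 0) continue;
                cells[x][y] = side;
                // The opponent sees the window mirrored: (-beta, -alpha).
                int value = -negamax(-side, -beta, -alpha, prune);
                cells[x][y] = 0;
                best = Math.max(best, value);
                alpha = Math.max(alpha, value);
                if (prune && alpha >= beta) {
                    return best; // cut-off: the opponent will not allow this line
                }
            }
        }
        return best == Integer.MIN_VALUE ? 0 : best; // no empty cell left: draw
    }

    public static void main(String[] args) {
        NegamaxSketch full = new NegamaxSketch();
        NegamaxSketch pruned = new NegamaxSketch();
        System.out.println("value without cut-off: " + full.negamax(1, -2, 2, false));
        System.out.println("value with cut-off:    " + pruned.negamax(1, -2, 2, true));
        System.out.println("nodes visited: " + full.nodes + " vs " + pruned.nodes);
    }
}
```

Both variants should agree on the value of the empty board; only the number of visited positions differs.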
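Without profiling your code, I can't say for sure what the problem is. However, logging every possible move at every node in the search tree might have something to do with it.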
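One way to keep diagnostic messages without paying for them on every node is to build the message only when that log level is enabled. The sketch below uses `java.util.logging` from the standard library as a stand-in. The question's code uses Android's `Log.d`, which is a different API, so the class `QuietSearchLogging`, the logger name, and the `messagesBuilt` counter are assumptions made purely for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class QuietSearchLogging {
    private static final Logger LOG = Logger.getLogger("tictactoe.search");

    /** Counts how many log strings were actually constructed. */
    static long messagesBuilt;

    static String describe(int column, int row) {
        messagesBuilt++;
        return "Current Move: column- " + column + " row- " + row;
    }

    /** Stand-in for the per-move logging inside the search loop. */
    static void visit(int column, int row) {
        // Supplier form: describe() only runs if FINE is enabled for this logger.
        LOG.fine(() -> describe(column, row));
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE messages are disabled
        for (int i = 0; i < 500_000; i++) {
            visit(i % 3, (i / 3) % 3);
        }
        System.out.println("strings built: " + messagesBuilt); // prints "strings built: 0"
    }
}
```

On Android, a comparable effect could be had by guarding each `Log.d` call with a boolean debug constant, or by counting visited nodes and logging a single summary after the search returns.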