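I'm trying to implement the minimax algorithm for tic-tac-toe, but I'm not sure how to actually choose a move once I have the min/max values for all the game states. I understand that you're supposed to see which path leads to the most wins, but I don't know where to start.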
    def minimax(game_state):
        # With no moves left there is nothing to search: score the position directly.
        if not game_state.available_moves():
            return evaluate(game_state)
        else:
            return max_play(game_state)
    def evaluate(game_state):
        if game_state.has_won(game_state.next_player):
            return 1
        elif game_state.has_won(game_state.opponent()):
            return -1
        else:
            return 0
    def min_play(game_state):
        if game_state.available_moves() == []:
            return evaluate(game_state)
        else:
            moves = game_state.available_moves()
            # Start above every real score so the first move always improves on it.
            best_score = float('inf')
            for move in moves:
                clone = game_state.make_move(move)
                score = max_play(clone)
                if score < best_score:
                    best_move = move
                    best_score = score
            return best_score
    def max_play(game_state):
        if game_state.available_moves() == []:
            return evaluate(game_state)
        else:
            moves = game_state.available_moves()
            # Start below every real score so the first move always improves on it.
            best_score = float('-inf')
            for move in moves:
                clone = game_state.make_move(move)
                score = min_play(clone)
                if score > best_score:
                    best_move = move
                    best_score = score
            return best_score
Answer 0 (score: 1)
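The top level is really simple: all you need to remember is the best move found at the current search depth. Once a depth has been evaluated completely, store that move as the overall best ("bestest") and search again with a deeper tree. By the way, the number of wins along a path doesn't matter; a win is a win.
Pseudocode for this case: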
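    bestest_move = None
    try:
        for depth in range(1, max_depth + 1):
            best_move = None
            best_score = float('-inf')
            for move in possible_moves:
                # Pseudocode: score the position after `move`, searching `depth` plies deep.
                score = evaluate(move, depth)
                if score > best_score:
                    best_move = move
                    best_score = score
            # This depth finished in time, so its result is safe to keep.
            bestest_move = best_move
    except Timeout:
        pass
    game_state.make_move(bestest_move)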
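To make the pseudocode above concrete, here is a minimal runnable sketch of the same top-level loop. It is not the asker's `game_state` class: it assumes the board is a 9-character string of `"X"`, `"O"` and `" "`, and every helper here (`winner`, `available_moves`, `make_move`, `other`, `search`, `choose_move`, the `Timeout` exception and the `time_limit` parameter) is a hypothetical name made up for the illustration. Positions cut off by the depth limit are simply scored as 0, which is an assumption rather than a proper heuristic.

```python
import time

# All eight winning lines on a 3x3 board stored as a 9-character string.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


class Timeout(Exception):
    """Raised when the search runs past its deadline (hypothetical helper)."""


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def available_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def make_move(board, index, player):
    """Return a new board with `player` placed at `index`."""
    return board[:index] + player + board[index + 1:]


def other(player):
    return "O" if player == "X" else "X"


def search(board, to_move, me, depth, deadline):
    """Score `board` from the point of view of `me`, at most `depth` plies deep."""
    if time.monotonic() > deadline:
        raise Timeout
    won = winner(board)
    if won is not None:
        return 1 if won == me else -1
    moves = available_moves(board)
    if not moves or depth == 0:
        return 0  # a draw, or a depth cut-off treated as neutral (assumption)
    scores = [search(make_move(board, m, to_move), other(to_move), me,
                     depth - 1, deadline)
              for m in moves]
    # `me` picks the best score for itself, the opponent picks the worst.
    return max(scores) if to_move == me else min(scores)


def choose_move(board, me, max_depth=9, time_limit=1.0):
    """Top level: remember the move, not just the score, for each finished depth."""
    deadline = time.monotonic() + time_limit
    bestest_move = None
    try:
        for depth in range(1, max_depth + 1):
            best_move, best_score = None, float("-inf")
            for m in available_moves(board):
                score = search(make_move(board, m, me), other(me), me,
                               depth - 1, deadline)
                if score > best_score:
                    best_move, best_score = m, score
            bestest_move = best_move  # only overwritten by a completed depth
    except Timeout:
        pass
    return bestest_move
```

For example, `choose_move("XX OO    ", "X")` returns `2`, the cell that completes the top row. The point is that only the root loop needs to track which move produced the best score; the recursive `search` can keep returning plain scores, exactly like `min_play` and `max_play` in the question.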