Question

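I am writing an AI for the game 2048 in Python. It is much slower than I expected. With the depth limit set to 5, it still takes several seconds to get an answer. At first I thought my implementations of all the functions were garbage, but I found the real reason: the search tree has more leaves than should even be possible.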

Here is a typical result (I counted leaves, branches, and expansions):

111640 leaves, 543296 branches, 120936 expansions
Branching factor: 4.49242574585
Expected max leaves = 4.49242574585^5 = 1829.80385192 leaves

And another, for good measure:

99072 leaves, 488876 branches, 107292 expansions
Branching factor: 4.55650001864
Expected max leaves = 4.55650001864^5 = 1964.06963743 leaves

As you can see, the search tree has far more leaves than it would have with naive minimax. What is going on here? My algorithm is posted below:

# Generate constants
import sys
posInfinity = sys.float_info.max
negInfinity = -sys.float_info.max

# Returns the direction of the best move given current state and depth limit
def bestMove(grid, depthLimit):
    global limit
    limit = depthLimit
    moveValues = {}
    # Match each move to its minimax value
    for move in Utils2048.validMoves(grid):
        gridCopy = [row[:] for row in grid]
        Utils2048.slide(gridCopy, move)
        # Evaluate the grid reached after the move, not the original grid
        moveValues[move] = minValue(gridCopy, negInfinity, posInfinity, 1)
    # Return the move that has the maximum value
    return max(moveValues, key=moveValues.get)

# Returns the maximum utility when the player moves
def maxValue(grid, a, b, depth):
    successors = Utils2048.maxSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = negInfinity
    for successor in successors:
        value = max(value, minValue(successor, a, b, depth + 1))
        if value >= b:
            return value
        a = max(a, value)
    return value

# Returns the minimum utility when the computer moves
def minValue(grid, a, b, depth):
    successors = Utils2048.minSuccessors(grid)
    if len(successors) == 0 or limit < depth:
        return Evaluator.evaluate(grid)
    value = posInfinity
    for successor in successors:
        value = min(value, maxValue(successor, a, b, depth + 1))
        if value <= a:
            return value
        b = min(b, value)
    return value

Could somebody please help me? I have gone over this code many times, but I cannot pinpoint the error.
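For reference, the figures quoted above appear to follow from branching factor = branches / expansions and expected leaves = branching factor ^ depth limit. The sketch below reproduces that arithmetic and then counts leaves on a hypothetical uniform tree using the same `limit < depth` test as the posted code, starting at depth 1 as `bestMove` does. The uniform tree, the branching factor of 4, and the helper names are assumptions for illustration; no pruning is modelled.

```python
def branching_factor(branches, expansions):
    """Average number of children per expanded node."""
    return branches / expansions


def count_leaves(limit, branching, depth=1):
    """Count leaves of a uniform tree cut off by the `limit < depth` test."""
    if limit < depth:
        return 1  # this node would be handed to the evaluator
    return sum(count_leaves(limit, branching, depth + 1) for _ in range(branching))


# Figures from the first run quoted above.
bf = branching_factor(543296, 120936)
expected = bf ** 5

# Leaves below ONE root move for limit = 5 and an assumed branching factor of 4.
per_root_move = count_leaves(5, 4)
# bestMove repeats the search once per valid move (assumed to be 4 here).
total = 4 * per_root_move

print(bf, expected, per_root_move, total)
```

Under these assumptions each root move contributes 4 ** 5 = 1024 leaves and the whole search 4 ** 6 = 4096: the recursion expands nodes at depths 1 through 5, and `bestMove` contributes one more ply on top. The real 2048 tree is not uniform either. If `Utils2048.minSuccessors` yields one successor per empty cell and tile value (an assumption, since it is not shown), the computer's nodes can branch far more than the player's four moves, so a single average may not describe the tree well.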

Answer 1

Apparently, you are comparing value against b (β) and a (α). The comparisons in your code are:

def maxValue(grid, a, b, depth):
    .....
    .....
        if value >= b:
            return value
        a = max(a, value)
    return value

and

def minValue(grid, a, b, depth):
    .....
    .....
        if value <= a:
            return value
        b = min(b, value)
    return value

However, the condition for alpha-beta pruning is that we no longer need to traverse the search tree once alpha grows beyond beta, i.e. alpha > beta.

So it should be:

def maxValue(grid, a, b, depth):
    ....
    .....
        a = max(a, value)
        if a > b:
            return value

    return value

and

def minValue(grid, a, b, depth):
    .....
    .....
        b = min(b, value)
        if b < a:
            return value

    return value

This is an edge case that you missed, because a (α) and b (β) are not necessarily equal to value.
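The two cutoff tests discussed above can be compared on a small hand-made tree. The following is a minimal sketch, not the asker's code: the nested-list tree, the `strict` flag, and the helper names are all assumptions made for illustration.

```python
import math


def minimax(node, maximizing=True):
    """Plain minimax over a tree of nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, a, b, maximizing, strict, stats):
    """Alpha-beta search; stats["leaves"] counts evaluated leaves.

    strict=False: cut off on value >= b / value <= a (as in the question).
    strict=True:  cut off on a > b / b < a (as in the answer).
    """
    if not isinstance(node, list):
        stats["leaves"] += 1
        return node
    value = -math.inf if maximizing else math.inf
    for child in node:
        child_value = alphabeta(child, a, b, not maximizing, strict, stats)
        if maximizing:
            value = max(value, child_value)
            if not strict and value >= b:
                break
            a = max(a, value)
            if strict and a > b:
                break
        else:
            value = min(value, child_value)
            if not strict and value <= a:
                break
            b = min(b, value)
            if strict and b < a:
                break
    return value


def search(tree, strict):
    """Run alpha-beta from a maximizing root; return (value, leaves evaluated)."""
    stats = {"leaves": 0}
    value = alphabeta(tree, -math.inf, math.inf, True, strict, stats)
    return value, stats["leaves"]


# Hypothetical tree: a max root over three min nodes. The 3 in the second
# min node ties with alpha once the first subtree has been searched.
TREE = [[3, 12, 8], [3, 4, 6], [14, 5, 2]]

print(minimax(TREE))
print(search(TREE, strict=False))
print(search(TREE, strict=True))
```

On this tree both variants return the plain minimax value, 3. They evaluate 7 and 9 leaves respectively: the counts differ only at the second min node, where the first leaf ties with alpha and the `<=` test stops one step earlier than `b < a`.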
