Question

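I am working on a project with the OpenAI Gym Blackjack environment, but the rules seem a bit strange. The state is a tuple of:

The player's current sum: 0, 1, ..., 31.
The dealer's face-up card: 1, ..., 10.
Whether the player holds a usable ace (no = 0, yes = 1).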

The agent has two actions:

stick = 0
hit = 1

Rewards:

+1: win
-1: lose
0: draw
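The encoding above can be made easier to read with a couple of small helpers. This is only a sketch based on the state, action, and reward lists in this question; the names `ACTIONS`, `describe_state`, and `describe_outcome` are hypothetical and not part of Gym.

```python
from typing import Tuple

# Action codes as listed above (hypothetical helper table, not a Gym API).
ACTIONS = {0: "stick", 1: "hit"}


def describe_state(state: Tuple[int, int, bool]) -> str:
    """Turn (player sum, dealer's face-up card, usable ace) into a sentence."""
    player_sum, dealer_card, usable_ace = state
    ace = "with" if usable_ace else "without"
    return f"player sum {player_sum}, dealer shows {dealer_card}, {ace} a usable ace"


def describe_outcome(reward: float) -> str:
    """Map the final reward to win / lose / draw, keeping the draw case separate."""
    if reward > 0:
        return "win"
    if reward < 0:
        return "lose"
    return "draw"


if __name__ == "__main__":
    print(describe_state((18, 7, False)))
    print(ACTIONS[0], "->", describe_outcome(0.0))
```

Keeping the zero-reward case separate means a draw is not reported as a loss.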

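import gym

# Assumption: the question never shows how env is created; 'Blackjack-v0' is
# the environment id in the classic Gym releases that use this step() API.
env = gym.make('Blackjack-v0')

for i_episode in range(3):
    state = env.reset()
    while True:
        print(state)
        action = env.action_space.sample()  # random action: 0 = stick, 1 = hit
        print(action)
        state, reward, done, info = env.step(action)
        if done:
            print('End game! Reward: ', reward)
            # Note: a draw (reward == 0) also falls into the "lost" branch here.
            print('You won :)\n') if reward > 0 else print('You lost :(\n')
            break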

This is what happens when we run the game:

(18, 7, False)
0
End game! Reward:  0.0
You lost :(

(19, 7, False)
0
End game! Reward:  1.0
You won :)

(18, 8, False)
1
End game! Reward:  -1
You lost :(

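I don't really get it. Take the first game: we have a sum of 18, the dealer's face-up card is 7, and we lose? That makes no sense without knowing the dealer's sum, which would have to be greater than 18, but we simply don't know it.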

Blackjack OpenAI Gym rules

0 answers: