Question

Let's assume we have a DataFrame like the one below.

Games    Players    Score
0            Foo      100
             Bar       10
             Baz        5
1           Blah       30
             Bar       10
             Foo        2
2            Foo       40
             Fes        5
             ...

I would like to process it to build a new DataFrame (a matrix) where:

pairwise_comparisons.loc[A, B] = W / T

with

W = # of games where A ended up with higher score than B
T = # of games in which they both participated

How should I go about this?

For example, using only the data shown above, we would fill the matrix as follows:

pairwise_comparisons.loc['Foo', 'Bar'] = 1/2

because Foo and Bar both played in games 0 and 1 (2 games) and Foo won 1 of those games (game 0), so W / T = 1/2.

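I could of course loop over every pair of players manually and compare their scores in each game, but that would probably be slow. Any ideas on how to vectorize the solution?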
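For reference, the manual loop described above might look like the following. This is only a baseline sketch: the names `games`, `wide`, and `loop_result` are made up for illustration, and the sample data is limited to the rows shown in the table.

```python
from itertools import permutations

import pandas as pd

# Sample data from the table above, in long form (one row per game/player).
games = pd.DataFrame({
    "Games":   [0, 0, 0, 1, 1, 1, 2, 2],
    "Players": ["Foo", "Bar", "Baz", "Blah", "Bar", "Foo", "Foo", "Fes"],
    "Score":   [100, 10, 5, 30, 10, 2, 40, 5],
})

# One row per game, one column per player; NaN means "did not play".
wide = games.pivot(index="Games", columns="Players", values="Score")

players = wide.columns
loop_result = pd.DataFrame(float("nan"), index=players, columns=players)

# Visit every ordered pair (a, b) and compare scores game by game.
for a, b in permutations(players, 2):
    both = wide[a].notna() & wide[b].notna()   # games where both took part
    t = both.sum()                             # T
    if t:
        w = (wide.loc[both, a] > wide.loc[both, b]).sum()   # W
        loop_result.loc[a, b] = w / t

print(loop_result.loc["Foo", "Bar"])  # 0.5
```

Pairs that never met in a game stay NaN. The loop runs once per ordered pair of players, which is the cost a vectorized version would try to avoid.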

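A variation of the above: instead of computing W / T, we could store the median difference in score between A and B over the games in which they both participated.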
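One way this variant might be computed is with NumPy broadcasting and `np.nanmedian`. This is a sketch under the assumption that a missing score is represented as NaN; the names `wide`, `diff`, and `median_diff` are illustrative only.

```python
import warnings

import numpy as np
import pandas as pd

s = pd.Series({
    (0, 'Bar'): 10, (0, 'Baz'): 5, (0, 'Foo'): 100,
    (1, 'Bar'): 10, (1, 'Blah'): 30, (1, 'Foo'): 2,
    (2, 'Fes'): 5, (2, 'Foo'): 40,
})

wide = s.unstack()                      # games x players, NaN = did not play
v = wide.to_numpy(dtype=float)

# diff[g, a, b] = score of a minus score of b in game g.
# The result is NaN whenever either player was absent from game g.
diff = v[:, :, None] - v[:, None, :]

# Blank out the diagonal so a player is not compared with themselves.
idx = np.arange(v.shape[1])
diff[:, idx, idx] = np.nan

# nanmedian warns on all-NaN slices (pairs that never met); silence that.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    median = np.nanmedian(diff, axis=0)

median_diff = pd.DataFrame(median, index=wide.columns, columns=wide.columns)
print(median_diff.loc["Foo", "Bar"])  # 41.0
```

For Foo and Bar the per-game differences are 90 and -8, so the median is 41. Pairs that never shared a game come out as NaN.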

Answer 1

Setup

import numpy as np
import pandas as pd

s = pd.Series({
        (0, 'Bar'): 10,
        (0, 'Baz'): 5,
        (0, 'Foo'): 100,
        (1, 'Bar'): 10,
        (1, 'Blah'): 30,
        (1, 'Foo'): 2,
        (2, 'Fes'): 5,
        (2, 'Foo'): 40
    })

df = s.unstack()
v = df.values
m, n = v.shape
nrng = np.arange(n)

# who played who
played = (~np.isnan(v))
played_3d = played.reshape(m, 1, n) & played.reshape(m, n, 1)
played_3d[:, nrng, nrng] = False

# who beat who: winners[g, a, b] is True when a outscored b in game g
scores = np.where(played, v, -1)
winners = np.where(
    played_3d,
    scores.reshape(m, n, 1) > scores.reshape(m, 1, n),
    0
)

# how many times have we played each other
games_played = (played_3d).sum(0)
games_won = winners.sum(0)

pairwise = np.empty((n, n), dtype=float)  # np.float was removed in NumPy 1.24
pairwise.fill(np.nan)
r, c = np.where(games_played != 0)
pairwise[r, c] = games_won[r, c] / games_played[r, c]
pairwise_comparisons = pd.DataFrame(pairwise, df.columns, df.columns).stack()

pairwise_comparisons.loc['Foo', 'Bar']

0.5
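As a cross-check, the same W / T ratio can also be sketched in pure pandas with a self-merge on the game id. This is a different technique from the NumPy broadcasting above, not part of the original answer; the frame `games` and the `_a` / `_b` suffixes are illustrative choices.

```python
import pandas as pd

# Same sample data, in long form (one row per game/player).
games = pd.DataFrame({
    "Games":   [0, 0, 0, 1, 1, 1, 2, 2],
    "Players": ["Foo", "Bar", "Baz", "Blah", "Bar", "Foo", "Foo", "Fes"],
    "Score":   [100, 10, 5, 30, 10, 2, 40, 5],
})

# Pair every player with every player in the same game,
# then drop the rows that pair a player with themselves.
pairs = games.merge(games, on="Games", suffixes=("_a", "_b"))
pairs = pairs[pairs["Players_a"] != pairs["Players_b"]].assign(
    a_won=lambda d: d["Score_a"] > d["Score_b"]   # True where A outscored B
)

# The mean of a boolean column is W / T for each ordered pair (A, B).
win_rate = pairs.groupby(["Players_a", "Players_b"])["a_won"].mean()

print(win_rate.loc[("Foo", "Bar")])  # 0.5
```

Pairs that never met are simply absent from `win_rate`; calling `win_rate.unstack()` gives the square matrix with NaN in those cells. The merge creates one row per ordered pair per game, so memory grows quadratically with the number of players in a game.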

1-vs-1 comparisons from multiplayer games
