Question

I have a Python dictionary of the following form:

data[author1][author2] = 1

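The dictionary contains an entry for every possible pair of authors (all pairs among 8,500 authors), and I need to output a matrix for all author pairs that looks like this: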

        "auth1" "auth2" "auth3" "auth4" ...
"auth1"    0       1       0       3
"auth2"    1       0       2       0
"auth3"    0       2       0       1       
"auth4"    3       0       1       0
...

I tried the following:

x = numpy.array([[data[author1][author2] for author2 in sorted(data[author1])] for author1 in sorted(data)])
print x
outf.write(x)

However, printing this gives me:

[[0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 ..., 
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]
 [0 0 0 ..., 0 0 0]]

The output file is just a blank text file. I am trying to format the output so that it can be read into Gephi (https://gephi.org/users/supported-graph-formats/csv-format/).

Answer 1

You almost had it; your list comprehension is inverted. This will give you the expected result:

d = dict(auth1=dict(auth1=0, auth2=1, auth3=0, auth4=3),
         auth2=dict(auth1=1, auth2=0, auth3=2, auth4=0),
         auth3=dict(auth1=0, auth2=2, auth3=0, auth4=1),
         auth4=dict(auth1=3, auth2=0, auth3=1, auth4=0))

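import numpy as np

# Every author appears as both an outer and an inner key,
# so one sorted key list serves for rows and columns.
authors = sorted(d)
np.array([[d[i][j] for i in authors] for j in authors])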
#array([[0, 1, 0, 3],
#       [1, 0, 2, 0],
#       [0, 2, 0, 1],
#       [3, 0, 1, 0]])
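
The question also asks about writing the matrix to a file. Passing the array itself to outf.write does not work, because write expects a string. One option is to write the labelled rows with the standard csv module. This is only a minimal sketch, assuming Python 3; the helper name write_matrix, the file name, and the semicolon delimiter are assumptions made for illustration, so check the delimiter against the Gephi CSV page linked in the question.

```python
import csv

def write_matrix(d, path, delimiter=";"):
    """Write the nested dict d[row][col] as a labelled matrix (hypothetical helper)."""
    authors = sorted(d)
    with open(path, "w", newline="") as outf:
        writer = csv.writer(outf, delimiter=delimiter)
        # Header row: an empty corner cell followed by the column labels.
        writer.writerow([""] + authors)
        # One row per author: the row label followed by the counts.
        for row in authors:
            writer.writerow([row] + [d[row][col] for col in authors])

# Small stand-in for the real 8,500-author dictionary.
d = dict(auth1=dict(auth1=0, auth2=1),
         auth2=dict(auth1=1, auth2=0))
write_matrix(d, "authors_matrix.csv")
```

Writing row by row avoids building one huge string in memory, which may matter for a matrix of this size.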

Answer 2

You can use pandas. Using @Saullo Castro's input:

import pandas as pd        
df = pd.DataFrame.from_dict(d)

Result:

>>> df
       auth1  auth2  auth3  auth4
auth1      0      1      0      3
auth2      1      0      2      0
auth3      0      2      0      1
auth4      3      0      1      0

If you want to save it, you can call df.to_csv(file_name).
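
The save step can be sketched as follows. The file name and the sep=";" choice are assumptions for illustration, not something the original answer specifies; pick whatever separator your Gephi import expects. Sorting both axes is an extra precaution so that rows and columns appear in the same author order.

```python
import pandas as pd

# Small stand-in for the real 8,500-author dictionary.
d = dict(auth1=dict(auth1=0, auth2=1, auth3=0),
         auth2=dict(auth1=1, auth2=0, auth3=2),
         auth3=dict(auth1=0, auth2=2, auth3=0))

# With the default orientation, outer keys become columns
# and inner keys become the row index.
df = pd.DataFrame.from_dict(d)

# Put rows and columns in the same sorted author order.
df = df.sort_index(axis=0).sort_index(axis=1)

# The row labels are written as the first column of the file.
df.to_csv("authors_matrix_pandas.csv", sep=";")
```

Because the matrix in the question is symmetric, it does not matter here whether the outer keys end up as rows or as columns.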

Outputting a large matrix in Python from a dictionary
