Question

I asked this question here: How to convert occurence matrix to co-occurence matrix

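I realized that my data is too large to do this in R; my computer hangs. The actual data is a text file with about 5 million rows and 600 columns. I think Python might be an alternative.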

Answer 1

This is how you translate the R code into Python.

>>> import numpy as np
>>> a=np.array([[0, 1, 0, 0, 1, 1],
             [0, 0, 1, 1, 0, 1],
             [1, 1, 1, 1, 0, 0],
             [1, 1, 1, 0, 1, 1]])
>>> acov=np.dot(a.T, a)
>>> acov[np.diag_indices_from(acov)]=0
>>> acov
array([[0, 2, 2, 1, 1, 1],
       [2, 0, 2, 1, 2, 2],
       [2, 2, 0, 2, 1, 2],
       [1, 1, 2, 0, 0, 1],
       [1, 2, 1, 0, 0, 2],
       [1, 2, 2, 1, 2, 0]])

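However, you have a very large dataset. If you don't want to assemble the co-occurrence matrix piece by piece and you store the values as int64, the 3e9 numbers (5 million rows × 600 columns) take 24 GB of RAM just to hold the data: http://www.wolframalpha.com/input/?i=3e9+*+8+bytes. So you probably want to think it over and decide which dtype to store the data in: http://docs.scipy.org/doc/numpy/user/basics.types.html. Using int16 (about 6 GB for the same data) would probably make the dot product feasible on a decent desktop PC these days.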
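The piece-by-piece alternative mentioned above can be sketched roughly as follows. This is a minimal illustration, not part of the original answer: the helper name `cooccurrence_in_chunks` and the chunk size are hypothetical, and reading blocks of rows from the text file is left open (something like `pandas.read_csv(..., chunksize=...)` might serve, depending on the file format). The idea is that `a.T @ a` equals the sum of `block.T @ block` over disjoint blocks of rows, so only one block plus the 600 × 600 result (about 2.9 MB as int64) has to be in memory at a time. The running total is kept in int64 because `np.dot` on int16 inputs returns int16, and counts above 32767 would overflow.

```python
import numpy as np

def cooccurrence_in_chunks(row_chunks, n_cols):
    """Hypothetical helper: sum block.T @ block over blocks of rows,
    then zero the diagonal, as in the small example above."""
    # int64 accumulator: with millions of rows, counts can exceed int16/int32 chunk limits
    total = np.zeros((n_cols, n_cols), dtype=np.int64)
    for chunk in row_chunks:
        # int32 is enough within one block as long as the block has < 2**31 rows
        block = np.asarray(chunk, dtype=np.int32)
        total += block.T @ block
    total[np.diag_indices_from(total)] = 0
    return total

# Same small matrix as above, fed two rows at a time
a = np.array([[0, 1, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 1],
              [1, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 1, 1]])
chunks = (a[i:i + 2] for i in range(0, len(a), 2))
result = cooccurrence_in_chunks(chunks, a.shape[1])
print(result)
```

On the small matrix this should reproduce the `acov` result shown earlier; for the real file, the block size would be chosen to fit comfortably in RAM.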
