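I am trying to compute the Pearson correlation for every possible pair of columns in a DataFrame. I have 57,997 columns, and the call fails with a MemoryError.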
t_logs = logs.T
print t_logs
results = t_logs.corr(method='pearson').applymap
print results
Here is the traceback:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-99-d0010a131d17> in <module>()
5 print logs
6
----> 7 results = t_logs.corr(method='pearson')
8 print results
C:\Users\nne1s\Anaconda2\lib\site-packages\pandas\core\frame.pyc in
corr(self, method, min_periods)
4938
4939 if method == 'pearson':
-> 4940 correl = libalgos.nancorr(_ensure_float64(mat),
minp=min_periods)
4941 elif method == 'spearman':
4942 correl = libalgos.nancorr_spearman(_ensure_float64(mat),
pandas\_libs\algos.pyx in pandas._libs.algos.nancorr
(pandas\_libs\algos.c:15501)()
MemoryError:
Answer 0 (score: 0)
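I don't think you want to compute correlation coefficients across 57,997 series.
If you are trying to get the correlation matrix of the mean of Log2, the Log2 of STD, and so on, transpose the DataFrame before running corr: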
t_logs = t_logs.T
results = t_logs.corr(method='pearson')
I'm not sure what you are trying to do with applymap here.
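For a sense of why the original call runs out of memory: corr returns a dense square matrix with one row and one column per input column, stored as float64 (8 bytes per entry). A rough sketch of that arithmetic follows; the helper name corr_matrix_bytes is made up for illustration, and the estimate covers only the result matrix, not any working copies pandas may allocate along the way.

```python
# Back-of-the-envelope estimate of the memory needed for the result of
# DataFrame.corr(): an n x n matrix of float64 values (8 bytes each).
# corr_matrix_bytes is a hypothetical helper, not part of pandas.

FLOAT64_BYTES = 8


def corr_matrix_bytes(n_columns):
    """Bytes needed to hold a dense n_columns x n_columns float64 matrix."""
    return n_columns * n_columns * FLOAT64_BYTES


n = 57997
size_gib = corr_matrix_bytes(n) / 2.0 ** 30
print("%d columns -> %.1f GiB" % (n, size_gib))  # 57997 columns -> 25.1 GiB
```

With a few dozen columns after transposing, the same matrix is only a few kilobytes, which is why the orientation of the DataFrame matters so much here.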