Memory error in numpy

Time: 2017-12-17 18:15:49

Tags: python numpy categorical-data

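I am trying to build this converter with numpy for a personal project and I am getting a memory error. I am new to Python. The code works on small data but breaks when I feed it 5 MB of data (attached). Here is the code. Can an expert point out where the memory blows up? The link to the data can be found here.
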
import numpy as np
import gc

"""
USAGE: convert(data,cols)
        data - numpy array of data
        cols - tuple of columns to process. These columns should be categorical columns. 
        IMP: Column indexing in data starts at 0. You can't index the last column.

        Ex: you want to index second col here, then

        data
        a b c
        a b c
        x y z

        cols=(1,)

        if you want to index 1st and second, then

        cols=(0,1)

        All 3

        cols=(0,1,2)

        You can also skip a numeric column that you don't want to encode, e.g.

        cols=(0,2) will skip column 1

"""

def lookupBuilder(strArray):
    a=np.arange(len(strArray))+1
    lookups={k:v for (k,v) in zip(strArray,a)}
    return lookups

def convert(data,cols):    

    for ix,i in  enumerate(cols):
        col=data[:,i:i+1]
        lookup_data=lookupBuilder(np.unique(col))

        for idx,value in enumerate(col):
            col[idx]=lookup_data[value[0]]

        np.delete(data,i,1)
        gc.collect()
        np.insert(data,i,col,axis=1)  

    return data


if __name__=="__main__":    
    pass
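
For comparison, the encoding described in the usage notes can be sketched without any row-by-row loop, `np.delete`, or `np.insert`. This is a minimal sketch, not the asker's method: the name `encode_columns` is hypothetical, and the choice of an object-dtype copy is an assumption made so that integers can sit next to the untouched string columns. It relies on `np.unique(..., return_inverse=True)`, which returns for each row the position of its value in the sorted list of distinct values.

```python
import numpy as np

def encode_columns(data, cols):
    """Hypothetical helper: return a copy of `data` in which every column
    listed in `cols` is replaced by 1-based integer codes."""
    # astype returns a new array; object dtype is an assumption that lets
    # integer codes coexist with the string columns left unencoded
    out = data.astype(object)
    for i in cols:
        # `inverse[k]` is the index of data[k, i] among the sorted unique values
        _, inverse = np.unique(data[:, i], return_inverse=True)
        # +1 mirrors the 1-based codes produced by lookupBuilder
        out[:, i] = inverse.ravel() + 1
    return out

# The three-row example from the usage notes, encoding the second column
data = np.array([["a", "b", "c"],
                 ["a", "b", "c"],
                 ["x", "y", "z"]])
encoded = encode_columns(data, cols=(1,))
print(encoded)
```

Each pass allocates only one integer vector per encoded column, so the extra memory grows with the number of rows rather than with its square.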

Error

Traceback (most recent call last):
  File "C:\MLDatabases\python_scripts\MLP.py", line 230, in <module>
    data=cc.convert(data,(1,2,3,4,5,6,7,8,9,13,19))
  File "C:\MLDatabases\python_scripts\categorical_converter.py", line 49, in convert
    np.insert(data,i,col,axis=1)
  File "C:\python\lib\site-packages\numpy\lib\function_base.py", line 4906, in insert
    new = empty(newshape, arr.dtype, arrorder)
MemoryError
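
One plausible reading of this traceback is that `np.insert` is asked for far more columns than intended. With a scalar index, `np.insert` moves the first axis of the inserted values onto the target axis, so an `(n, 1)` block counts as `n` new columns and the temporary array has shape `(n, m + n)`. The sketch below reproduces that on a tiny stand-in array; the array contents and the names `grown` and `single` are illustrative assumptions, not the asker's data.

```python
import numpy as np

data = np.zeros((4, 3))      # small stand-in for the real data
col = data[:, 1:2]           # shape (4, 1), sliced the same way as in convert

# Scalar index plus an (n, 1) block: the block is treated as n columns,
# so the result has 3 + 4 columns instead of 3 + 1
grown = np.insert(data, 1, col, axis=1)
print(col.shape, grown.shape)

# Passing a 1-D column (or the index as a list, [1]) inserts a single column
single = np.insert(data, 1, col[:, 0], axis=1)
print(single.shape)

# np.insert and np.delete return new arrays; `data` itself keeps its shape
print(data.shape)

# `col` is a view, so assigning into it already updates `data` in place
col[0] = 7
print(data[0, 1])
```

If this reading is right, the allocation grows with the square of the row count, which would explain why small inputs succeed and a 5 MB input fails. Because the results of `np.delete` and `np.insert` are discarded and `col` is a view of `data`, both calls could likely be dropped without changing what `convert` returns.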

0 answers:

No answers