How do I solve this MemoryError problem?

My .train() call on train3.csv fails. I have 4 GB of DDR3 RAM.
Is there a way to avoid the MemoryError, for example by falling back to some other training method, or to somehow increase my virtual memory (I am on Windows 10)?

Code:
train_file   = 'train3.csv'
netsave_file = 'neurolab.net'
hidden_units = 440
outputs      = 1

import numpy as np
import neurolab as nl

# read training data and put it into numpy array _______________________
t = []
t_file = open(train_file, 'r')
for line in t_file.readlines():
    train = line.split(',')
    train[1] = int(train[1])
    for i in range(0, 72):
        train[i+2] = float(train[i+2])    # convert to floats
    t.append(train)
t_file.close()

print "training samples read: " + str(len(t))

input = []
target = []
for train in t:
    input.append(train[2:2+72])
    target.append(train[1:2])

print "done reading input and target"

train = 0
input = np.array(input)
target = np.array(target)

print "done converting input and target to numpy array"

net = nl.net.newff([[0.0, 1.0]]*72, [hidden_units, 144, outputs])

# Train process _______________________________________________________
err = net.train(input, target, show=1, epochs=2)

net.save(netsave_file)
It shows this error:
Traceback (most recent call last):
  File "neurolab_train.py", line 43, in <module>
    err = net.train(input, target, show=1, epochs = 2)
  File "C:\Users\tintran\Anaconda\lib\site-packages\neurolab\core.py", line 165, in train
    return self.trainf(self, *args, **kwargs)
  File "C:\Users\tintran\Anaconda\lib\site-packages\neurolab\core.py", line 349, in __call__
    train(net, *args)
  File "C:\Users\tintran\Anaconda\lib\site-packages\neurolab\train\spo.py", line 79, in __call__
    **self.kwargs)
  File "C:\Users\tintran\Anaconda\lib\site-packages\scipy\optimize\optimize.py", line 782, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "C:\Users\tintran\Anaconda\lib\site-packages\scipy\optimize\optimize.py", line 840, in _minimize_bfgs
    I = numpy.eye(N, dtype=int)
  File "C:\Users\tintran\Anaconda\lib\site-packages\numpy\lib\twodim_base.py", line 231, in eye
    m = zeros((N, M), dtype=dtype)
MemoryError
Answer 0 (score: 0)
numpy was a life-jacket for me here, with the bonus power of @numba.jit().

Having struggled with a similar motivation, I spent some time looking for a way to escape the 2 GB ceiling (the hard maximum of Private Bytes that a 32-bit process is allowed to allocate, enforced by the O/S), beyond which the whole Anaconda session was aborted by the Windows O/S together with all the trained and tuned machine-learning instances in it (read: dozens of CPU-core hours).

Finally:

The snippet below, cut and pasted from that problem, shows a typical use of numpy.memmap(); observe and weigh the trade-offs in speed, numerical (im)precision and fileIO operations for your own case. During feature engineering, @numba.jit() was used heavily on the large arrays, so you may also benefit from that approach to further speed up the processing. Many thanks to Travis OLIPHANT's team.
with open( getCsvFileNAME( anFxCTX[aCtxID] ), "r" ) as aFH:
# ------------------------------------------------------------- # .memmap
DATA = np.memmap( getMmapFileNAME( anFxCTX[aCtxID] ),
mode = 'w+', # 'readwrite', # 'w+' <----------------------------IO Error: ( In WIN, not UX, if file is already open with another filehandle ... ) >>> https://github.com/spacetelescope/pyasdf/issues/100
shape = ( getFileRowCOUNT( aFH ), 7 ), # .shape
#---------.float64 ----------------------------------------------------------------------------------------14-----------------------------------------------------------------------------------
#type = np.float64 # .dtype np.float64 are 8B-IEEE-float precision overly enough w 14/15 significant digits in 56-bit mantissa
#---------.float64 ----------------------------------------------------------------------------------------14-----------------------------------------------------------------------------------
#============================================================================================================
#---------.float32 -----------------------------------------------------------------------------------------7-----------------------------------------------------------------------------------
dtype = np.float32 # .dtype np.float32 are 4B-IEEE-float precision fairly enough w 7/ 8 significant digits in 23-bit mantissa [[[ BUT may crash @.jit f8 SIGNATURES]]] 0.19
#---------.float32 -----------------------------------------------------------------------------------------7-----------------------------------------------------------------------------------
) # np.float64 SHALL BE kept here, for DATA, as this precision keeps convoluted calculus farther from numerical error propagation ( not the case for X_ and y_ that enter into SKLEARN )
# ------------------------------------------------------------- # .genfromtxt assignment into .memmap is elementwise
DATA[:,:] = np.genfromtxt( aFH,
skip_header = 0,
delimiter = ",",
# v v v v v v
# 2011.08.30,12:00,1791.20,1792.60,1787.60,1789.60,835
# 2011.08.30,13:00,1789.70,1794.30,1788.70,1792.60,550
# 2011.08.30,14:00,1792.70,1816.70,1790.20,1812.10,1222
# 2011.08.30,15:00,1812.20,1831.50,1811.90,1824.70,2373
converters = { 0: lambda aString: mPlotDATEs.date2num( datetime.datetime.strptime( aString, "%Y.%m.%d" ) ), #_______________________________________asFloat ( 1.0, +++ )
1: lambda aString: ( ( int( aString[0:2] ) * 60 + int( aString[3:] ) ) / 60. / 24. ) # ( 15*60 + 00 ) / 60. / 24.__asFloat < 0.0, 1.0 )
# HH: :MM HH MM
}
)[:,:] # -------------------------- # .memmap assigned elementwise
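One reason the .memmap route pays off (my note, not part of the original answer): once the .CSV has been converted, later runs can map the binary file straight from disk instead of re-parsing the text. A minimal sketch, reusing the author's helper names from the snippet above (they are hypothetical here) and assuming import numpy as np plus a known row count:

with open( getCsvFileNAME( anFxCTX[aCtxID] ), "r" ) as aFH:
    aRowCOUNT = getFileRowCOUNT( aFH )              # same helper as above; any persisted row count works
DATA = np.memmap( getMmapFileNAME( anFxCTX[aCtxID] ),
                  mode  = 'r',                      # read-only mapping, pages are loaded lazily on access
                  dtype = np.float32,               # must match the dtype used when the file was written
                  shape = ( aRowCOUNT, 7 )          # must match the shape used when the file was written
                  )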
You may find the power of the lambda-s inside the numpy.genfromtxt() converters helpful for your .CSV parsing: comfortable during the design phase and fast during the code-execution phase.
Answer 1 (score: 0)
The reason I was getting a MemoryError during training is that I was using 32-bit Python.
Now that I have upgraded to 64-bit Python, everything works fine.
I can even make my network large enough to hang my system (which means there is effectively no such limit on 64-bit Python any more).
I just need to find a happy medium (by adjusting the network size) so that my system does not hang.
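A 32-bit process on Windows gets roughly 2 GB of user address space no matter how much RAM or page file is configured, which is why increasing virtual memory alone does not help. To confirm which interpreter you are actually running, a generic sketch using only the standard library (nothing neurolab-specific):

from __future__ import print_function   # so the same lines run on Python 2 and Python 3
import platform, struct, sys

print( platform.python_version(), platform.architecture()[0] )
print( "pointer size : %d bit" % ( struct.calcsize( "P" ) * 8 ) )   # 32 on a 32-bit build, 64 on a 64-bit build
print( "sys.maxsize  : %d" % sys.maxsize )                          # about 2**31 on 32-bit, about 2**63 on 64-bit builds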