This sounds like it should be easy, but I can't figure out how to do it.

I have a numpy 2D array:

X = (1783, 30)

and I want to split it into batches of 64. I wrote it like this:

batches = abs(len(X) / BATCH_SIZE) + 1  # gives 28

I am trying to predict the results in batches. So I pad the batch with zeros and then overwrite them with the predicted results.
predicted = []
for b in xrange(batches):
    data4D = np.zeros([BATCH_SIZE,1,96,96])  # 4D input array: batch_size first, number of inputs last
    data4DL = np.zeros([BATCH_SIZE,1,1,1])   # 4D output array: batch_size first, number of outputs last
    data4D[0:BATCH_SIZE,:] = X[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE,:]  # fill with values from X
    # predict
    #print [(k, v[0].data.shape) for k, v in net.params.items()]
    net.set_input_arrays(data4D.astype(np.float32), data4DL.astype(np.float32))
    pred = net.forward()
    print 'batch ', b
    predicted.append(pred['ip1'])
print 'Total in Batches ', data4D.shape, batches
print 'Final Output: ', predicted
But the last batch, number 28, has only 55 elements instead of 64 (1783 elements total), and it raises:

ValueError: could not broadcast input array from shape (55,1,96,96) into shape (64,1,96,96)

Is there a workaround for this?

PS: the network requires an exact batch size of 64 for prediction.
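Editorial note: the broadcast error itself disappears if only the valid rows of the last slice are copied into the zero-initialized batch. A minimal sketch with a stand-in 2D array (the array contents and the loop body are illustrative only; the real network call is omitted):

```python
import numpy as np

BATCH_SIZE = 64
X = np.arange(1783 * 30, dtype=np.float32).reshape(1783, 30)  # stand-in data

n_batches = -(-len(X) // BATCH_SIZE)  # ceiling division: 28 batches

for b in range(n_batches):
    chunk = X[b * BATCH_SIZE:(b + 1) * BATCH_SIZE]  # 64 rows, or 55 on the last pass
    batch = np.zeros((BATCH_SIZE,) + X.shape[1:], X.dtype)
    batch[:len(chunk)] = chunk                      # copy only the valid rows
    # `batch` always has exactly 64 rows, so it can be fed to the network as-is
```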
Answer 0 (score: 4)

I don't quite understand your question either, especially what X looks like. If you want to create sub-groups of the same size from an array, try this:
def group_list(l, group_size):
    """
    :param l: list
    :param group_size: size of each group
    :return: Yields successive group-sized lists from l.
    """
    for i in xrange(0, len(l), group_size):
        yield l[i:i+group_size]
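For concreteness, this is how the generator behaves on an array the size of the question's X (the function is repeated here with Python 3's range so the snippet runs on its own; not part of the original answer):

```python
import numpy as np

def group_list(l, group_size):
    """Yield successive group-sized slices from l."""
    for i in range(0, len(l), group_size):
        yield l[i:i + group_size]

X = np.zeros((1783, 30))
groups = list(group_list(X, 64))
# 28 groups: the first 27 have 64 rows each, the last has the remaining 55
```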
Answer 1 (score: 1)

I found a simple way to solve the batch problem: generate a dummy array first, then fill in the necessary data.
data = np.zeros((batches*BATCH_SIZE, 1, 96, 96))  # dummy array of shape (28*64, 1, 96, 96)
This code will load the data in exact batches of 64. The last batch ends with dummy zeros, but that's fine :)
preds = []
for b in xrange(batches):
    data4D = data[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE, :]  # exactly 64 rows every time
    pred = net.predict(data4D)
    preds.append(pred)
output = np.vstack(preds)[:1783]  # keep only the first 1783 predictions
At the end, I slice the 1783 real elements out of the 28*64 total. This worked for me, though I'm sure there are many other ways.
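The pad-then-slice idea can be checked end to end with a stand-in predictor (the identity function below replaces net.predict, which is not available outside Caffe; all names here are illustrative):

```python
import numpy as np

BATCH_SIZE = 64
n_samples = 1783
batches = -(-n_samples // BATCH_SIZE)  # 28

X = np.random.rand(n_samples, 30).astype(np.float32)

# Dummy array with 28*64 rows: real data up front, zero padding behind.
data = np.zeros((batches * BATCH_SIZE,) + X.shape[1:], dtype=X.dtype)
data[:n_samples] = X

identity_net = lambda batch: batch  # stand-in for net.predict

preds = [identity_net(data[b * BATCH_SIZE:(b + 1) * BATCH_SIZE])
         for b in range(batches)]
output = np.vstack(preds)[:n_samples]  # drop the padding rows
```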
Answer 2 (score: 0)
data4D[0:BATCH_SIZE,:] should be data4D[b*BATCH_SIZE:b*BATCH_SIZE+BATCH_SIZE, :].
Answer 3 (score: 0)

This can be achieved using numpy's as_strided.
import numpy as np
from numpy.lib.stride_tricks import as_strided

def batch_data(test, batch_size):
    m, n = test.shape
    S = test.itemsize
    if not batch_size:
        batch_size = m
    count_batches = m // batch_size
    # Batches which can be covered fully
    test_batches = as_strided(test, shape=(count_batches, batch_size, n),
                              strides=(batch_size*n*S, n*S, S)).copy()
    covered = count_batches * batch_size
    if covered < m:
        rest = test[covered:, :]
        rm, rn = rest.shape
        mismatch = batch_size - rm
        last_batch = np.vstack((rest, np.zeros((mismatch, rn)))).reshape(1, -1, n)
        return np.vstack((test_batches, last_batch))
    return test_batches
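A quick sanity check of batch_data on a small array (the function is repeated here so the snippet runs on its own; the 10x3 input is illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def batch_data(test, batch_size):
    m, n = test.shape
    S = test.itemsize
    if not batch_size:
        batch_size = m
    count_batches = m // batch_size
    # Batches which can be covered fully
    test_batches = as_strided(test, shape=(count_batches, batch_size, n),
                              strides=(batch_size * n * S, n * S, S)).copy()
    covered = count_batches * batch_size
    if covered < m:
        rest = test[covered:, :]
        mismatch = batch_size - rest.shape[0]
        last_batch = np.vstack((rest, np.zeros((mismatch, n)))).reshape(1, -1, n)
        return np.vstack((test_batches, last_batch))
    return test_batches

a = np.arange(10 * 3, dtype=np.float64).reshape(10, 3)
out = batch_data(a, 4)
# out has shape (3, 4, 3): two full batches plus one zero-padded batch
```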