How can I merge 2 ndarrays with different shapes?

Asked: 2019-03-27 16:47:25

Tags: python numpy

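I have an array of shape (123, 3072) that I need to split into 5 folds of roughly equal size (only roughly, because 123 is not divisible by 5) for 5-fold cross-validation. Using scikit-learn is not allowed. I managed to get 2 ndarrays of shape (3, 25, 3072) and (2, 24, 3072). Now I need to combine them, but every function I try raises this error: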

ValueError: all the input array dimensions except for the concatenation 
axis must match exactly 

Is it possible to concatenate them at all?

Here is my code:

num_folds = 5
mod = binary_train_X.shape[0] % num_folds
first_records = (binary_train_X.shape[0] - mod) // num_folds + 1
last_records = first_records - 1
first_part = binary_train_X[:mod * first_records].reshape([mod, first_records, -1])
second_part = binary_train_X[mod * first_records:].reshape([num_folds - mod, last_records, -1])
folds_X = np.concatenate((first_part, second_part))

Or is there perhaps another way to split the array into 5 parts (folds)?

2 answers:

Answer 0: (score: 0)

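Something very similar to this should work. Each fold is returned together with the rest of the array, so the folds do not need to have the same size: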

import numpy as np

def k_fold(array, num_folds):
    # Splits along axis 0 of array into (fold, rest_of_array) pairs
    folds = []
    n = array.shape[0]
    for i in range(num_folds):
        # Compute integer boundaries from i so that truncation does not
        # accumulate and the last rows are not dropped
        start = i * n // num_folds
        end = (i + 1) * n // num_folds
        fold = array[start:end]
        rest_of_array = np.concatenate((array[:start], array[end:]), axis=0)
        folds.append((fold, rest_of_array))
    return folds
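
As an alternative sketch (not the code from this answer), NumPy's `np.array_split` accepts a length that is not evenly divisible and returns a list of sub-arrays, which avoids forcing unequal folds into a single ndarray. The data below is random stand-in data with the shape from the question, and `cv_pairs` is a hypothetical name used only for illustration.

```python
import numpy as np

# Stand-in data with the shape from the question (assumption: the real
# binary_train_X has the same shape but different values).
binary_train_X = np.random.rand(123, 3072)
num_folds = 5

# array_split permits unequal parts: the first 123 % 5 = 3 folds get
# 25 rows each, the remaining 2 folds get 24 rows each.
folds_X = np.array_split(binary_train_X, num_folds, axis=0)
fold_sizes = [fold.shape[0] for fold in folds_X]

# Build (validation, training) pairs for cross-validation.
cv_pairs = []
for i in range(num_folds):
    val = folds_X[i]
    train = np.concatenate(folds_X[:i] + folds_X[i + 1:], axis=0)
    cv_pairs.append((val, train))
```

Because the folds live in a Python list rather than in one 3-D array, the shapes (25, 3072) and (24, 3072) can coexist without triggering the `ValueError` from the question.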

Answer 1: (score: 0)

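Since 377856 (123*3072) is not divisible by 15360 (5*3072), because 123 is not divisible by 5, you can only create 5 equal slices with rows of length 3072 by truncating or padding the data to a multiple of 15360 (5*3072).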
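
The arithmetic behind this can be checked with a few lines of plain Python. This is only a sketch; the variable names are chosen here for illustration and do not come from the answer.

```python
rows, cols, num_folds = 123, 3072, 5

total = rows * cols          # number of values in the array
block = num_folds * cols     # values needed for one row in every fold

remainder = total % block            # values left over after the last full block
rows_truncated = total // block      # rows per fold when truncating (floor)
rows_padded = -(-total // block)     # rows per fold when padding (ceiling division)
zeros_needed = rows_padded * block - total  # zeros required to reach the next multiple
```

The expression `-(-a // b)` is a common integer idiom for rounding a quotient up instead of down.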

Truncating discards values from the end until the data aligns, producing shape (5, 24, 3072):

folds = binary_train_X.flatten()[:np.prod(binary_train_X.shape)//(5*3072)*(5*3072)].reshape(5, -1, 3072)
# this discards 9216 (3072*3) values

Padding appends zeros at the end until the data aligns, producing shape (5, 25, 3072):

folds = np.pad(binary_train_X.flatten(), (0, -(-np.prod(binary_train_X.shape)//(5*3072))*(5*3072)-np.prod(binary_train_X.shape)), 'constant').reshape(5, -1, 3072)
# this appends 6144 (3072*2) zeros
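
The two one-liners can also be written as small row-wise helpers without hard-coding 5 and 3072. This is a sketch under assumptions: `truncate_to_folds`, `pad_to_folds` and the `is_real` mask are hypothetical names introduced here, and the mask is an addition that makes it possible to exclude the zero rows later, since padded rows are not real samples.

```python
import numpy as np

def truncate_to_folds(X, num_folds):
    """Drop trailing rows so the row count is a multiple of num_folds."""
    rows_per_fold = X.shape[0] // num_folds
    kept = X[:rows_per_fold * num_folds]
    return kept.reshape(num_folds, rows_per_fold, *X.shape[1:])

def pad_to_folds(X, num_folds):
    """Append zero rows so the row count is a multiple of num_folds.

    Also returns a boolean mask marking which rows are real data.
    """
    rows_per_fold = -(-X.shape[0] // num_folds)  # ceiling division
    n_pad = rows_per_fold * num_folds - X.shape[0]
    # Pad only at the end of axis 0; leave all other axes untouched.
    pad_width = [(0, n_pad)] + [(0, 0)] * (X.ndim - 1)
    padded = np.pad(X, pad_width, mode="constant")
    is_real = np.arange(padded.shape[0]) < X.shape[0]
    return (padded.reshape(num_folds, rows_per_fold, *X.shape[1:]),
            is_real.reshape(num_folds, rows_per_fold))

# Small stand-in data: 123 rows as in the question, but only 4 columns.
X = np.arange(123 * 4, dtype=float).reshape(123, 4)
truncated = truncate_to_folds(X, 5)
padded, is_real = pad_to_folds(X, 5)
```

With either approach the folds are no longer an exact partition of the original 123 rows: truncation loses 3 samples, and padding adds 2 all-zero rows to the last fold.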