How to write a data generator for a Keras model with multiple inputs

Question

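I need some help getting a Keras model to work in RStudio. The problem occurs when I have multiple inputs and use a data generator.

Keras reports an input error: the list of numpy arrays passed to the model is not what it expects.

The following toy example reproduces the problem. When the model is fit normally (without a data generator) it runs fine, but when it is fit with a generator it crashes.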

Setting up the data

library(magrittr)
library(keras)

# Create 10 examples of input data and 10 labels

input1 <- matrix(1:20, ncol = 2, nrow = 10, byrow = TRUE)   # [1,2; 3,4; 5,6 ... 19,20]
input2 <- matrix(1:30, ncol = 3, nrow = 10, byrow = TRUE)   # [1,2,3; 4,5,6; 7,8,9 ... 28,29,30]
labels <- seq(0.1, 1, 0.1)                                  # [0.1, 0.2, 0.3 ... 1.0]

Build and run the model

# define input tensors for the two inputs
in_a <- layer_input(shape = c(2), name = "input1")
in_b <- layer_input(shape = c(3), name = "input2")

# concatenate the inputs and follow them by an output layer
out <- layer_concatenate(c(in_a, in_b), axis=-1, name="concat") %>% 
  layer_dense(units = 1, activation = 'linear', name="output")

# build the model
model <- keras_model(inputs = list(in_a, in_b), outputs = out)

#compile & run
model %>% compile(loss = "mse", optimizer = "adam")
model %>% fit(list(input1, input2), labels, epochs = 5)

Model with a generator

# The generator alternately returns the first five input rows and then the last five, ad infinitum
data_sample_generator <- function(input1, input2, labels) {

  first_five <- 1

  function() {

    # toggle between 0 (rows 1:5) and 1 (rows 6:10) on every call
    first_five <<- ifelse(first_five == 0, 1, 0)

    if (first_five == 0) {
      rows_to_return <- 1:5
    } else {
      rows_to_return <- 6:10
    }

    return(list(input1[rows_to_return, ], input2[rows_to_return, ], labels[rows_to_return]))
  }
}

# Examine generator output
batch <- data_sample_generator(input1, input2, labels)

batch()    # first sample
[[1]]
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
[5,]    9   10

[[2]]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
[5,]   13   14   15

[[3]]
[1] 0.1 0.2 0.3 0.4 0.5

batch()    # second sample
[[1]]
     [,1] [,2]
[1,]   11   12
[2,]   13   14
[3,]   15   16
[4,]   17   18
[5,]   19   20

[[2]]
     [,1] [,2] [,3]
[1,]   16   17   18
[2,]   19   20   21
[3,]   22   23   24
[4,]   25   26   27
[5,]   28   29   30

[[3]]
[1] 0.6 0.7 0.8 0.9 1.0

This is what I expect to see from the generator. Now fit the model.


model %>% 
  fit_generator(data_sample_generator(input1,input2,labels),
                steps_per_epoch = 2,
                epochs = 5)

Error in py_call_impl(callable, dots$args, dots$keywords) :   
ValueError: Error when checking model input: the list of Numpy arrays that  
you are passing to your model is not the size the model expected.   
Expected to see 2 array(s),   
but instead got the following list of 1 arrays: 
[array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])]...

I'm not sure what I'm doing wrong. How can I fix the generator so that it supplies correctly shaped inputs? Thanks for your help.

Modified generator output

Following the suggestion from @OID, I changed the output to return ([input1, input2], labels):

return(list(list(input1[rows_to_return, ], input2[rows_to_return, ]), labels[rows_to_return]))

The generator now returns:

[[1]]
[[1]][[1]]
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
[5,]    9   10

[[1]][[2]]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
[5,]   13   14   15

[[2]]
[1] 0.1 0.2 0.3 0.4 0.5

This time I get the error message ValueError: could not broadcast input array from shape (5,2) into shape (5)

Answer 1

The generator should return a 2-tuple:

 (X, y)

In your case, X is a list of arrays, so it becomes:

([X1, X2], y)

Your generator yields:

list(input1[rows_to_return, ], input2[rows_to_return, ], labels[rows_to_return])

which is equivalent to:

([X1, X2, y])
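To make the difference concrete, here is a small base R sketch (no Keras required) that compares the flat three-element list with the nested two-element form. The toy matrices mirror the first batch from the question; the names `flat` and `nested` are illustrative only. As far as I know, Keras conventionally reads a three-element batch as (inputs, targets, sample_weights), which would be consistent with only one array reaching the model's inputs.

```r
# Toy batch mirroring rows 1:5 of the question's data
x1 <- matrix(1:10, ncol = 2, byrow = TRUE)   # shape (5, 2)
x2 <- matrix(1:15, ncol = 3, byrow = TRUE)   # shape (5, 3)
y  <- seq(0.1, 0.5, by = 0.1)                # 5 labels

# Flat form, as in the original generator: three top-level elements
flat <- list(x1, x2, y)

# Nested form: two top-level elements, (inputs, targets),
# where inputs is itself a list holding one array per model input
nested <- list(list(x1, x2), y)

length(flat)          # 3
length(nested)        # 2
length(nested[[1]])   # 2, one entry per model input
str(nested)           # prints the nesting for inspection
```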

I don't know R, but I think you should change the generator to return this:

list(list(input1[rows_to_return, ], input2[rows_to_return, ]), labels[rows_to_return])

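Update:

Now that you have updated your code, the shapes passed to the model are:

First input: (5, 2)

Second input: (5, 3)

Output: (5)

The Keras error indicates that your model expects to see the first input as (for example):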

[1, 3, 5, 7, 9]

whereas you are passing:

[[1, 3, 5, 7, 9],
[2, 4, 6, 8, 10]]

So you should change either the batch generator or the model's input shape.
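One way to catch such mismatches before calling `fit_generator` is to check each batch against the shapes declared in `layer_input`. The sketch below uses base R only. `make_generator` and `check_batch` are hypothetical helpers, not part of keras, and the `drop = FALSE` indexing and column-matrix targets are assumptions about how to keep the shapes explicit, not a confirmed fix for the broadcast error.

```r
# Hypothetical helper: closure that alternates between rows 1:5 and 6:10
make_generator <- function(input1, input2, labels) {
  first_half <- FALSE
  function() {
    first_half <<- !first_half
    rows <- if (first_half) 1:5 else 6:10
    list(
      # drop = FALSE keeps each input a 2-D matrix
      list(input1[rows, , drop = FALSE], input2[rows, , drop = FALSE]),
      # targets as an explicit (batch, 1) matrix
      matrix(labels[rows], ncol = 1)
    )
  }
}

# Hypothetical helper: verify that a batch is (inputs, targets) and that
# every input has the expected number of features and the same batch size
check_batch <- function(batch, n_features) {
  stopifnot(length(batch) == 2, length(batch[[1]]) == length(n_features))
  dims   <- lapply(batch[[1]], dim)
  n_rows <- vapply(dims, `[`, integer(1), 1)
  n_cols <- vapply(dims, `[`, integer(1), 2)
  stopifnot(all(n_cols == n_features),         # feature count per input
            all(n_rows == nrow(batch[[2]])))   # same batch size everywhere
  invisible(TRUE)
}

# Same toy data as in the question
input1 <- matrix(1:20, ncol = 2, byrow = TRUE)
input2 <- matrix(1:30, ncol = 3, byrow = TRUE)
labels <- seq(0.1, 1, by = 0.1)

gen <- make_generator(input1, input2, labels)
check_batch(gen(), n_features = c(2, 3))   # checks the batch with rows 1:5
check_batch(gen(), n_features = c(2, 3))   # checks the batch with rows 6:10
```

The generator can then be passed to `fit_generator` in the same way as in the question, with `n_features` kept in sync with the `shape` arguments of the two `layer_input` calls.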
