How to fix: "Error occurred in generator: subscript out of bounds"

Time: 2019-01-10 15:34:47

Tags: r keras time-series

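I have been working on neural networks in Keras. While trying to apply a recurrent neural network, I came across a code template, but whenever I implement it and adapt it to my own needs, I get this error: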

Error occurred in generator: subscript out of bounds
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  StopIteration: 

Detailed traceback: 
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
    generator_output = next(output_generator)
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 23, in __next__
    return self.next()
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 40, in next
    raise StopIteration()

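The data frame I am using is just a time series of a single variable. I suspect the generator is the culprit, but I am not 100% sure.

I would be grateful for your help.

I have tried different versions of the fit_generator() call, but each one throws the same error.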

generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 2) {
  if (is.null(max_index)) max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index + lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i + batch_size, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))
    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback + 1, rows[[j]],
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }
    list(samples, targets)
  }
}

lookback <- 30
step <- 2
delay <- 365
batch_size <- 128 

train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = nrow(data),
  shuffle = TRUE,
  step = step,
  batch_size = batch_size
)
val_gen = generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.6)+1,
  max_index = floor(nrow(lightning_ts_red)*0.8),
  step = step,
  batch_size = batch_size
) 
test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.8)+1,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

test_steps <- (nrow(lightning_ts_red) - floor(nrow(lightning_ts_red)*0.8)+1 - lookback) / batch_size


val_steps <- (floor(nrow(lightning_ts_red)*0.8) - floor(nrow(data)*0.6)+1 - lookback) / batch_size
history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps,
  verbose = 1,
  view_metrics = "auto"
)

2 Answers:

Answer 0: (score: 0)

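You are passing the total number of rows in the data as max_index for train_gen. It is better to split the data and explicitly tell train_gen which rows it may use.

I did the following and it worked for me. My time series dataset has 8614 rows: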

train_gen <- generator(
data, 
lookback= lookback,
delay = delay,
min_index = 1,
max_index= 5500, # treat the first 5500 rows as training data
shuffle = TRUE,
step = step,
batch_size = batch_size
)

val_gen <- generator(
data, 
lookback= lookback,
delay = delay,
min_index = 5501, 
max_index= 7000,
step = step,
batch_size = batch_size
)

test_gen <- generator(
data, 
lookback= lookback,
delay = delay,
min_index = 7001,
max_index= NULL,
step = step,
batch_size = batch_size
)

## How many steps to draw from val_gen in order to see the entire validation set
val_step <- (7000 - 5501 - lookback) / batch_size
test_step <- (nrow(data) - 7001 - lookback) / batch_size
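Why an explicit max_index helps can be sketched as follows. The generator reads its target from row rows[[j]] + delay, so the largest row it may sample plus delay still has to exist in the data. The helper targets_in_range is a hypothetical name introduced only for this illustration, and the numbers assume the 8614-row dataset and delay = 365 used above:

```r
# Hypothetical helper (not part of keras): TRUE when the furthest target row
# the generator can request, max_index + delay, still exists in the data.
targets_in_range <- function(n_rows, max_index, delay) {
  max_index + delay <= n_rows
}

n_rows <- 8614   # size of the dataset in this answer
delay  <- 365    # delay from the question

ok_train  <- targets_in_range(n_rows, 5500, delay)    # explicit training bound
bad_train <- targets_in_range(n_rows, n_rows, delay)  # max_index = nrow(data)

# Largest bound the generator picks itself when max_index is NULL
safe_max <- n_rows - delay - 1

print(ok_train)
print(bad_train)
print(safe_max)
```

With max_index = nrow(data), as in the question, the sampled rows can reach past the end of the data once delay is added, which is one way to hit "subscript out of bounds" when data is a matrix.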

Answer 1: (score: 0)

Look at this line in the generator function:

targets[[j]] <- data[rows[[j]] + delay,2]

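The second index, 2, selects the column of the data to predict. In the original example (from Chollet and Allaire), the temperature in Celsius is in the second column ('T (degC)'), and that is what they are trying to predict.

If you are working with univariate data, you have only one column, so the generator function throws the "subscript out of bounds" error.

Change it to 1 (as shown below) and it should work.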

targets[[j]] <- data[rows[[j]] + delay,1]
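The effect can be reproduced outside Keras with a minimal sketch. It assumes the series is stored as a one-column matrix (for example after data.matrix()); the name series and its values are made up for illustration:

```r
# Hypothetical univariate series stored as a one-column matrix
series <- matrix(as.numeric(1:10), ncol = 1)

# Asking for column 2 of a one-column matrix fails; capture the message
col2 <- tryCatch(series[3, 2], error = function(e) conditionMessage(e))

# Column 1 exists, so the same row lookup succeeds
col1 <- series[3, 1]

print(col2)
print(col1)
```

In an English locale, col2 holds the message "subscript out of bounds", matching the first line of the error in the question, while col1 is the value from row 3.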