Keras segmentation fault during model training on Raspberry Pi 3

Time: 2017-09-30 21:43:47

Tags: python tensorflow segmentation-fault keras raspberry-pi3

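I am running a Keras program on a Raspberry Pi 3 and get the segmentation fault shown below. The same program runs perfectly on my laptop. Keras was installed and upgraded with pip. TensorFlow was installed beforehand from a .whl built for the Raspberry Pi 3.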

Using TensorFlow backend.
Compiling Labels file...
Labels file compiled!
Loading pretrained model...
Adding additional layers...
Compiling new model file...
Model compiled!
Found 408 images belonging to 5 classes.
Found 87 images belonging to 5 classes.
Training model...
Epoch 1/2

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.281032] Internal error: Oops: 5 [#2] SMP ARM

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.385182] Process python (pid: 2199, stack limit = 0xb5382210)

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.391273] Stack: (0xb5383df0 to 0xb5384000)

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.395690] 3de0:                                     b5383e1c b5383e00 80152ac8 8014cc54

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.403985] 3e00: ace00004 b65d4cd8 b65d4cd8 b6552960 b5383ea4 b5383e20 8012e588 80152a84

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.412279] 3e20: b5383e6c b5383e30 800894b4 8048d910 4b71ddf2 20000113 b601d500 20000113

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.420574] 3e40: b601d500 80869e00 80869e00 a7f77000 00000000 00000040 80869e00 b5802800

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.428869] 3e60: 80869e00 60000113 60000113 801306dc 6a6ff000 ace00000 b5383ea4 b5383fb0

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.437164] 3e80: ace00000 00000817 6a6ff000 a7df1c00 a7df1c38 00000055 b5383efc b5383ea8

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.445459] 3ea0: 805b9808 8012d680 805bde5c 8085d3c0 8086060c 8086a080 b5383ee4 b5383ec8

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.453754] 3ec0: 80028c74 00000000 00000800 00000000 00000009 80865584 00000817 805b94c8

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.462049] 3ee0: 6a6ff000 b5383fb0 00000000 00000000 b5383fac b5383f00 800091e8 805b94d4

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.470345] 3f00: 0000000a 60000193 00000000 00000000 01400000 00000000 6b9f71d0 3fcc5a7f

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.478639] 3f20: b5383f4c b5383f30 8007d258 800d96c8 b601cdc0 8085a4ec 00000000 00000000

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.486933] 3f40: b5383f64 b5383f50 800295e8 8007d1e4 00000000 8085a4ec b5383f8c b5383f68

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.495227] 3f60: 8007107c 80029550 b5383fb0 73cfc58c 20000010 ffffffff 10c5383d 10c5387d

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.503522] 3f80: b5383f9c 7498e034 80000010 7498e034 80000010 ffffffff 10c5383d 10c5387d

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.511817] 3fa0: 00000000 b5383fb0 805b90e4 800091ac 000000e0 6a6ff000 00a96780 6a6fefd0

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.520111] 3fc0: 000000f8 00000000 66bfed40 00000000 fffff800 00000000 00000000 66bfecfc

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.528405] 3fe0: 6a6feff0 66bfea80 7498df00 7498e034 80000010 ffffffff 00000000 00000000

Message from syslogd@raspberrypi at Sep 30 20:57:40 ...
 kernel:[ 3638.608041] Code: e8bd4000 e5913004 ee1dcf90 e3130001 (e5903148)

Code snippet:

import numpy as np
import os, sys
import glob
import argparse

from keras import __version__
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Input, Flatten, Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.resnet50 import ResNet50, preprocess_input

... 
#Code skipped
...

    print ("Model compiled!") #Section causing the error begins here

    train_datagen = ImageDataGenerator(
        preprocessing_function=preprocess_input,
        rotation_range     = 30,
        width_shift_range  = 0.25,
        height_shift_range = 0.25,
        shear_range        = 0.25,
        zoom_range         = 0.25,
        horizontal_flip    = True
    )

    test_datagen = ImageDataGenerator(
        preprocessing_function=preprocess_input,
        rotation_range     = 30,
        width_shift_range  = 0.25,
        height_shift_range = 0.25,
        shear_range        = 0.25,
        zoom_range         = 0.25,
        horizontal_flip    = True
    )

    train_generator = train_datagen.flow_from_directory(
      args.train_dir,
      target_size=(IM_WIDTH, IM_HEIGHT),
      batch_size=batch_size,
    )

    validation_generator = test_datagen.flow_from_directory(
      args.val_dir,
      target_size=(IM_WIDTH, IM_HEIGHT),
      batch_size=batch_size,
    )

    print ("Training model...")

    history_tl = model_final.fit_generator(
      train_generator,
      validation_data  = validation_generator,
      class_weight     = 'auto',
      steps_per_epoch  = num_training_steps,
      epochs           = num_epochs,
      validation_steps = num_validation_steps)

    print ("Model training completed!")

Is the solution to install TensorFlow in a different way, or is there a simpler workaround?

1 answer:

Answer 0 (score: 1)

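A segmentation fault is usually related to some kind of memory error. Here it may be that the RPi simply does not have enough memory. You can try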

sudo raspi-config

and allocate more memory to the CPU (and less to the GPU). However, I doubt that an extra 32 MB or 64 MB of memory will help.
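To check whether memory really is the bottleneck, it can help to look at how much is available just before training starts. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo (as Raspbian does); the function names parse_meminfo and available_mib are made up for illustration and are not part of Keras or TensorFlow.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Field: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_mib(text):
    """Return MemAvailable (falling back to MemFree) in MiB."""
    info = parse_meminfo(text)
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / 1024.0


if __name__ == "__main__":
    # Hypothetical usage: run this right before the training call.
    with open("/proc/meminfo") as f:
        print("Available memory: %.0f MiB" % available_mib(f.read()))
```

If the reported figure is already small before the first epoch, changing the GPU memory split is unlikely to make a difference.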

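I would avoid training any deep learning model on a Raspberry Pi. It is simply not powerful enough. You might be able to train some very simple neural networks, but nothing of a decent size. I strongly suggest training the model on a desktop machine, copying the trained model to the Raspberry Pi, loading the model and its weights there, and running only the forward pass.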
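The train-on-desktop, predict-on-Pi workflow could look roughly like the sketch below. It uses the same Keras 2 era imports as the question; the file name model_final.h5, the 224x224 input size, and the helpers export_model, predict_on_pi, and top_class are assumptions made for illustration, since the question does not show those details. Saving to .h5 also assumes h5py is installed on both machines.

```python
def top_class(probabilities, class_names):
    """Return (name, probability) of the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


def export_model(model, path="model_final.h5"):
    """Run on the desktop after training: save architecture and weights."""
    model.save(path)


def predict_on_pi(model_path, image_path, class_names, size=(224, 224)):
    """Run on the Pi: load the trained model and do one forward pass."""
    # Imports are deferred so top_class() can be used without Keras installed.
    import numpy as np
    from keras.models import load_model
    from keras.preprocessing import image
    from keras.applications.resnet50 import preprocess_input

    model = load_model(model_path)
    img = image.load_img(image_path, target_size=size)
    batch = np.expand_dims(image.img_to_array(img), axis=0)
    probabilities = model.predict(preprocess_input(batch))[0]
    return top_class(list(probabilities), class_names)
```

On the desktop one would call export_model(model_final) after fit_generator finishes, copy the .h5 file to the Pi, and call predict_on_pi there with the class names in the order the training generator assigned them.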

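ResNet50 is a fairly large model for a Pi. It has more than 20M parameters. Running a single forward prediction on my Pi 3 took 8 seconds. Training it on the Pi does not make much sense.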
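A back-of-the-envelope calculation shows why training is so much heavier than prediction. The sketch below assumes float32 values, roughly 25.6 million parameters for a stock ResNet50 (the model in the question adds layers, so its exact count is unknown), and counts only the parameter-sized buffers: 1 copy for inference, 2 for plain SGD (weights plus gradients), and 4 for an Adam-style optimizer (weights, gradients, and two moment estimates). Activations and image batches come on top of this and are not modelled here.

```python
BYTES_PER_FLOAT32 = 4
RESNET50_PARAMS = 25600000  # approximate; assumption, not taken from the question


def parameter_state_mib(num_params, copies):
    """Memory in MiB for `copies` float32 buffers of `num_params` values."""
    return num_params * BYTES_PER_FLOAT32 * copies / (1024.0 ** 2)


if __name__ == "__main__":
    for label, copies in [("inference", 1), ("SGD", 2), ("Adam-style", 4)]:
        mib = parameter_state_mib(RESNET50_PARAMS, copies)
        print("%-10s %6.0f MiB" % (label, mib))
```

With about 100 MB for the weights alone and several times that for a trainable model, a large part of the Pi 3's 1 GB of RAM (shared with the GPU) is used up before any activations are stored.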