AWS Lambda times out while downloading the VGG16 imagenet model data

Time: 2018-10-24 06:45:29

Tags: keras deep-learning aws-lambda

We have a Lambda function that uses Keras to extract features from images. When the function tries to download the VGG16 imagenet model, it times out.

Code that times out in Lambda:

from keras.applications.vgg16 import VGG16
model = VGG16(weights='imagenet', include_top=True)

How can we solve this? Can we log in to the backend container of the Lambda function and download the model there?

Lambda code:

import os
import shutil
import stat
import zipfile
import boto3
from six.moves import urllib


s3 = boto3.client('s3')

def download(key, local_fpath):
    # `key` is an S3 object key inside the bucket, not a URL.
    print('downloading other files........')
    s3.download_file('HARDCODED_BUCKET', key, local_fpath)

def make_gcc_executable():
    for fpath in os.listdir("/tmp/gcc/bin"):
        fpath = os.path.join("/tmp/gcc/bin", fpath)
        st = os.stat(fpath)
        os.chmod(fpath, st.st_mode | stat.S_IXOTH | stat.S_IXGRP | stat.S_IXUSR)

    for fpath in os.listdir("/tmp/gcc/libexec/gcc/x86_64-linux-gnu/4.6.4"):
        fpath = os.path.join("/tmp/gcc/libexec/gcc/x86_64-linux-gnu/4.6.4", fpath)
        st = os.stat(fpath)
        os.chmod(fpath, st.st_mode | stat.S_IXOTH | stat.S_IXGRP | stat.S_IXUSR)

# Download GCC and uncompress it.
download('test/imglib-new/gcc.zip', "/tmp/gcc.zip")
zipfile.ZipFile("/tmp/gcc.zip").extractall("/tmp/gcc")

make_gcc_executable()



from keras.preprocessing import image
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np

def _get_model2():
    print('downloading our model....')
    s3.download_file('BUCKET', 'test/model/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5', '/tmp/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')

    #model = VGG16(weights='imagenet', include_top=False)
    print('model download is done')
    # The "notop" weights file has no dense layers, so include_top must be False.
    model = VGG16(weights='/tmp/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5', include_top=False)
    return model

model = _get_model2()



def handler(event, context):
    print('entering the function..downloading the images')
    download_loc='/tmp/dog2.jpg'
    s3.download_file('BUCKET', 'test/images/dog2.jpg',download_loc)
    print('image is downloaded')



    print('running the model now..')

    img = image.load_img(download_loc, target_size=(224, 224))
    img_data = image.img_to_array(img)
    img_data = np.expand_dims(img_data, axis=0)
    img_data = preprocess_input(img_data)
    print('extracting the feature...')
    vgg16_feature = model.predict(img_data)
    print('over..')
    print(vgg16_feature.shape)

Error message:

START RequestId: 00827d13-d7a4-11e8-9ea2-c5365d4fbfcb Version: $LATEST
module initialization error: Compilation failed (return status=1): /tmp/.theano/compiledir_Linux-4.14-amzn1.x86_64-x86_64-with-glibc2.2.5-x86_64-3.6.1-64/lazylinker_ext/mod.cpp:1:20: fatal error: Python.h: No such file or directory. compilation terminated.. 

END RequestId: 00827d13-d7a4-11e8-9ea2-c5365d4fbfcb
REPORT RequestId: 00827d13-d7a4-11e8-9ea2-c5365d4fbfcb  Duration: 153.15 ms Billed Duration: 200 ms     Memory Size: 832 MB Max Memory Used: 225 MB 
module initialization error
Compilation failed (return status=1): /tmp/.theano/compiledir_Linux-4.14-amzn1.x86_64-x86_64-with-glibc2.2.5-x86_64-3.6.1-64/lazylinker_ext/mod.cpp:1:20: fatal error: Python.h: No such file or directory. compilation terminated.. 

1 answer:

Answer 0 (score: 0)

Lambda does not let you log in to the container. You did not mention the size of the model, but there are a few options:

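  • A Lambda deployment package can be up to 50 MB (zipped, for direct upload), so if the model fits within that limit you can upload it as part of the code package instead of downloading it yourself.
  • If you must download the model, extend the timeout to the full 15 minutes. In any case, a file that takes longer than that to download is unlikely to fit in Lambda's RAM.
  • Consider the download source. Downloads from S3 to Lambda run at roughly 50 MB/s, so uploading the model to S3 can make large downloads faster.
  • Make sure the download happens outside the scope of the handler function. That way you only spend time downloading the model when the Lambda container first loads.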
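
The last two points can be combined. The following is a minimal sketch, not a definitive implementation: it assumes the weights file has already been uploaded to S3, and the bucket name `my-model-bucket` and the helper names `ensure_local_copy` and `get_model` are hypothetical. The weights are fetched into /tmp only if they are not already there, and the loaded model is kept in a module-level variable so that later invocations in the same container reuse it.

```python
import os

# Hypothetical names: replace with your own bucket, key, and local path.
MODEL_BUCKET = "my-model-bucket"
MODEL_KEY = "test/model/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"
MODEL_PATH = "/tmp/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"


def ensure_local_copy(s3_client, bucket, key, local_path):
    """Download `key` from `bucket` unless `local_path` already exists.

    Returns True if a download happened, False if the cached file was reused.
    """
    if os.path.exists(local_path):
        return False
    # Download to a temporary name first so that an interrupted transfer
    # never leaves a truncated file under the final name.
    tmp_path = local_path + ".part"
    s3_client.download_file(bucket, key, tmp_path)
    os.replace(tmp_path, local_path)
    return True


_model = None  # survives between invocations that reuse the same container


def get_model():
    """Load the model once per container and return the cached instance."""
    global _model
    if _model is None:
        # Imported here so this module can be imported without AWS access.
        import boto3
        from keras.applications.vgg16 import VGG16

        ensure_local_copy(boto3.client("s3"), MODEL_BUCKET, MODEL_KEY, MODEL_PATH)
        # The "notop" weights need include_top=False.
        _model = VGG16(weights=MODEL_PATH, include_top=False)
    return _model


def handler(event, context):
    model = get_model()  # slow only on the first call in a new container
    return {"model_loaded": model is not None}
```

In a real function you could equally call `get_model()` at module level, as the answer suggests. It is kept lazy here only so the snippet can be imported without AWS credentials.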