Installing NLTK / WordNet on AWS Lambda via CodeBuild

Time: 2018-11-06 13:12:40

Tags: amazon-web-services aws-lambda nltk aws-codebuild aws-codestar

I'm trying to get NLTK and WordNet running on Lambda via CodeBuild.

It looks like it installs fine in CloudFormation, but I get the following error in Lambda:

START RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c Version: $LATEST
Unable to import module 'index': No module named 'nltk'

END RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c
REPORT RequestId: c660c446-e1c4-11e8-8047-15f59f1e002c  Duration: 2.10 ms   Billed Duration: 100 ms     Memory Size: 128 MB Max Memory Used: 21 MB  

However, when I check, it installs fine in CodeBuild:

[Container] 2018/11/06 12:45:06 Running command pip install -U nltk
Collecting nltk
 Downloading https://files.pythonhosted.org/packages/50/09/3b1755d528ad9156ee7243d52aa5cd2b809ef053a0f31b53d92853dd653a/nltk-3.3.0.zip (1.4MB)
Requirement already up-to-date: six in /usr/local/lib/python2.7/site-packages (from nltk)
Building wheels for collected packages: nltk
 Running setup.py bdist_wheel for nltk: started
 Running setup.py bdist_wheel for nltk: finished with status 'done'
 Stored in directory: /root/.cache/pip/wheels/d1/ab/40/3bceea46922767e42986aef7606a600538ca80de6062dc266c
Successfully built nltk
Installing collected packages: nltk
Successfully installed nltk-3.3

Here is the actual Python code:

import json
import datetime
import nltk
from nltk.corpus import wordnet as wn

Here is the YML file:

version: 0.2

phases:
  install:
    commands:

      # Upgrade AWS CLI to the latest version
      - pip install --upgrade awscli

      # Install nltk & WordNet
      - pip install -U nltk
      - python -m nltk.downloader wordnet

  pre_build:
    commands:

      # Discover and run unit tests in the 'tests' directory. For more information, see <https://docs.python.org/3/library/unittest.html#test-discovery>
      # - python -m unittest discover tests

  build:
    commands:

      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml

artifacts:
  type: zip
  files:
    - template-export.yml

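Does anyone know why it installs fine in CodeBuild, but the nltk module can't be found in Lambda? For reference, if I just remove NLTK, the code runs fine in Lambda.

I have a feeling this is a YML file problem, but since NLTK installs fine, I'm not sure what the cause is.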

2 answers:

Answer 0: (score: 4)

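NLTK is only installed locally, on the machine that runs the CodeBuild job. You need to copy NLTK into the CloudFormation deployment package. Your buildspec.yml would look something like this: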

install:
  commands:

  # Upgrade AWS CLI to the latest version
  - pip install --upgrade awscli

pre_build:
  commands:
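  - virtualenv /venv
  # Activate it so the pip install below targets /venv (relies on buildspec version 0.2, where commands share one shell)
  - . /venv/bin/activate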

  # Install nltk & WordNet
  - pip install -U nltk
  - python -m nltk.downloader wordnet

build:
  commands:
  - cp -r /venv/lib/python3.6/site-packages/. ./

  # Use AWS SAM to package the application by using AWS CloudFormation
  - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml
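A quick way to confirm the copy step worked is to check the package directory before running aws cloudformation package. This is only a minimal sketch: the helper name missing_from_bundle is made up for illustration, and it assumes dependencies are copied to the top level of the directory that gets zipped.

```python
import os


def missing_from_bundle(bundle_dir, required):
    """Return the names in `required` that are absent from the top level of bundle_dir.

    A dependency counts as present if it exists either as a package
    directory (e.g. nltk/) or as a single-file module (e.g. six.py).
    """
    missing = []
    for name in required:
        as_package = os.path.isdir(os.path.join(bundle_dir, name))
        as_module = os.path.isfile(os.path.join(bundle_dir, name + ".py"))
        if not (as_package or as_module):
            missing.append(name)
    return missing
```

Calling missing_from_bundle(".", ["nltk"]) from the project root after the cp step should return an empty list. If it returns ["nltk"], the package never made it into the bundle and Lambda will fail with the same "No module named 'nltk'" error.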

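Answer 1: (score: 2)

OK, thanks to laika for pointing me in the right direction.

Here is a working deployment of NLTK and WordNet to Lambda via CodeStar / CodeBuild. Things to note:

1) You can't use source venv/bin/activate, because it isn't POSIX compliant. Use . venv/bin/activate instead, as shown below.

2) You must set the path for the NLTK data, as shown in the Define Directories section.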

buildspec.yml

version: 0.2

phases:
  install:
    commands:

      # Upgrade AWS CLI & PIP to the latest version
      - pip install --upgrade awscli
      - pip install --upgrade pip

      # Define Directories
      - export HOME_DIR=`pwd`
      - export NLTK_DATA=$HOME_DIR/nltk_data

  pre_build:
    commands:
      - cd $HOME_DIR

      # Create VirtualEnv to package for lambda
      - virtualenv venv
      - . venv/bin/activate

      # Install Supporting Libraries
      - pip install -U requests

      # Install WordNet
      - pip install -U nltk
      - python -m nltk.downloader -d $NLTK_DATA wordnet

      # Output Requirements
      - pip freeze > requirements.txt

      # Unit Tests
      # - python -m unittest discover tests

  build:
    commands:
      - cd $HOME_DIR
      - mv $VIRTUAL_ENV/lib/python3.6/site-packages/* .

      # Use AWS SAM to package the application by using AWS CloudFormation
      - aws cloudformation package --template template.yml --s3-bucket $S3_BUCKET --output-template template-export.yml

artifacts:
  type: zip
  files:
    - template-export.yml
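The buildspec sets NLTK_DATA only for the build. It doesn't show how the function finds the corpus at runtime, so here is one possible way to do it inside index.py. This is a sketch under assumptions, not something from the setup above: it assumes the nltk_data directory ends up at the root of the deployment package, and the helper name bundled_nltk_data_dir and the "word" event key are made up for illustration. LAMBDA_TASK_ROOT is the environment variable Lambda sets to the directory containing the function code.

```python
import os


def bundled_nltk_data_dir(environ=os.environ):
    """Return the directory where the packaged nltk_data is expected to be.

    Assumption: nltk_data was bundled at the root of the deployment package.
    Falls back to the current directory when run outside Lambda.
    """
    task_root = environ.get("LAMBDA_TASK_ROOT", os.getcwd())
    return os.path.join(task_root, "nltk_data")


def handler(event, context):
    # Imported inside the handler so this file can be loaded without nltk
    # installed; a missing package then shows up on the first invocation.
    import nltk

    data_dir = bundled_nltk_data_dir()
    if data_dir not in nltk.data.path:
        # nltk.data.path is the list of directories NLTK searches for corpora
        nltk.data.path.append(data_dir)

    from nltk.corpus import wordnet as wn

    # "word" is a hypothetical event field used only for this example
    synsets = wn.synsets(event.get("word", "dog"))
    return {"synset_count": len(synsets)}
```

Setting an NLTK_DATA environment variable on the function itself should have the same effect without touching the code, if you prefer to keep the handler clean.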

LMK if there are any improvements. It works for me.