Converting PDF pages to images with pdf2image in Python on AWS Lambda

Time: 2019-11-13 10:55:13

Tags: python aws-lambda

Lambda handler code:

from pdf2image import convert_from_path, convert_from_bytes


def lambda_handler(event, context):
    # TODO implement
    f = "967.pdf"
    images = convert_from_path(f,dpi=150)

    return {
        'statusCode': 200,
        'body': images
    }

I get this error:

   {
     "errorMessage": "Unable to get page count. Is poppler installed and in PATH?",
     "errorType": "PDFInfoNotInstalledError",
     "stackTrace": [
       "  File \"/var/task/lambda_function.py\", line 15, in lambda_handler\n    images = convert_from_path(f,dpi=150,poppler_path=poppler_path)\n",
       "  File \"/opt/python/pdf2image/pdf2image.py\", line 80, in convert_from_path\n    page_count = _page_count(pdf_path, userpw, poppler_path=poppler_path)\n",
       "  File \"/opt/python/pdf2image/pdf2image.py\", line 355, in _page_count\n    \"Unable to get page count. Is poppler installed and in PATH?\"\n"
     ]
   }

1 answer:

Answer 0 (score: 0)

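Poppler is not installed on Lambda, so you have to package it with your deployment. Since this causes a lot of trouble, I created a repository for the process: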

https://github.com/Belval/pdf2image-as-a-service

If for some reason you don't want to use the above, here are the general steps to build poppler and include it in your package:

  1. Build poppler
  2. Move the bin/ directory and libpoppler into a specific directory in your package
  3. Edit your code to pass that directory as poppler_path
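
As a rough illustration of step 3, here is a minimal sketch of a handler that points pdf2image at the packaged binaries. The `/opt/bin` default, the `POPPLER_PATH` environment variable, and the helper names `missing_binaries` and `encode_images` are assumptions made for this example, not something the repository prescribes; adjust the path to wherever your package or layer actually places `bin/`. The sketch also base64-encodes the pages, since PIL images cannot be returned directly in a JSON response body.

```python
import base64
import io
import json
import os

# Assumption: the poppler binaries were packaged under this directory.
# Override it with a (hypothetical) POPPLER_PATH environment variable.
POPPLER_PATH = os.environ.get("POPPLER_PATH", "/opt/bin")


def missing_binaries(poppler_path, names=("pdfinfo", "pdftoppm")):
    """Return the poppler tools that are absent or not executable."""
    return [
        name for name in names
        if not os.access(os.path.join(poppler_path, name), os.X_OK)
    ]


def encode_images(images, fmt="PNG"):
    """Serialize PIL images to base64 strings so they fit in a JSON body."""
    encoded = []
    for image in images:
        buffer = io.BytesIO()
        image.save(buffer, format=fmt)
        encoded.append(base64.b64encode(buffer.getvalue()).decode("ascii"))
    return encoded


def lambda_handler(event, context):
    # Imported here so the helpers above can be used without pdf2image.
    from pdf2image import convert_from_path

    missing = missing_binaries(POPPLER_PATH)
    if missing:
        # Fail with a clearer message than PDFInfoNotInstalledError.
        return {
            "statusCode": 500,
            "body": json.dumps({"missing": missing, "path": POPPLER_PATH}),
        }

    images = convert_from_path("967.pdf", dpi=150, poppler_path=POPPLER_PATH)
    return {"statusCode": 200, "body": json.dumps(encode_images(images))}
```

If `missing_binaries` reports nothing but the conversion still fails, the shared library (libpoppler) may not be in a location the loader searches, which is what step 2 is meant to address.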

Alternatively, you can just read the script in as-a-function/amazon/lambda.sh