Reading an Excel file from Blob storage into an Azure Function

Time: 2020-09-08 01:31:59

Tags: python azure azure-functions azure-storage-blobs azure-function-app

I am creating a function in Azure Functions with the following bindings:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "msg",
      "type": "queueTrigger",
      "direction": "in",
      "queueName": "payroll-excel",
      "connection": "Payroll_test_connection"
    },
    {
      "name": "inputblob",
      "type": "blob",
      "path": "payroll-excel/{queueTrigger}",
      "connection": "Payroll_test_connection",
      "direction": "in"
    }
  ]
}

The function is triggered by a queue message that contains the path of an Excel file stored in a blob container. My __init__.py then looks like this:

import logging
import pandas as pd
import os,io
import azure.functions as func


def main(msg: func.QueueMessage,
        inputblob: func.InputStream) -> None:
    
    filename=msg.get_body().decode('utf-8')
    logging.info('Python queue trigger function processed a queue item: %s',
                 filename)
    
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {inputblob.name}\n"
                 f"Blob Size: {inputblob.length} bytes")
    Raw_file = inputblob.read()
    logging.info (type(Raw_file))
    df = pd.read_excel(io.BytesIO(Raw_file))
    logging.info (df.head())

I am trying to read the Excel file so that I can use it in my program, but I get the following error:

System.Private.CoreLib: Result: Failure
Exception: BadZipFile: Bad magic number for central directory
Stack:   File "/usr/lib/azure-functions-core-tools/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 312, in _handle__invocation_request
self.__run_sync_func, invocation_id, fi.func, args)
File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/azure-functions-core-tools/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 431, in __run_sync_func
return func(**params)
File "/mnt/c/repos////__init__.py", line 19, in main
df = pd.read_excel(io.BytesIO(Raw_file))
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/util/_decorators.py", line 296, in wrapper
return func(*args, **kwargs)
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
io = ExcelFile(io, engine=engine)
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 867, in __init__
self._reader = self._engines[engine](self._io)
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/io/excel/_xlrd.py", line 22, in __init__
super().__init__(filepath_or_buffer)
 File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/io/excel/_base.py", line 351, in __init__
self.book = self.load_workbook(filepath_or_buffer)
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/pandas/io/excel/_xlrd.py", line 35, in load_workbook
return open_workbook(file_contents=data)
File "/mnt/c/repos///.venv/lib/python3.6/site-packages/xlrd/__init__.py", line 115, in open_workbook
zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
File "/usr/lib/python3.6/zipfile.py", line 1131, in __init__
self._RealGetContents()
File "/usr/lib/python3.6/zipfile.py", line 1226, in _RealGetContents
raise BadZipFile("Bad magic number for central directory")

Any ideas on what I need to do?

Thanks

1 answer:

Answer 0: (score: 0)

I tested this locally and it seems to work fine; my code did not raise any exception.


I searched Google for this exception and found that it is usually caused by importing a stale .pyc file, typically because of a version mismatch.

Solution

You can delete the old .pyc files and then let the .py files be recompiled:

find . -name '*.pyc' -delete
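If find is not available (for example on Windows without WSL), the same cleanup can be sketched in Python. This is only an illustration: delete_pyc_files is a hypothetical helper name, not part of any library, and it assumes you run it from the project root.

```python
from pathlib import Path
from typing import List


def delete_pyc_files(root: str = ".") -> List[Path]:
    """Delete every *.pyc file below root and return the removed paths.

    Hypothetical helper that mirrors: find . -name '*.pyc' -delete
    """
    removed = []
    # Materialise the matches first so files are not deleted mid-walk.
    for pyc in list(Path(root).rglob("*.pyc")):
        if pyc.is_file():
            pyc.unlink()
            removed.append(pyc)
    return removed
```

Calling delete_pyc_files(".") from the project root removes the cached bytecode; Python recompiles the modules the next time they are imported.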

I found a post, "What's the bad magic number error?", where you can learn more about the bad magic number exception.

If this does not solve your problem, please provide more information so that we can analyze it.
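One way to gather that information is to log what the bytes returned by inputblob.read() actually look like before handing them to pandas. The sketch below uses only the standard library; describe_excel_bytes is a hypothetical helper name, and the container labels are just the well-known ZIP and OLE2 file signatures, not a full validation of the workbook.

```python
import io
import zipfile

ZIP_SIGNATURE = b"PK\x03\x04"                       # start of a zip archive (.xlsx)
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # start of an OLE2 file (.xls)


def describe_excel_bytes(data: bytes) -> dict:
    """Summarise raw bytes so a damaged or mis-decoded download is easier to spot.

    Hypothetical diagnostic helper; it does not parse the workbook itself.
    """
    head = data[:8]
    if head[:4] == ZIP_SIGNATURE:
        container = "zip (xlsx-style)"
    elif head == OLE2_SIGNATURE:
        container = "ole2 (xls-style)"
    else:
        container = "unknown"

    # Try to open the archive and CRC-check every member.
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            zip_ok = archive.testzip() is None
    except zipfile.BadZipFile:
        zip_ok = False

    return {
        "length": len(data),
        "head_hex": head.hex(),
        "container": container,
        "zip_ok": zip_ok,
    }
```

Inside the function this could be logged with logging.info(describe_excel_bytes(Raw_file)). Comparing "length" with the blob size shown in the portal, and checking whether "zip_ok" is False even though the header looks like a zip, would indicate whether the bytes were altered before they reached pandas.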