Installing snowflake-connector-python[pandas] in an Azure Function

Time: 2020-07-30 07:50:02

Tags: python azure-functions snowflake-cloud-data-platform

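I am trying to deploy Python code that uses the Snowflake Python connector with the pandas extras to an Azure Function, using the VS Code extension. The function runs fine locally, and the deployment itself also succeeds.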

The requirements.txt listing the dependencies installed during deployment is as follows:

azure-functions
xlrd
numpy
pandas
azure-storage-blob
snowflake-connector-python

The imports in the code are as follows:

import numpy as np
import pandas as pd
import azure.storage.blob
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
import snowflake.connector
from snowflake.connector import DictCursor
#from snowflake.connector.pandas_tools import write_pandas

With these requirements, everything works.

The problem appears as soon as I change requirements.txt to include the pandas extras of the Snowflake connector (which my code needs):

azure-functions
xlrd
numpy
pandas
azure-storage-blob
snowflake-connector-python[pandas]

When I try to execute the function, I get the following error message:

Result: Failure
Exception: KeyError: 'snowflake-connector-python'
Stack:
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/dispatcher.py", line 262, in _handle__function_load_request
    func = loader.load_function(
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 32, in call
    return func(*args, **kwargs)
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/loader.py", line 76, in load_function
    mod = importlib.import_module(fullmodname)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/site/wwwroot/HttpTrigger1/__init__.py", line 23, in <module>
    import snowflake.connector
  File "/home/site/wwwroot/.python_packages/lib/site-packages/snowflake/connector/__init__.py", line 17, in <module>
    from .connection import SnowflakeConnection
  File "/home/site/wwwroot/.python_packages/lib/site-packages/snowflake/connector/connection.py", line 47, in <module>
    from .cursor import SnowflakeCursor, LOG_MAX_QUERY_LENGTH
  File "/home/site/wwwroot/.python_packages/lib/site-packages/snowflake/connector/cursor.py", line 48, in <module>
    from .arrow_result import ArrowResult
  File "arrow_result.pyx", line 16, in init snowflake.connector.arrow_result
  File "/home/site/wwwroot/.python_packages/lib/site-packages/snowflake/connector/options.py", line 35, in <module>
    _pandas_extras = pkg_resources.working_set.by_key['snowflake-connector-python']._dep_map['pandas']

Any help or ideas on how to get the function running correctly? Thanks!

1 Answer:

Answer 0 (score: 1)

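This looks like a bug in the connector's pandas tools; I ran into the same problem. Depending on what you intend to use the pandas tools for, you may be able to work around it.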
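The traceback ends in a lookup of the connector's own distribution metadata by name, and the KeyError means that lookup found nothing. As a rough diagnostic, the sketch below checks whether distribution metadata is visible to the running interpreter. It uses the standard-library importlib.metadata (Python 3.8+) rather than pkg_resources, so it is a hint and not a reproduction of the failing call; the helper name distribution_version is made up for this example.

```python
from importlib import metadata  # standard library in Python 3.8+


def distribution_version(name: str):
    """Return the installed version of a distribution, or None if no
    metadata for it is visible on the current sys.path."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Log this from inside the deployed function to compare with a local run.
    for dist in ("snowflake-connector-python", "pandas"):
        print(dist, "->", distribution_version(dist))
```

If this prints None for snowflake-connector-python in the deployed function but a version number locally, the package metadata is probably not visible in the deployed environment, which would be consistent with the KeyError above.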

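My function relied on fetch_pandas_all() when querying Snowflake, but I was able to work around the issue by installing only the base snowflake-connector-python package together with a directly installed pandas, and then changing the code to the following: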

import logging
import snowflake.connector
import pandas as pd

class SnowflakeReader(object):
    def __init__(self, accnt: str, warehouse: str, dbuser: str, pw: str, db: str, dbschema: str):
        self.accnt = accnt
        self.warehouse = warehouse
        self.db = db
        self.dbschema = dbschema
        self.dbuser = dbuser
        self.pw = pw


    def form_connection(self):
        try:
            ctx = snowflake.connector.connect(
                user = self.dbuser,
                password = self.pw,
                account = self.accnt,
                warehouse = self.warehouse,
                database = self.db,
                schema = self.dbschema
            )
        except Exception as e:
            logging.error('Failed to connect to Snowflake: ' + str(e))
            raise  # re-raise: ctx is unbound if connect() failed
        return ctx

    def read_all(self, query: str, conn) -> pd.DataFrame:
        self.query = query
        self.conn = conn

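        cs = conn.cursor()
        try:
            cs.execute(self.query)
            result = cs.fetchall()
            # Label the columns using the DB-API cursor description;
            # fetchall() alone returns bare tuples without names.
            columns = [col[0] for col in cs.description]
            result_df = pd.DataFrame(result, columns=columns)
        except Exception as e:
            logging.error('Failed to read from Snowflake: ' + str(e))
            raise  # re-raise: result_df is unbound if the query failed
        finally:
            cs.close()

        return result_df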

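You appear to be trying to import write_pandas, which makes me think you want to write a DataFrame directly to a table in Snowflake. In that case, you could build a similar workaround using the older SQLAlchemy-based approach, or convert the DataFrame to a dict and pass it to an INSERT statement through the execute() function.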
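A minimal sketch of the dict-based route, under some assumptions: the records are what df.to_dict("records") would produce, the target table already exists, and the connection follows the Python DB-API. The helper names build_insert and insert_records are made up for this example. It uses executemany() rather than a loop over execute(), and the placeholder defaults to "%s" on the assumption that the Snowflake connector is left in its default pyformat parameter style. Table and column names are interpolated into the SQL as-is, so they must come from trusted code, never from user input.

```python
from typing import Any, Dict, List, Tuple


def build_insert(table: str, records: List[Dict[str, Any]],
                 placeholder: str = "%s") -> Tuple[str, List[Tuple[Any, ...]]]:
    """Build a parameterized INSERT statement and its row tuples.

    `records` is a list of {column: value} dicts, e.g. df.to_dict("records").
    """
    if not records:
        raise ValueError("no records to insert")
    columns = list(records[0].keys())
    col_list = ", ".join(columns)
    marks = ", ".join([placeholder] * len(columns))
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({marks})"
    # Keep values in the same column order for every row.
    rows = [tuple(rec[col] for col in columns) for rec in records]
    return sql, rows


def insert_records(conn, table: str, records: List[Dict[str, Any]],
                   placeholder: str = "%s") -> int:
    """Insert the records through any DB-API connection; return the row count."""
    sql, rows = build_insert(table, records, placeholder)
    cur = conn.cursor()
    try:
        cur.executemany(sql, rows)
    finally:
        cur.close()
    return len(rows)
```

With the SnowflakeReader class above, this would be called roughly as insert_records(ctx, "MY_TABLE", df.to_dict("records")), where MY_TABLE is a placeholder name. Values that are NumPy or pandas scalar types may need converting to plain Python types before binding.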