Download an XLSX file from Azure Blob Storage in Python

Time: 2019-12-05 09:05:56

Tags: python excel pandas azure download

from azure.storage.blob import BlockBlobService
block_blob_service = BlockBlobService(account_name=AZURE_ACCOUNT_NAME, account_key=AZURE_ACCOUNT_KEY)
file = block_blob_service.get_blob_to_bytes(AZURE_CONTAINER, "CS_MDMM_Global.xlsx")
file.content  # The issue is on this line: it gives me the data in some encoded form. I want to decode it and store it in a pandas DataFrame.

I get encoded data from the blob, but I can't figure out how to decode it into a pandas DataFrame.

1 Answer:

Answer 0: (score: 0)

It sounds like you want to use pandas to read the content of an xlsx blob stored in Azure Blob Storage and get a pandas DataFrame.
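The data in `file.content` looks "encoded" because an .xlsx file is a ZIP archive of XML parts, so the blob content is raw binary rather than text. The following is a minimal sketch using only the standard library to confirm that a bytes object is such an archive. The helper name `looks_like_xlsx` is hypothetical, and the check assumes the workbook part sits at the conventional path `xl/workbook.xml`.

```python
import io
import zipfile


def looks_like_xlsx(data: bytes) -> bool:
    """Return True if `data` is a ZIP archive that contains xl/workbook.xml."""
    buffer = io.BytesIO(data)
    # Anything that is not a ZIP archive cannot be an .xlsx workbook.
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        # Assumption: the main workbook part is stored at its conventional path.
        return "xl/workbook.xml" in archive.namelist()
```

If `looks_like_xlsx(file.content)` returns True, the download itself is fine and the bytes only need to be handed to an Excel parser.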

I have a sample xlsx file stored in my Azure Blob Storage, with the content shown in the image below.

[image: content of the sample xlsx file]

So I will read it directly using the Azure Storage SDK for Python together with pandas. The first step is to install the packages below.

pip install pandas azure-storage xlrd

Here is my sample code.

# Generate a url of excel blob with sas token
from azure.storage.blob.baseblobservice import BaseBlobService
from azure.storage.blob import BlobPermissions
from datetime import datetime, timedelta

account_name = '<your storage account name>'
account_key = '<your storage key>'
container_name = '<your container name>'
blob_name = '<your excel blob>'

blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)

# The SAS token grants read-only access and expires after one hour
sas_token = blob_service.generate_blob_shared_access_signature(
    container_name,
    blob_name,
    permission=BlobPermissions.READ,
    expiry=datetime.utcnow() + timedelta(hours=1)
)
blob_url_with_sas = blob_service.make_blob_url(container_name, blob_name, sas_token=sas_token)

# pass the blob url with sas to function `read_excel`
import pandas as pd
df = pd.read_excel(blob_url_with_sas)
print(df)

The result is:

[image: printed DataFrame output]

In fact, your question duplicates another SO thread, Read in azure blob using python; please see my answer there for more details.
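As an alternative to the SAS URL, the bytes already returned by `get_blob_to_bytes` in the question can probably be parsed directly, because `pd.read_excel` also accepts a file-like object. The following is a minimal sketch under that assumption. The helper names `excel_bytes_to_buffer` and `excel_bytes_to_dataframe` are hypothetical and not part of pandas or the Azure SDK.

```python
import io


def excel_bytes_to_buffer(content: bytes) -> io.BytesIO:
    """Wrap raw blob bytes in a seekable, file-like object."""
    return io.BytesIO(content)


def excel_bytes_to_dataframe(content: bytes, sheet_name=0):
    """Parse raw .xlsx bytes into a pandas DataFrame (first sheet by default)."""
    # Imported here so that the buffer helper stays usable without pandas installed.
    import pandas as pd

    return pd.read_excel(excel_bytes_to_buffer(content), sheet_name=sheet_name)
```

With the code from the question this would be called as `df = excel_bytes_to_dataframe(file.content)`. Note that xlrd 2.0 and later no longer read .xlsx files, so on a current pandas installation `openpyxl` needs to be installed for this to work.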