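Can anyone tell me whether it is possible to read a csv file directly from Azure blob storage as a stream and process it with Python? I know it can be done in C#.Net (shown below), but I would like to know the equivalent library in Python.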
CloudBlobClient client = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = client.GetContainerReference("outfiles");
CloudBlob blob = container.GetBlobReference("Test.csv");
Answer 0: (score: 5)
Yes, this is certainly possible. Take a look at the Azure Storage SDK for Python:
from azure.storage.blob import BlockBlobService
block_blob_service = BlockBlobService(account_name='myaccount', account_key='mykey')
block_blob_service.get_blob_to_path('mycontainer', 'myblockblob', 'out-sunset.png')
You can read the full SDK documentation here: http://azure-storage.readthedocs.io.
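Note that `get_blob_to_path` writes the blob to a local file. If the goal is to parse the CSV in memory instead, one possible sketch with the same legacy SDK follows. It assumes the `get_blob_to_text` method of `BlockBlobService`, which returns an object whose `content` attribute holds the decoded text. The helper names `parse_csv_text` and `read_blob_csv` are made up for illustration.

```python
import csv
import io


def parse_csv_text(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


def read_blob_csv(block_blob_service, container_name, blob_name):
    """Download a blob as text with the legacy SDK and parse it as CSV.

    `block_blob_service` is assumed to be a BlockBlobService instance.
    """
    blob = block_blob_service.get_blob_to_text(container_name, blob_name)
    return parse_csv_text(blob.content)
```

With the service object from the answer, a call might look like `rows = read_blob_csv(block_blob_service, 'mycontainer', 'Test.csv')`.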
Answer 1: (score: 2)
Provide your Azure Storage account name as account_name and your key as account_key here:
block_blob_service = BlockBlobService(account_name='$$$$$$', account_key='$$$$$$')
This will download the blob and save it in the current location as 'output.jpg':
block_blob_service.get_blob_to_path('your-container-name', 'your-blob', 'output.jpg')
This will get the text/item from the blob:
blob_item= block_blob_service.get_blob_to_bytes('your-container-name','blob-name')
blob_item.content
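Since `blob_item.content` is raw bytes, a CSV blob still has to be decoded and parsed. A minimal sketch using only the standard library is below. The helper name `csv_bytes_to_dicts` and the UTF-8 default are assumptions, and the literal bytes stand in for `blob_item.content`.

```python
import csv
import io


def csv_bytes_to_dicts(data, encoding="utf-8"):
    """Decode raw CSV bytes and return one dict per row, keyed by the header."""
    text = data.decode(encoding)
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Literal bytes standing in for blob_item.content
sample = b"name,qty\napple,3\npear,5\n"
rows = csv_bytes_to_dicts(sample)
print(rows[0]["name"], rows[1]["qty"])  # prints: apple 5
```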
Answer 2: (score: 0)
You can stream from a blob in Python like this:
from tempfile import NamedTemporaryFile
from azure.storage.blob.blockblobservice import BlockBlobService
entry_path = conf['entry_path']
container_name = conf['container_name']
blob_service = BlockBlobService(
    account_name=conf['account_name'],
    account_key=conf['account_key'])
def get_file(filename):
    local_file = NamedTemporaryFile()
    blob_service.get_blob_to_stream(container_name, filename, stream=local_file,
                                    max_connections=2)
    local_file.seek(0)
    return local_file
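The file returned by `get_file` is opened in binary mode, so it needs a text wrapper before the `csv` module can read it. The sketch below shows one way to do that. `iter_csv_rows` is a hypothetical helper, and the temporary file written locally is only a stand-in for a downloaded blob.

```python
import csv
import io
from tempfile import NamedTemporaryFile


def iter_csv_rows(binary_file, encoding="utf-8"):
    """Yield CSV rows from a binary file object positioned at its start."""
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    try:
        yield from csv.reader(text)
    finally:
        text.detach()  # leave the underlying file open for the caller


# Local stand-in for the file returned by get_file()
with NamedTemporaryFile() as tmp:
    tmp.write(b"id,value\n1,10\n2,20\n")
    tmp.seek(0)
    demo_rows = list(iter_csv_rows(tmp))
```

With the answer's function, the call would be along the lines of `for row in iter_csv_rows(get_file('Test.csv')): ...`.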
Answer 3: (score: 0)
Here is one way to do it using the new version of the SDK (12.0.0):
from azure.storage.blob import BlobClient
blob = BlobClient(account_url="https://<account_name>.blob.core.windows.net",
                  container_name="<container_name>",
                  blob_name="<blob_name>",
                  credential="<account_key>")
with open("example.csv", "wb") as f:
    data = blob.download_blob()
    data.readinto(f)
For details, see here.
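The snippet above saves the blob to disk. To process the CSV as a stream instead, one option is to iterate over the download in chunks and split it into lines. The sketch below assumes the `chunks()` method of the object returned by `download_blob()` in the v12 SDK. `iter_lines` and `stream_csv_rows` are hypothetical helper names.

```python
import csv


def iter_lines(chunks, encoding="utf-8"):
    """Turn an iterable of byte chunks into decoded text lines.

    Lines are split on the newline byte before decoding, so a line that
    straddles two chunks is reassembled first.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            yield line.decode(encoding) + "\n"
    if buffer:
        yield buffer.decode(encoding)


def stream_csv_rows(blob_client, encoding="utf-8"):
    """Return a csv.reader over a blob without buffering the whole file.

    `blob_client` is assumed to be an azure.storage.blob.BlobClient (v12).
    """
    downloader = blob_client.download_blob()
    return csv.reader(iter_lines(downloader.chunks(), encoding))
```

With the `blob` object from the answer, usage would be roughly `for row in stream_csv_rows(blob): ...`.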
Answer 4: (score: 0)
I recommend using smart_open.
from smart_open import open
# Depending on the smart_open version, you may also need to pass a
# BlobServiceClient via transport_params={'client': ...}.
# stream from Azure Blob Storage
with open('azure://my_container/my_file.txt') as fin:
    for line in fin:
        print(line)
# stream content *into* Azure Blob Storage (write mode):
with open('azure://my_container/my_file.txt', 'wb') as fout:
    fout.write(b'hello world')
Answer 5: (score: 0)
Here is a simple way to read a CSV from a Blob using Pandas:
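import os
import pandas as pd
from azure.storage.blob import BlobServiceClient
service_client = BlobServiceClient.from_connection_string(os.environ['AZURE_STORAGE_CONNECTION_STRING'])
client = service_client.get_container_client("your_container")
bc = client.get_blob_client(blob="your_folder/yourfile.csv")
# Download the blob to a local file
with open("yourfile.csv", 'wb') as file:
    data = bc.download_blob()
    file.write(data.readall())
# Assumed final step (not shown in the original snippet): load the file with pandas
df = pd.read_csv("yourfile.csv")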
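If the intermediate local file is not needed, the downloaded bytes can be handed to pandas through an in-memory buffer. This is a sketch under the assumption that the blob fits in memory. `blob_to_buffer` and `read_csv_from_blob` are hypothetical helpers, and `blob_client` is assumed to be a v12 `BlobClient` such as `bc` above.

```python
import io


def blob_to_buffer(blob_client):
    """Download the whole blob into an in-memory binary buffer."""
    return io.BytesIO(blob_client.download_blob().readall())


def read_csv_from_blob(blob_client, **read_csv_kwargs):
    """Load a CSV blob into a pandas DataFrame without touching the disk."""
    import pandas as pd  # imported lazily; only this helper needs pandas

    return pd.read_csv(blob_to_buffer(blob_client), **read_csv_kwargs)
```

A call such as `df = read_csv_from_blob(bc, sep=',')` would then replace the `open(...)` block.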
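Answer 6: (score: 0)
I know this is an old post, but in case anyone else wants to do this: I was able to access the blobs with the following code.
Note: you need to set AZURE_STORAGE_CONNECTION_STRING. You can get it from the Azure Portal -> go to your storage account -> Settings -> Access keys, where the connection string is shown.
For Windows: setx AZURE_STORAGE_CONNECTION_STRING ""
For Linux: export AZURE_STORAGE_CONNECTION_STRING=""
For macOS: export AZURE_STORAGE_CONNECTION_STRING=""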
import os
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__
connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
print(connect_str)
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
# get_container_client expects the name of a container in the storage account
container_client = blob_service_client.get_container_client("Your Storage Name Here")
try:
    print("\nListing blobs...")
    # List the blobs in the container
    blob_list = container_client.list_blobs()
    for blob in blob_list:
        print("\t" + blob.name)
except Exception as ex:
    print('Exception:')
    print(ex)
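Building on the listing above, the blob names can be narrowed down to CSV files before downloading anything. The sketch below assumes the `name_starts_with` parameter of `ContainerClient.list_blobs` in the v12 SDK. `csv_blob_names` is a hypothetical helper.

```python
def csv_blob_names(container_client, prefix=None):
    """Return the names of blobs ending in .csv, optionally under a prefix.

    `container_client` is assumed to be an azure.storage.blob.ContainerClient.
    """
    blobs = container_client.list_blobs(name_starts_with=prefix)
    return [blob.name for blob in blobs if blob.name.lower().endswith(".csv")]
```

Each returned name can then be passed to `container_client.get_blob_client(...)` to download that file.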