Listing virtual folders in Azure Blob Storage via the Python API

Time: 2015-11-02 12:33:35

Tags: python azure azure-storage-blobs

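I am reading this tutorial, but I cannot find a way to list all (virtual) folders under a container without fetching every file. I have 26K files in 500 (virtual) folders. I only want the list of folders, without waiting several minutes for list_blobs to return the entire file list. Is there a way to do this? Or at least a way to tell list_blobs not to go deeper than n levels below the container?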

3 answers:

Answer 0 (score: 3)

You can try something like the following:

from azure.storage import BlobService  # legacy azure-storage SDK

blob_service = BlobService(account_name='account-name', account_key='account-key')

# delimiter='/' asks the service to group blobs by their first path segment
bloblistingresult = blob_service.list_blobs(container_name='container-name', delimiter='/')
for i in bloblistingresult.prefixes:
    print(i.name)  # prints the names of the virtual folders

SDK Source Code Reference: BlobService.list_blobs()
SDK Source Code Reference: BlobService.list_blobs().prefixes
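
To make the effect of delimiter='/' more concrete, here is a small local sketch that mimics, on an in-memory list of blob names, how a delimiter listing collapses names that share a path segment into a single prefix. The function name group_by_delimiter and the sample names are made up for illustration; this is not SDK code and it makes no network calls.

```python
def group_by_delimiter(blob_names, prefix="", delimiter="/"):
    """Mimic a delimiter listing: return (prefixes, blobs) one level below `prefix`."""
    prefixes = []
    blobs = []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The name continues past a delimiter, so it collapses into a virtual folder
            folder = prefix + head + delimiter
            if folder not in prefixes:
                prefixes.append(folder)
        else:
            # No further delimiter: this is a blob directly at this level
            blobs.append(name)
    return prefixes, blobs


# Hypothetical blob names, for illustration only
names = ["a/1.txt", "a/2.txt", "b/c/3.txt", "root.txt"]
folders, files = group_by_delimiter(names)
print(folders)
print(files)
```

With a real container the grouping happens on the service side, which is why asking for prefixes can be much cheaper than downloading the full blob list and splitting names locally.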

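Answer 1 (score: 1)

@Gaurav Mantri pointed out the right way to get the list of BlobPrefix elements. We can build on it to create a function that meets your requirement:

For example, I have 4 levels in my directory tree: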

from azure.storage.blob import BlobService  # legacy azure-storage SDK

blob_service = BlobService(account_name='<account_name>', account_key='<account_key>')

def getfolders(container_name, depth=1):
    """Return virtual folder prefixes up to `depth` levels below the container."""
    delimiter = '/'
    result = []
    searched = []
    # First level: prefixes directly under the container
    blob_list = blob_service.list_blobs(container_name, delimiter=delimiter)
    result.extend(str(l.name) for l in blob_list.prefixes)
    depth -= 1
    while depth > 0:
        # Expand every prefix that has not been listed yet
        for p in [item for item in result if item not in searched]:
            searched.append(p)
            blob_list = blob_service.list_blobs(container_name, prefix=p, delimiter=delimiter)
            result.extend(str(l.name) for l in blob_list.prefixes)
        depth -= 1
    return result

blob_list = getfolders('container-name', 4)
print(','.join(str(p) for p in blob_list))

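Answer 2 (score: 1)

Maybe it is not too late for someone. In the newer SDK (the one that provides BlobServiceClient), list_blobs does not accept a delimiter parameter. Instead, use walk_blobs (doc) to get a generator over the entries. With delimiter="/", you get only the next sub-level of files/folders:

For example: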

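from azure.storage.blob import BlobServiceClient

# file_connect_str and container_name must be defined beforehand
blob_service_client = BlobServiceClient.from_connection_string(file_connect_str)
container_client = blob_service_client.get_container_client(container_name)
# Lists only the entries directly under my_folder/
for file in container_client.walk_blobs('my_folder/', delimiter='/'):
    print(file.name)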

This returns:

"my_folder/sub_folder_1/"
"my_folder/sub_folder_2/"