Is there a faster way to search Azure blob names with Python?

Time: 2018-07-04 05:52:55

Tags: python python-3.x azure azure-storage-blobs

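I have a list of file names that I need to search for on Azure. Right now, being a newbie, I loop over every blob name and compare strings, but I think there must be a simpler and faster way to do this. The current solution makes my HTTP responses very slow.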

def ifblob_exists(self, filename):
        try:
            container_name = 'xxx'
            AZURE_KEY = 'xxx'
            SAS_KEY = 'xxx'
            ACCOUNT_NAME = 'xxx'
            block_blob_service = BlockBlobService(account_name= ACCOUNT_NAME, account_key= None, sas_token = SAS_KEY, socket_timeout= 10000)

            generator = block_blob_service.list_blobs(container_name)
            for blob in generator:
                if filename == blob.name:
                    print("\t Blob exists :"+" "+blob.name)
                    return True
                else:
                    print('Blob does not exists '+filename)
                    return False
        except Exception as e:
            print(e)

2 Answers:

Answer 0 (score: 1)

Use the exists method of the Azure Storage Python SDK.

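from azure.storage.blob import BlockBlobService

def ifblob_exists(filename):
    try:
        container_name = '***'

        # accountName and accountKey are assumed to be defined elsewhere
        block_blob_service = BlockBlobService(account_name=accountName, account_key=accountKey,
                                              socket_timeout=10000)

        # exists() checks this one blob directly instead of listing the container
        isExist = block_blob_service.exists(container_name, filename)
        if isExist:
            print("\t Blob exists :" + " " + filename)
        else:
            print("\t Blob does not exist :" + " " + filename)
        return isExist
    except Exception as e:
        print(e)
        return False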

Of course, if you have a list of file names, you still need to call the function above in a loop, once per name.
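That loop could look something like the sketch below. It assumes `service` is a `BlockBlobService` that is created once and reused (any object with an `exists(container_name, blob_name)` method will do); the helper name `check_blobs` is hypothetical and not part of the SDK.

```python
from typing import Dict, Iterable


def check_blobs(service, container_name: str, filenames: Iterable[str]) -> Dict[str, bool]:
    """Map each file name to True/False depending on whether the blob exists.

    `service` is assumed to expose exists(container_name, blob_name), as the
    legacy BlockBlobService does. Hypothetical helper, not an SDK function.
    """
    results = {}
    for name in filenames:
        # One request per name instead of listing the whole container.
        results[name] = bool(service.exists(container_name, name))
    return results
```

Building the client once and passing it in avoids constructing a new client on every call, which the function above would otherwise do for each file name.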

Hope it helps.

Answer 1 (score: 0)

Listing all blobs is very expensive in the Azure Storage infrastructure, because it translates into a full scan.
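If a listing cannot be avoided, one option is to list once and compare against a set, so the scan is paid a single time rather than once per file name. The sketch below assumes `service.list_blobs(container_name, prefix=...)` yields objects with a `.name` attribute, as the legacy `BlockBlobService` does; `find_existing` and its `prefix` argument are illustrative names, not SDK functions.

```python
from typing import Iterable, Optional, Set


def find_existing(service, container_name: str, wanted: Iterable[str],
                  prefix: Optional[str] = None) -> Set[str]:
    """Return the subset of `wanted` names present in the container.

    Hypothetical helper: lists the container once (optionally narrowed by
    `prefix`) and uses set membership instead of repeated string comparisons.
    """
    remaining = set(wanted)
    found: Set[str] = set()
    if not remaining:
        return found  # nothing to look for, so skip the listing entirely
    for blob in service.list_blobs(container_name, prefix=prefix):
        if blob.name in remaining:
            found.add(blob.name)
            remaining.discard(blob.name)
            if not remaining:
                break  # every requested name was seen; stop listing early
    return found
```

Whether this beats per-name existence checks depends on how many names are being checked relative to the size of the container, so it is worth measuring both.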

The example below shows an efficient way to check whether blobs (the file names, in your case) exist in a given container:

from azure.storage.blob import BlockBlobService
from datetime import datetime

def check_if_blob_exists(container_name: str, blob_names: list):
    start_time = datetime.now()

    if not container_name or container_name.isspace():
        raise ValueError("Container name cannot be none, empty or whitespace.")

    if not blob_names:
        raise ValueError("Block blob names cannot be none or empty.")

    # Create the client once, outside the validation branch, so it is always reached
    block_blob_service = BlockBlobService(account_name="{Storage Account Name}", account_key="{Storage Account Key}")

    for blob_name in blob_names:
        if block_blob_service.exists(container_name, blob_name):
            print("\nBlob '{0}' found!".format(blob_name))
        else:
            print("\nBlob '{0}' NOT found!".format(blob_name))

    end_time = datetime.now()

    print("\n***** Elapsed Time => {0} *****".format(end_time - start_time))

if __name__ == "__main__":
    blob_names = []

    # Exists
    blob_names.append("eula.1028.txt")
    blob_names.append("eula.1031.txt")
    blob_names.append("eula.1033.txt")
    blob_names.append("eula.1036.txt")
    blob_names.append("eula.1040.txt")

    # Don't exist
    blob_names.append("blob1")
    blob_names.append("blob2")
    blob_names.append("blob3")
    blob_names.append("blob4")

    check_if_blob_exists("containername", blob_names)

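The screenshot below shows a quick test run from my laptop in the US West (download speed about 150 Mbps and upload speed about 3.22 Mbps, according to Google Speed Test), checking whether 9 blobs exist in an LRS storage account that is also in the US West.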

[Screenshot of the test run]
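Each `exists` call in the loop of this answer waits for its own HTTP round trip, so for longer lists the checks could be issued concurrently. Below is a minimal sketch using the standard library's `ThreadPoolExecutor`. It assumes the client object can safely be shared across threads (if in doubt, create one client per worker); `check_blobs_concurrently` and the default of 8 workers are arbitrary illustrative choices, not SDK features.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable


def check_blobs_concurrently(service, container_name: str,
                             blob_names: Iterable[str],
                             max_workers: int = 8) -> Dict[str, bool]:
    """Run service.exists() for every name in a thread pool.

    Hypothetical helper; `service` is assumed to expose
    exists(container_name, blob_name) and to tolerate concurrent use.
    """
    names = list(blob_names)
    if not names:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with `names`.
        flags = pool.map(lambda n: bool(service.exists(container_name, n)), names)
        return dict(zip(names, flags))
```

The result is a plain dictionary, so the caller decides how to report missing blobs instead of the function printing inside the loop.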