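I need to read text files from blob storage line by line, perform some operations on them, and add specific lines to a DataFrame. I have tried several ways of reading the file line by line. Is there a way to read a text file from a blob line by line, process it, and output specific lines, the way readlines() does when the data is in local storage?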
from azure.storage.blob import BlockBlobService

candidate_resume = 'candidateresumetext'
block_blob_service = BlockBlobService(account_name='nam', account_key='key')
generator2 = block_blob_service.list_blobs(candidate_resume)
#for blob in generator2:
    #print(blob.name)
for blob in generator2:
    blob2 = block_blob_service.get_blob_to_text(candidate_resume, blob.name)
    #print(blob2)
    #blob_url = block_blob_service.make_blob_url(candidate_resume, blob.name)
    #print(blob_url)
    #blob3 = block_blob_service.get_blob_to_stream(candidate_resume, blob.name, range)
    blob3 = blob2.split('.')
    with open(blob.name, encoding='utf-8') as file:
        lines = file.readlines()
        for line in blob3:
            if any(p in years_list for p in line):
                if any(p in months_list for p in line):
                    print(line)
Answer 0 (score: 0)
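get_blob_to_text is the right method to use. It returns a Blob object whose content attribute holds the text, so split that string into lines rather than calling split() on the object itself. You can follow the sample code below and adapt it if it does not quite fit your needs. Note that you cannot use with open() as file here, because there is no real file on the local disk.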
# Read the content of the blob (assuming it is a .txt file).
# container_name and blob_name are placeholders for your own values.
str1 = block_blob_service.get_blob_to_text(container_name, blob_name)
# Split the text into a list of lines.
arr1 = str1.content.splitlines()
# Process one line at a time.
for a1 in arr1:
    print(a1)
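To connect this back to the filtering in the question, here is a minimal sketch of selecting lines from text that is already in memory. It makes a few assumptions that are not in the original post: the helper name matching_lines is made up for illustration, years_list and months_list are taken to be lists of strings, and a line is split into words before matching (iterating over a string directly, as in the question, yields single characters). The sample text stands in for the content attribute returned by get_blob_to_text, so the snippet runs without an Azure account.

```python
import io


def matching_lines(text, years_list, months_list):
    """Return the lines that mention at least one year and one month."""
    matches = []
    # StringIO gives the in-memory string a file-like interface, so it can be
    # iterated line by line much like a file opened with open().
    for raw_line in io.StringIO(text):
        line = raw_line.rstrip("\r\n")
        # Split into words; commas are treated as separators (an assumption).
        words = line.replace(",", " ").split()
        has_year = any(word in years_list for word in words)
        has_month = any(word in months_list for word in words)
        if has_year and has_month:
            matches.append(line)
    return matches


# Stand-in for the blob's text; in the real script this would be
# block_blob_service.get_blob_to_text(container_name, blob_name).content
sample_text = (
    "Worked at Acme from March 2015\n"
    "Skills: Python, SQL\n"
    "Graduated June 2012\n"
)
years = ["2012", "2015"]    # hypothetical values
months = ["March", "June"]  # hypothetical values

selected = matching_lines(sample_text, years, months)
for line in selected:
    print(line)
```

If pandas is installed, the resulting list can probably be turned into a DataFrame with something like pandas.DataFrame(selected, columns=["line"]); the column name is only an example.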