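I have written a Python script that automatically builds an Azure VM on KVM and uploads it to Azure, but I have hit a problem I cannot solve. After the VM is built, I try to upload the disk to Azure with the Azure Python module, and the script ends up consuming all available RAM. I have tried writing it several ways, but the result is always the same.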
block_blob_service = BlockBlobService(vars.az_storage_acc_name, vars.az_sto_key)
blob = open(args.pool_path + args.name + "-az"+'.vhd', 'r')
print "Upload {} to Azure Blob service".format(args.name +"-az"+'.vhd')
block_blob_service.create_blob_from_stream(vars.az_cnt, args.name +"-az"+'.vhd', blob)
I also tried the following:
stream = io.open('/path_to_vhd', 'rb')
BlockBlobService.create_blob_from_stream(vars.az_cnt, "test-stream.vhd", stream)
No luck. The blob creation starts every time, but it eventually fails because there is no RAM left.
Is there any way I can get this to work?
Answer 0 (score: 0)
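This requires holding the entire stream in memory. Unless your machine has unlimited RAM, this code will not work, and at some point it will raise an out-of-memory exception.
I suggest uploading the stream in chunks instead of writing it all at once.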
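As a rough illustration of that idea, the sketch below reads a binary stream in fixed-size pieces and hands each piece to a callback, so only one chunk is held in memory at a time. The names `iter_chunks`, `upload_in_chunks`, and `send_chunk` are hypothetical and are not part of the Azure SDK; `send_chunk` stands in for whatever call actually transmits a block.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def upload_in_chunks(stream, chunk_size, send_chunk):
    """Pass each chunk to send_chunk(index, data) and return the total bytes sent.

    send_chunk is a hypothetical callback standing in for the real upload call.
    """
    total = 0
    for index, chunk in enumerate(iter_chunks(stream, chunk_size)):
        send_chunk(index, chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory stand-in for a VHD file opened with open(path, "rb").
    fake_disk = io.BytesIO(b"x" * 10)
    sent = []
    upload_in_chunks(fake_disk, 4, lambda i, data: sent.append((i, len(data))))
    print(sent)
```

Because the generator only reads the next chunk when the previous one has been handed off, peak memory depends on `chunk_size` rather than on the size of the disk image.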
Here is the function used to upload a stream in chunks:
def _upload_blob_chunks(blob_service, container_name, blob_name,
                        blob_size, block_size, stream, max_connections,
                        progress_callback, validate_content, lease_id, uploader_class,
                        maxsize_condition=None, if_modified_since=None, if_unmodified_since=None, if_match=None,
                        if_none_match=None, timeout=None,
                        content_encryption_key=None, initialization_vector=None, resource_properties=None):
    encryptor, padder = _get_blob_encryptor_and_padder(content_encryption_key, initialization_vector,
                                                       uploader_class is not _PageBlobChunkUploader)
    uploader = uploader_class(
        blob_service,
        container_name,
        blob_name,
        blob_size,
        block_size,
        stream,
        max_connections > 1,
        progress_callback,
        validate_content,
        lease_id,
        timeout,
        encryptor,
        padder
    )
    uploader.maxsize_condition = maxsize_condition
    # Access conditions do not work with parallelism
    if max_connections > 1:
        uploader.if_match = uploader.if_none_match = uploader.if_modified_since = uploader.if_unmodified_since = None
    else:
        uploader.if_match = if_match
        uploader.if_none_match = if_none_match
        uploader.if_modified_since = if_modified_since
        uploader.if_unmodified_since = if_unmodified_since
    if progress_callback is not None:
        progress_callback(0, blob_size)
    if max_connections > 1:
        import concurrent.futures
        from threading import BoundedSemaphore
        '''
        Ensures we bound the chunking so we only buffer and submit 'max_connections' amount of work items to the executor.
        This is necessary as the executor queue will keep accepting submitted work items, which results in buffering all the blocks if
        the max_connections + 1 ensures the next chunk is already buffered and ready for when the worker thread is available.
        '''
        chunk_throttler = BoundedSemaphore(max_connections + 1)
        executor = concurrent.futures.ThreadPoolExecutor(max_connections)
        futures = []
        running_futures = []
        # Check for exceptions and fail fast.
        for chunk in uploader.get_chunk_streams():
            for f in running_futures:
                if f.done():
                    if f.exception():
                        raise f.exception()
                    else:
                        running_futures.remove(f)
            chunk_throttler.acquire()
            future = executor.submit(uploader.process_chunk, chunk)
            # Calls callback upon completion (even if the callback was added after the Future task is done).
            future.add_done_callback(lambda x: chunk_throttler.release())
            futures.append(future)
            running_futures.append(future)
        # result() will wait until completion and also raise any exceptions that may have been set.
        range_ids = [f.result() for f in futures]
    else:
        range_ids = [uploader.process_chunk(result) for result in uploader.get_chunk_streams()]
    if resource_properties:
        resource_properties.last_modified = uploader.last_modified
        resource_properties.etag = uploader.etag
    return range_ids
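The throttling idea in the parallel branch can also be looked at in isolation. The following is a minimal sketch under stated assumptions, not the SDK's code: a `BoundedSemaphore` limits how many chunks are read and queued ahead of the worker threads, so roughly `max_workers + 1` chunks are buffered at any moment regardless of the input size. `process_chunks_bounded` and `handle_chunk` are hypothetical names; `handle_chunk` stands in for the per-chunk upload call.

```python
import concurrent.futures
import io
from threading import BoundedSemaphore


def process_chunks_bounded(stream, chunk_size, handle_chunk, max_workers=2):
    """Run handle_chunk(data) on each chunk while bounding how many are buffered.

    handle_chunk is a hypothetical stand-in for the per-chunk upload call.
    Results are returned in chunk order.
    """
    # One extra permit lets the next chunk be read while all workers are busy.
    throttle = BoundedSemaphore(max_workers + 1)
    futures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers) as executor:
        while True:
            # Wait for a free slot before reading, so chunks are not read ahead without limit.
            throttle.acquire()
            chunk = stream.read(chunk_size)
            if not chunk:
                throttle.release()
                break
            future = executor.submit(handle_chunk, chunk)
            # Free the slot when the work item finishes, whether it succeeded or failed.
            future.add_done_callback(lambda _f: throttle.release())
            futures.append(future)
        # result() waits for completion and re-raises any exception from a worker.
        return [f.result() for f in futures]


if __name__ == "__main__":
    # In-memory stand-in for a large file; len() plays the role of the upload call.
    print(process_chunks_bounded(io.BytesIO(b"x" * 10), 4, len))
```

Without the semaphore, the executor's queue would accept every submitted chunk immediately, and the whole file could end up buffered in memory while the workers catch up.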
For reference, you can look through the thread below.
There is also a similar thread covering the same kind of request:
how to transfer file to azure blob storage in chunks without writing to file using python
Alternatively, you can use PowerShell to upload the VHD to the VM's storage account, as shown here:
$rgName = "myResourceGroup"
$urlOfUploadedImageVhd = "https://mystorageaccount.blob.core.windows.net/mycontainer/myUploadedVHD.vhd"
Add-AzVhd -ResourceGroupName $rgName -Destination $urlOfUploadedImageVhd `
-LocalFilePath "C:\Users\Public\Documents\Virtual hard disks\myVHD.vhd"
Here is the reference for that approach:
https://docs.microsoft.com/en-us/azure/virtual-machines/windows/upload-generalized-managed
Hope this helps.
Answer 1 (score: 0)
Thank you for your input.
What I don't understand is, in the end, what is the difference between
block_blob_service.create_blob_from_stream
and
block_blob_service.create_blob_from_path
if both try to hold everything in RAM?