I have an h5 file on my C drive. I cannot upload the dataset to Azure because it is 468 MB. How can I read it from the code itself? Without Azure, on my local machine with Jupyter Notebook installed, I can access it using the following code:
with h5py.File('SVHN_single_grey1.h5', 'r') as hdf:
This does not work in Azure because it cannot access the local file on my machine.
Answer:
If the H5 file can be accessed directly from the Internet via a URL, you can try reading it in your Azure Notebook with the code below.
import requests
from io import BytesIO
import h5py
r = requests.get("<a URL for accessing your H5 file, such as https://host:port/.../SVHN_single_grey1.h5>")
f = BytesIO(r.content)
with h5py.File(f, 'r') as hdf:
    ...
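Note that r.content pulls the whole ~468 MB file into memory before h5py ever sees it. If that is a concern, here is a minimal sketch, assuming the same placeholder URL, that streams the download to a temporary file on the notebook machine and opens that file instead:
import tempfile
import requests
import h5py
url = "<a URL for accessing your H5 file>"
with requests.get(url, stream=True) as r:
    r.raise_for_status()
    # Write the response to a temporary .h5 file in 1 MB chunks instead of holding it all in RAM.
    with tempfile.NamedTemporaryFile(suffix=".h5", delete=False) as tmp:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            tmp.write(chunk)
        tmp_path = tmp.name
with h5py.File(tmp_path, 'r') as hdf:
    ...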
If not, you will first have to publish the H5 file to some Internet service so that it is reachable as a resource URL, and then consume it with the code above. I recommend using the official Azure tool azcopy to upload it as a blob to Azure Blob Storage; see the official tutorial Tutorial: Migrate on-premises data to cloud storage by using AzCopy for more details. You can then read it back with the sample code below (a Python alternative for the upload step is sketched after that code).
from azure.storage.blob.baseblobservice import BaseBlobService
from azure.storage.blob import BlobPermissions
from datetime import datetime, timedelta
import requests
from io import BytesIO
import h5py
account_name = '<your account name>'
account_key = '<your account key>'
container_name = '<your container name>'
blob_name = '<your blob name>'
blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)
sas_token = blob_service.generate_blob_shared_access_signature(container_name, blob_name, permission=BlobPermissions.READ, expiry=datetime.utcnow() + timedelta(hours=1))
# print(sas_token)
url_with_sas = blob_service.make_blob_url(container_name, blob_name, sas_token=sas_token)
# print(url_with_sas)
r = requests.get(url_with_sas)
f = BytesIO(r.content)
with h5py.File(f, 'r') as hdf:
    ...
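As an aside, if you would rather do the upload step from Python instead of azcopy, here is a minimal sketch using BlockBlobService from the same legacy azure-storage-blob SDK; the account, container, blob, and local file path below are placeholders:
from azure.storage.blob import BlockBlobService
account_name = '<your account name>'
account_key = '<your account key>'
container_name = '<your container name>'
blob_name = '<your blob name>'
blob_service = BlockBlobService(
    account_name=account_name,
    account_key=account_key
)
# Create the container if it does not exist yet, then upload the local H5 file as a block blob.
blob_service.create_container(container_name, fail_on_exist=False)
blob_service.create_blob_from_path(container_name, blob_name, r'C:\<path to>\SVHN_single_grey1.h5')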
Alternatively, here is another sample code that also works.
from azure.storage.blob.baseblobservice import BaseBlobService
from io import BytesIO
import h5py
account_name = '<your account name>'
account_key = '<your account key>'
container_name = '<your container name>'
blob_name = '<your blob name>'
blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)
stream = BytesIO()
blob_service.get_blob_to_stream(container_name, blob_name, stream)
with h5py.File(stream, 'r') as hdf:
    ...
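In all of the snippets above, whatever replaces the ... is just ordinary h5py access; for example (the dataset names X_train and y_train are only assumptions about how SVHN_single_grey1.h5 was written):
with h5py.File(stream, 'r') as hdf:   # or the BytesIO object f from the earlier snippets
    print(list(hdf.keys()))           # inspect which datasets the file actually contains
    X_train = hdf['X_train'][:]       # hypothetical dataset name; [:] reads it into a NumPy array
    y_train = hdf['y_train'][:]       # hypothetical dataset name
print(X_train.shape, y_train.shape)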