How do I create a connection to Object Storage on Bluemix in a Data Science Experience project?

Time: 2017-05-26 02:54:46

Tags: python ibm-cloud object-storage data-science-experience

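I am trying to set up a connection to a Bluemix Object Storage instance that is different from the default one created with the project. This is a problem because:

1) When I go to add a new connection, the Object Storage instance I want to use is not listed under Data Services.

2) When I go to add a Softlayer Object Storage, the credentials I am asked for are (Login URL, Access Key, and Secret Key), but my instance's credentials are ("auth_url", "project", "projectId", "region", "userId", "username", "password", "domainId", "domainName", "role").

3) I have a nice interface to the placeholder Object Storage, but I want to replace it with a different instance.

Please help me access data in a Bluemix Object Storage instance other than the one attached to the project by default.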

4 Answers:

Answer 0: (score: 2)

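In addition to what @Sumit Goyal answered: you need to download the file to local GPFS storage in order to use APIs or libraries that do not support reading from Swift Object Storage, in other words, ones that only support reading from local storage or the file system.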

objStorCred = {
    "auth_url": "https://identity.open.softlayer.com",
    "project": "object_storage_XXXXX",
    "projectId": "XXXXX5a3",
    "region": "dallas",
    "userId": "XXXXXX98a15e0",
    "username": "admin_fXXXXX9",
    "password": "XXXXX",
    "domainId": "aXXXX5a",
    "domainName": "XXXX",
    "role": "admin"
}

from io import StringIO
import requests
import json
import pandas as pd

# @hidden_cell
# This function accesses a file in your Object Storage. It reads your credentials from objStorCred.
# You might want to remove those credentials before you share your notebook.
def get_object_storage_file(container, filename):
    """This function returns the requests response object for
    the file from Bluemix Object Storage."""

    # Request an auth token from the identity service.
    url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
    data = {'auth': {'identity': {'methods': ['password'],
            'password': {'user': {'name': objStorCred['username'],
                                  'domain': {'id': objStorCred['domainId']},
                                  'password': objStorCred['password']}}}}}
    headers1 = {'Content-Type': 'application/json'}
    resp1 = requests.post(url=url1, data=json.dumps(data), headers=headers1)
    resp1_body = resp1.json()
    # Find the public object-store endpoint for the dallas region.
    for e1 in resp1_body['token']['catalog']:
        if e1['type'] == 'object-store':
            for e2 in e1['endpoints']:
                if e2['interface'] == 'public' and e2['region'] == 'dallas':
                    url2 = ''.join([e2['url'], '/', container, '/', filename])
    # Fetch the object using the token returned in the response headers.
    s_subject_token = resp1.headers['x-subject-token']
    headers2 = {'X-Auth-Token': s_subject_token, 'accept': 'application/json'}
    resp2 = requests.get(url=url2, headers=headers2)
    return resp2

Note that instead of getting a StringIO object, we get the response object.

Now you can use intermediate local storage to store the .mat file.

Then call this function:

r = get_object_storage_file("containerr1", "example.mat")

with open('example.mat', 'wb') as file:
    file.write(r.content)

Now read the file using h5py. You may need to install it first with pip install h5py.

import h5py
f = h5py.File('example.mat', 'r')
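One caveat worth checking before the h5py step: as far as I know, h5py can only open MAT-files saved in MATLAB's v7.3 format (which is HDF5-based), while older MAT-files need a different reader such as scipy.io.loadmat. Below is a minimal sketch that peeks at the text header to tell the two apart. The helper name mat_file_flavor is made up for illustration, and it assumes the file starts with the standard MAT-file descriptive header.

```python
def mat_file_flavor(path):
    """Guess which reader a MAT-file needs from its text header.

    Returns 'hdf5' for v7.3 files (candidates for h5py), 'v5' for the
    older binary format (candidates for scipy.io.loadmat), and
    'unknown' for anything else.
    """
    with open(path, 'rb') as fh:
        header = fh.read(19)  # length of "MATLAB 7.3 MAT-file"
    if header.startswith(b'MATLAB 7.3 MAT-file'):
        return 'hdf5'
    if header.startswith(b'MATLAB 5.0 MAT-file'):
        return 'v5'
    return 'unknown'
```

If this returns 'v5' for example.mat, scipy.io.loadmat('example.mat') is the usual alternative to h5py.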

Thanks, Charles.

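Answer 1: (score: 1)

You can use the function generated by the "Insert to code" feature and plug in the credentials of the other Object Storage. For example: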

from io import StringIO
import requests
import json
import pandas as pd

# @hidden_cell
# This function accesses a file in your Object Storage. The definition contains your credentials.
# You might want to remove those credentials before you share your notebook.
def get_object_storage_file_with_credentials(container, filename):
    """This functions returns a StringIO object containing
    the file content from Bluemix Object Storage."""

    url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
    data = {'auth': {'identity': {'methods': ['password'],
            'password': {'user': {'name': 'admin_xxxx','domain': {'id': 'xxxxxxxxxxx'},
            'password': 'xxxxxxxxxx'}}}}}
    headers1 = {'Content-Type': 'application/json'}
    resp1 = requests.post(url=url1, data=json.dumps(data), headers=headers1)
    resp1_body = resp1.json()
    for e1 in resp1_body['token']['catalog']:
        if(e1['type']=='object-store'):
            for e2 in e1['endpoints']:
                if(e2['interface']=='public'and e2['region']=='dallas'):
                    url2 = ''.join([e2['url'],'/', container, '/', filename])
    s_subject_token = resp1.headers['x-subject-token']
    headers2 = {'X-Auth-Token': s_subject_token, 'accept': 'application/json'}
    resp2 = requests.get(url=url2, headers=headers2)
    return StringIO(resp2.text)

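Here, replace the values of the user name, domain id, and password with the ones from the other Bluemix Object Storage credentials. After that, you can access a file in a container of that Object Storage simply with: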

cars_df = pd.read_csv(get_object_storage_file_with_credentials('<containerName>', '<filename>.csv'))
cars_df.head()
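If you would rather not hard-code the values, the two steps that depend on the credentials (building the auth request body and picking the endpoint for your region) can be pulled out into small helpers. This is only a sketch: the helper names build_auth_payload and find_object_store_url are made up for illustration, and the dictionary keys (username, domainId, password, region) are assumed to match the credentials JSON that Bluemix shows for the instance.

```python
def build_auth_payload(creds):
    """Build the password-auth request body from a credentials dict.

    Assumes `creds` has the keys 'username', 'domainId' and 'password'.
    """
    return {'auth': {'identity': {
        'methods': ['password'],
        'password': {'user': {
            'name': creds['username'],
            'domain': {'id': creds['domainId']},
            'password': creds['password'],
        }},
    }}}


def find_object_store_url(catalog, region, interface='public'):
    """Return the object-store endpoint URL for `region` from a token catalog.

    Raises LookupError instead of silently leaving the URL undefined
    when no matching endpoint exists.
    """
    for service in catalog:
        if service.get('type') != 'object-store':
            continue
        for endpoint in service.get('endpoints', []):
            if (endpoint.get('interface') == interface
                    and endpoint.get('region') == region):
                return endpoint['url']
    raise LookupError('no %s object-store endpoint for region %r'
                      % (interface, region))
```

In the function above, data would then become build_auth_payload(creds), and the nested loops would become url2 = '/'.join([find_object_store_url(resp1_body['token']['catalog'], creds['region']), container, filename]).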

Answer 2: (score: 1)

I strongly recommend taking a look at https://github.com/ibm-cds-labs/ibmos2spark (available for Python, R, and Scala).

For Python + SoftLayer credentials, it is specifically this code:

  

slos = ibmos2spark.softlayer(sc, configuration_name, auth_url, tenant, username, password)
data = sc.textFile(slos.url(container_name, object_name))

(Taken from https://github.com/ibm-cds-labs/ibmos2spark/tree/master/python#softlayer)

The next problem is how to load the .mat files. This seems doable by following "Read .mat files in Python" and using sc.binaryFiles() to get them into memory first.
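To make the "into memory first" idea concrete, here is a rough sketch. It assumes that collecting the RDD from sc.binaryFiles() gives (path, bytes) pairs, and wraps each payload in an in-memory file object that file-based readers can consume. The helper name as_file_objects is made up for illustration, and the commented usage lines are assumptions about a notebook where sc and slos already exist.

```python
import io
import posixpath


def as_file_objects(pairs):
    """Map each object's base name to an in-memory binary file object.

    `pairs` is an iterable of (path, bytes) tuples, the shape assumed
    for the collected result of sc.binaryFiles().
    """
    return {posixpath.basename(path): io.BytesIO(content)
            for path, content in pairs}


# Assumed usage in a notebook where `sc` and `slos` already exist:
# pairs = sc.binaryFiles(slos.url(container_name, object_name)).collect()
# files = as_file_objects(pairs)
# mat = scipy.io.loadmat(files['example.mat'])  # needs scipy installed
```

Collecting pulls every matched object onto the driver, so this only makes sense for files that fit comfortably in memory.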

Answer 3: (score: 0)

In R:

## Credentials and libraries to write to object storage

## Install necessary library
devtools::install_github('IBMDataScience/objectStoreR')
library('objectStoreR')  

## Provide Credentials (fill in with your details from Bluemix)
credentials <-list(auth_url = "https://identity.open.softlayer.com",
         project = "object_storage_d7a568f8_ac53_4bc4_8834_f0e9962068f9",
         project_id = "e0c826f12030487493z2df3957621744",
         region = "dallas",
         user_id = "694102a676ef4252u19492c45fbebc4b",
         domain_id = "47ea410d2b51478d9f119fade708fbefe4",
         domain_name =  "1004827",
         username = "admin_9c5c874ed726b5a41c7bb4f8b55f45e3e2c35778",
         password = "Tj^d9rZoDhy5eb]U",
         container = "mycontainer", 
         filename = "myfile.csv")

Write the file to Object Storage:

## Status '201' is a successful signal
write.csv(outputDF,'myOutputFile.csv', row.names = F)
status <- objectStore.put(credentials,'myOutputFile.csv')
paste("Status for final output CSV:", status, sep = " ")

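Similarly, to save a model object (note that you must either change the filename in the credentials list or create a second credentials variable):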

saveRDS(object = finalMod, file = "myModel.rds")
status <- objectStore.put(credentials, "myModel.rds")
paste("Status for model object:", status, sep = " ")

Hope this helps!