I get this error every time I run the setup step for my Apache Spark data analysis.
def set_hadoop_config(credentials):
    # Register the Object Storage (Swift) credentials with Spark's
    # Hadoop configuration under the fs.swift.service.<name>.* keys.
    prefix = "fs.swift.service." + credentials['name']
    hconf = sc._jsc.hadoopConfiguration()
    hconf.set(prefix + ".auth.url", credentials['auth_url'] + '/v3/auth/tokens')
    hconf.set(prefix + ".auth.endpoint.prefix", "endpoints")
    hconf.set(prefix + ".tenant", credentials['project_id'])
    hconf.set(prefix + ".username", credentials['user_id'])
    hconf.set(prefix + ".password", credentials['password'])
    hconf.setInt(prefix + ".http.port", 8080)
    hconf.set(prefix + ".region", credentials['region'])
    hconf.setBoolean(prefix + ".public", True)
credentials['name'] = 'keystone'
set_hadoop_config(credentials)
It stops with this traceback:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-976c35e1d85e> in <module>()
----> 1 credentials['name'] = 'keystone'
2 set_hadoop_config(credentials)
NameError: name 'credentials' is not defined
The NameError means Python has never seen the name credentials in this session, i.e. the dictionary was never created before that first line ran. Does anyone know how to fix this? I'm stuck.
Answer 0 (score: 1)
I think you are missing the credentials dictionary; that is, you need to pass in the parameter values for accessing the Object Storage service, like this:
credentials = {
    'auth_uri': '',
    'global_account_auth_uri': '',
    'username': 'admin_b055482b7febbd287d9020d65cdd55f5653d0ffb',
    'password': 'XXXXXX',
    'auth_url': 'https://identity.open.softlayer.com',
    'project': 'object_storage_e5e45537_ea14_4d15_b90a_5fdd271ea402',
    'project_id': '7d7e5f2a83fe47e586b91f459d47169f',
    'region': 'dallas',
    'user_id': '001c394e06d74b86a76a786615e358e2',
    'domain_id': '2df6373c549e49f8973fb6d22ab18c1a',
    'domain_name': '639347',
    'filename': '2015_SQL.csv',
    'container': 'notebooks',
    'tenantId': 's322-e1e9acad6196b9-a1259eb961e2'
}
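With the dictionary defined in the session before it is referenced, the two lines from the question run without the NameError. A minimal sketch of the sequence (the values above are placeholders for your own service credentials):

# The dictionary must already exist at this point; 'name' is the Swift
# service name that set_hadoop_config() uses to build the key prefix.
credentials['name'] = 'keystone'
set_hadoop_config(credentials)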
If you are working in a notebook, you can use "Insert to Code" for the file listed under the Data Source panel (on the right) to generate this dictionary for you.
To access the file, you need a Swift URI, like this:
raw_data = sc.textFile("swift://" + credentials['container'] + "." + credentials['name'] + "/" + credentials['filename'])
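As a quick sanity check (a hypothetical follow-up, not part of the original answer), you can materialize a few records to confirm that the Swift credentials, container, and filename all resolve:

# Hypothetical verification: pull the first few CSV lines. RDDs are
# lazy, so this is the point where a bad credential or URI would
# actually surface as an error.
for line in raw_data.take(5):
    print(line)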