I get this error every time I run the setup step for my Apache Spark data analysis.
def set_hadoop_config(credentials):
    # Register the Object Storage (Swift) credentials with Spark's
    # Hadoop configuration under the fs.swift.service.<name>.* keys.
    prefix = "fs.swift.service." + credentials['name']
    hconf = sc._jsc.hadoopConfiguration()
    hconf.set(prefix + ".auth.url", credentials['auth_url'] + '/v3/auth/tokens')
    hconf.set(prefix + ".auth.endpoint.prefix", "endpoints")
    hconf.set(prefix + ".tenant", credentials['project_id'])
    hconf.set(prefix + ".username", credentials['user_id'])
    hconf.set(prefix + ".password", credentials['password'])
    hconf.setInt(prefix + ".http.port", 8080)
    hconf.set(prefix + ".region", credentials['region'])
    hconf.setBoolean(prefix + ".public", True)
credentials['name'] = 'keystone'
set_hadoop_config(credentials)
It stops with this traceback:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-976c35e1d85e> in <module>()
----> 1 credentials['name'] = 'keystone'
2 set_hadoop_config(credentials)
NameError: name 'credentials' is not defined
The NameError means Python has never seen the name credentials in this session, i.e. the dictionary was never created before that first line ran. Does anyone know how to fix this? I'm stuck.
Answer 0 (score: 1)
I think you are missing the credentials dictionary; that is, you need to pass in the parameter values for accessing the Object Storage service, like this:
credentials = {
    'auth_uri': '',
    'global_account_auth_uri': '',
    'username': 'admin_b055482b7febbd287d9020d65cdd55f5653d0ffb',
    'password': 'XXXXXX',
    'auth_url': 'https://identity.open.softlayer.com',
    'project': 'object_storage_e5e45537_ea14_4d15_b90a_5fdd271ea402',
    'project_id': '7d7e5f2a83fe47e586b91f459d47169f',
    'region': 'dallas',
    'user_id': '001c394e06d74b86a76a786615e358e2',
    'domain_id': '2df6373c549e49f8973fb6d22ab18c1a',
    'domain_name': '639347',
    'filename': '2015_SQL.csv',
    'container': 'notebooks',
    'tenantId': 's322-e1e9acad6196b9-a1259eb961e2'
}
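With the dictionary defined in the session before it is referenced, the two lines from the question run without the NameError. A minimal sketch of the sequence (the values above are placeholders for your own service credentials):

# The dictionary must already exist at this point; 'name' is the Swift
# service name that set_hadoop_config() uses to build the key prefix.
credentials['name'] = 'keystone'
set_hadoop_config(credentials)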
If you are working in a notebook, you can use "Insert to Code" for the file listed under the Data Source panel (on the right) to generate this dictionary for you.
To access the file, you need a Swift URI, like this:
raw_data = sc.textFile("swift://" + credentials['container'] + "." + credentials['name'] + "/" + credentials['filename'])
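As a quick sanity check (a hypothetical follow-up, not part of the original answer), you can materialize a few records to confirm that the Swift credentials, container, and filename all resolve:

# Hypothetical verification: pull the first few CSV lines. RDDs are
# lazy, so this is the point where a bad credential or URI would
# actually surface as an error.
for line in raw_data.take(5):
    print(line)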