How do I connect from a dockerized Jupyter notebook to a dockerized HDFS?

Time: 2017-12-27 17:53:55

Tags: docker jupyter

I started hadoop-spark-workbench locally and edited its compose file:

vi docker-compose.yml

To expose ports 8020 and 9000, I added the following port mappings:

volumes:
  - ./data/namenode:/hadoop/dfs/name
environment:
  - CLUSTER_NAME=test
env_file:
  - ./hadoop.env
ports:
  - 50070:50070
  - 8020:8020
  - 9000:9000

Then I started the cluster with:

docker-compose up

Everything came up.

Then I started the Jupyter all-spark-notebook locally:

docker run -it --rm -p 8888:8888 jupyter/all-spark-notebook

I am trying this Scala notebook:

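import org.apache.spark.{SparkContext, SparkConf}
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import java.net.URI
import org.apache.hadoop.conf.Configuration

// Default Hadoop client configuration (no cluster config files loaded here)
val configuration = new Configuration()
// Open an HDFS client against the NameNode RPC port published above
val fs = FileSystem.get(new URI("hdfs://localhost:8020"), configuration)
// List the HDFS root directory and print each entry's path
val status = fs.listStatus(new Path("hdfs://localhost:8020/"))
status.foreach(x => println(x.getPath))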

I get the following error:

Name: java.net.ConnectException
Message: Call From 47d409a146ee/172.17.0.2 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
StackTrace:   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
  at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
  at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
  at org.apache.hadoop.ipc.Client.call(Client.java:1479)
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
  at com.sun.proxy.$Proxy20.getListing(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:573)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
  at com.sun.proxy.$Proxy21.getListing(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2086)
  at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2069)
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:791)
  at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106)
  at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853)
  at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860)
  ... 40 elided
Caused by: java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
  at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
  at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
  at org.apache.hadoop.ipc.Client.call(Client.java:1451)

Note that telnet shows:

$ telnet localhost 8020
Trying ::1...
Connected to localhost.
Escape character is '^]'.
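
The telnet check can be repeated from inside the notebook to compare the two vantage points. The snippet below is a minimal sketch using only JDK classes; the object name PortProbe and the 2000 ms default timeout are choices made for this illustration.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

object PortProbe {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  // Refused connections, timeouts and unresolvable hosts all yield false.
  def canConnect(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: IOException => false
    } finally {
      socket.close()
    }
  }
}

// In a notebook cell: PortProbe.canConnect("localhost", 8020)
```

If this returns false in the notebook while telnet connects in the shell, then the port is published correctly and the two are simply resolving localhost to different machines.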

How do I correctly connect this Jupyter notebook to the HDFS of my local Hadoop cluster?

0 answers:

No answers