Question

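I was able to create a single-node Hadoop cluster on AWS EC2. I saved a file (sample.txt) there and am trying to access it remotely from my application. However, the connection is refused.

I can view the file at http://ec2-...:50070, and everything seems to be running fine.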

The EC2 security group is set to:

Type        | Protocol | Port Range | Destination/Source
All traffic | All      | All        | 0.0.0.0/0

JPS

13168 SecondaryNameNode
12977 DataNode
12821 NameNode
13449 NodeManager
14393 Jps
4585 JobHistoryServer
13322 ResourceManager

core-site.xml

<property>
   <name>fs.defaultFS</name>
   <value>hdfs://localhost:8020</value>
</property>

hdfs-site.xml

<property>
   <name>dfs.replication</name>
   <value>1</value>
</property>

<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:///usr/local/hadoop/.../namenode</value>
</property>

<property>
   <name>dfs.datanode.data.dir</name>
   <value>file:///usr/local/hadoop/.../datanode</value>
</property>

I am trying to load the HDFS file from AWS and do some processing with it.

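import org.apache.spark.sql.SparkSession

// NameNode RPC address on the EC2 instance, plus the path of the file in HDFS
val hdfsLocation = "hdfs://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:8020/tmp/sample.txt"

// Spark runs locally on the client machine; only the file lives on EC2
val spark = SparkSession
  .builder()
  .master("local[*]")
  .appName("sample")
  .getOrCreate()

// Read the remote file as a Dataset[String]; this is the call that fails
val sample = spark.read.textFile(hdfsLocation)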

Error

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Exception in thread "main" java.net.ConnectException: Call From sample-linux/127.0.1.1 to ec2-***.compute-1.amazonaws.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1474)
    at org.apache.hadoop.ipc.Client.call(Client.java:1401)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:359)
    at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:348)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:348)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:623)
    at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:657)
    at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:632)
    at aws1$.main(aws1.scala:22)
    at aws1.main(aws1.scala)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
    at org.apache.hadoop.ipc.Client.call(Client.java:1440)
    ... 31 more
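
Since the web UI on port 50070 is reachable but the RPC call to port 8020 is refused, it may help to test port 8020 on its own, outside Spark and the Hadoop client. Below is a minimal sketch of such a check using only the JDK socket API. The names `PortCheck` and `canConnect` are hypothetical helpers introduced for illustration; they are not part of Hadoop or Spark.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper (not a Hadoop or Spark API): plain TCP reachability test.
object PortCheck {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      // Any failure (connection refused, timeout, unknown host) yields false.
      Try(socket.connect(new InetSocketAddress(host, port), timeoutMs)).isSuccess
    } finally {
      socket.close()
    }
  }
}
```

For example, `PortCheck.canConnect("ec2-xx-xx-xx-xx.compute-1.amazonaws.com", 8020)` run from the client machine. If this returns false while port 50070 returns true, one possible explanation (an assumption, not confirmed here) is that the NameNode RPC server listens only on the loopback interface because fs.defaultFS is set to hdfs://localhost:8020.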

Please let me know if I have left out any other information. This is my first time using Hadoop and AWS. I appreciate any help.

Remote connection to HDFS on AWS - connection refused

0 answers: