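I am running Spark code in Eclipse Scala IDE on my local machine. I have created an EMR Spark cluster on AWS, and I need to connect to that remote Spark cluster so that I can run my application from within the IDE itself. I have already run this code by building a jar and submitting it with spark-submit, but now I want to run the application from my IDE. Basically, my code reads data from AWS S3.
Here is my code: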
package com.S3connection

import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import com.amazonaws._
import com.amazonaws.auth._
import com.amazonaws.services.s3.model.GetObjectRequest
import java.io.File
import org.apache.spark.sql.functions._

object sparktest {
  def main(args: Array[String]): Unit = {
    // Point the application at the standalone master on the EC2 master node
    val sparkConf = new SparkConf()
      .setMaster("spark://ec2-22-104-209-46.compute-1.amazonaws.com:7077")
      .setAppName("S3")
      .set("spark.driver.host", "174.61.47.360")
    val sc = new SparkContext(sparkConf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    //val yourAWSCredentials = new BasicAWSCredentials("AKIAJ6T3YPF5CCFD4D3A", "ayit1dlgbAg2rAdi3zEUm/bQncQuHuNMVcMDD/4D")
    // Credentials used by the Hadoop s3:// filesystem
    sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "ZK111IABDG3DDNBDG4D3A")
    sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "ayit1dlgjbfjdfgbjbdgknQncQuHvhdbcsv12D/6D")

    // Read the JSON file from S3 into a DataFrame and print its first rows
    val student = sqlContext.read.json("s3://Informatica-files/students.json")
    student.show()
  }
}
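I have set the master to the EC2 master node URL, which is also the Spark master, and set spark.driver.host to my master node's IP.
After running the code in Eclipse, it shows the following error.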
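Because the driver runs inside Eclipse on my local machine, the spark.driver.host value is an address the driver itself uses, so it may be worth checking whether a given value actually belongs to the local machine. The following is only a minimal sketch using the JDK, not part of my application; the names DriverHostCheck and isLocalAddress are hypothetical and chosen for illustration.

```scala
import java.net.{InetAddress, NetworkInterface}
import scala.util.Try

object DriverHostCheck {
  // Returns true only if `host` resolves to an address that this machine owns
  // (wildcard, loopback, or an address assigned to a local network interface).
  // Any resolution or socket error is treated as "not local".
  def isLocalAddress(host: String): Boolean =
    Try {
      val addr = InetAddress.getByName(host)
      addr.isAnyLocalAddress ||
      addr.isLoopbackAddress ||
      NetworkInterface.getByInetAddress(addr) != null
    }.getOrElse(false)

  def main(args: Array[String]): Unit = {
    // Pass the value intended for spark.driver.host as the first argument;
    // the loopback address is used here only as a placeholder default.
    val host = args.headOption.getOrElse("127.0.0.1")
    println(s"$host is a local address: ${isLocalAddress(host)}")
  }
}
```

Running it with the value from my SparkConf as the argument would show whether that address is one the local machine could bind a socket to.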
17/06/11 22:31:25 INFO SecurityManager: Changing modify acls groups to:
17/06/11 22:31:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(gbhog); groups with view permissions: Set(); users with modify permissions: Set(gbhog); groups with modify permissions: Set()
17/06/11 22:31:27 WARN Utils: Service 'sparkDriver' could not bind on port 0. Attempting port 1.
[the WARN line above is logged 16 times in total]
17/06/11 22:31:27 ERROR SparkContext: Error initializing SparkContext.
java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Unknown Source)
    at sun.nio.ch.Net.bind(Unknown Source)
    at sun.nio.ch.ServerSocketChannelImpl.bind(Unknown Source)
    at sun.nio.ch.ServerSocketAdaptor.bind(Unknown Source)
    at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
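How can I resolve this error and run my program from Eclipse against the remote EC2 Spark cluster? Is any network configuration also required to access remote Spark on AWS from my IDE?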