Java Spark application runs on YARN but hangs at HBase bulkPut

Asked: 2017-08-05 04:05:31

Tags: java apache-spark hbase

I have tested my application locally (master set to local) in IntelliJ IDEA, and everything works fine. However, when I use spark-submit with the master set to yarn, my application hangs in HBase's bulkPut method. Here are my code and the configuration files hbase-site.xml and spark-env.sh.

import java.util.Properties;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

System.setProperty("HADOOP_USER_NAME", "user");
SparkConf sparkConf = new SparkConf().setAppName("appname").setMaster("yarn");
sparkConf.set("spark.yarn.dist.files", "/opt/hadoop-2.8.0/etc/hadoop/yarn-site.xml");
// sparkConf.set("spark.yarn.jars", "hdfs://localhost:9000/user/user/jars/*.jar");
sparkConf.set("spark.yarn.archive", "hdfs://localhost:9000/user/user/jars");
sparkConf.setSparkHome("/opt/spark");
sparkConf.set("spark.shuffle.blockTransferService", "nio");
sparkConf.set("spark.executor.instances", "30");
sparkConf.set("spark.executor.cores", "3");
sparkConf.set("spark.executor.memory", "5G");
sparkConf.set("spark.driver.memory", "3G");
sparkConf.set("spark.driver.maxResultSize", "10G");
JavaSparkContext jsc = new JavaSparkContext(sparkConf);

// JDBC connection properties for reading from PostgreSQL
Properties connectionProperties = new Properties();
connectionProperties.put("user", "postgres");
connectionProperties.put("password", "password");
connectionProperties.put("driver", "org.postgresql.Driver");
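
The hbaseContext used below is not shown in the snippet above; a minimal sketch of how a JavaHBaseContext from the hbase-spark module is typically constructed (an assumption, since the original code omits this step):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;

// Sketch (not in the original question): build the HBase client configuration,
// which picks up hbase-site.xml from the classpath, and wrap the Spark context.
Configuration hbaseConf = HBaseConfiguration.create();
JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, hbaseConf);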

HBase operations

// list is a List<String>; each element has the format "rowkey,cf,column,value"
JavaRDD<String> rdd = jsc.parallelize(list);
System.out.println(rdd.toString());
hbaseContext.bulkPut(rdd,
                     TableName.valueOf(tableName),
                     new JavaHBaseBulkPutExample.PutFunction());
// this is where the application hangs
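
For context, the PutFunction referenced above comes from the hbase-spark example class JavaHBaseBulkPutExample; it roughly does the following (a sketch, parsing each "rowkey,cf,column,value" string into a single-cell Put):

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.function.Function;

// Sketch of the example's PutFunction: split each CSV-formatted string
// into rowkey, column family, qualifier, and value, and build a Put.
public static class PutFunction implements Function<String, Put> {
    @Override
    public Put call(String v) throws Exception {
        String[] cells = v.split(",");
        Put put = new Put(Bytes.toBytes(cells[0]));
        put.addColumn(Bytes.toBytes(cells[1]), Bytes.toBytes(cells[2]),
                      Bytes.toBytes(cells[3]));
        return put;
    }
}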

hbase-site.xml

<configuration>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>127.0.0.1</value>
    </property>
    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>
</configuration>
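
As a sanity check of the ZooKeeper settings above, a minimal sketch (not part of the original question) that opens a plain HBase client connection with the same quorum and client port, and confirms the target table exists; tableName is the variable from the earlier snippet:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Sketch: connect using the same ZooKeeper settings as hbase-site.xml
// and check that the target table is visible to this client.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "127.0.0.1");
conf.set("hbase.zookeeper.property.clientPort", "2181");
try (Connection connection = ConnectionFactory.createConnection(conf);
     Admin admin = connection.getAdmin()) {
    System.out.println("table exists: " + admin.tableExists(TableName.valueOf(tableName)));
}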

spark-env.sh

export SPARK_MASTER_IP=127.0.0.1 
export SPARK_MASTER_PORT=7077 
export SPARK_LOCAL_IP=127.0.0.1
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native

0 Answers:

No answers yet.