I have written a Spark batch job, and I call stop() on the Spark context as soon as the batch finishes. I then ran this batch on YARN (on AWS) in cluster mode. It runs correctly, but the application does not end on its own; I have to kill it explicitly from the YARN UI. I expected that, since sparkContext.stop() has been called and the context has shut down, the application would finish, but it does not.
Here is the code that the submitted Spark application executes -
JavaSparkContext sc = new JavaSparkContext(sparkConf);

// Read the source HBase table and key each row by the entity's row key
sc.newAPIHadoopRDD(inputConfiguration, TableInputFormat.class,
                   ImmutableBytesWritable.class, Result.class)
  .mapToPair(t -> {
      HbaseEntity entity = new HbaseEntity(t._2(), toTableName, columnFamily);
      return new Tuple2<>(entity.getKey(), entity);
  })
  .reduceByKey(hbaseEntityReducer)               // merge entities that share the same key
  .mapToPair(t -> t._2().createPutTuple())       // convert each entity into a (key, Put) pair
  .saveAsHadoopDataset(outJobConf);              // write the result to the target HBase table

// Stop the context as soon as the batch is done
sc.stop();
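For completeness, the configuration objects referenced above are built roughly along these lines; the table names and settings shown here are placeholders rather than my exact values:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.hadoop.mapred.JobConf;

// Input side: scan the source table through the new-API TableInputFormat
Configuration inputConfiguration = HBaseConfiguration.create();
inputConfiguration.set(TableInputFormat.INPUT_TABLE, "source_table");   // placeholder table name

// Output side: write the generated Puts through the old-API TableOutputFormat
JobConf outJobConf = new JobConf(HBaseConfiguration.create());
outJobConf.setOutputFormat(org.apache.hadoop.hbase.mapred.TableOutputFormat.class);
outJobConf.set(org.apache.hadoop.hbase.mapred.TableOutputFormat.OUTPUT_TABLE, "target_table"); // placeholder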
I cannot understand why the YARN application does not end or finish. Can anyone help me with this? Please also tell me what I should change so that the application ends once stop() has been called.
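In case it helps, here is a simplified sketch of how I could restructure the driver so that stop() is guaranteed to run and main() returns immediately afterwards; the class name and the runBatch helper are placeholders standing in for my actual code:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class HBaseBatchJob {                                  // placeholder class name

    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf();                // app name etc. supplied at submit time
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        try {
            runBatch(sc);                                     // the pipeline shown above
        } finally {
            sc.stop();                                        // always stop, even if the job throws
        }
        // main() returns here; I expected the YARN application to move to FINISHED at this point
    }

    private static void runBatch(JavaSparkContext sc) {
        // newAPIHadoopRDD -> reduceByKey -> saveAsHadoopDataset pipeline, omitted for brevity
    }
}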