The input data format is as follows:
public class NotesFragment extends Fragment {
    private InputPassingInterface inpPassInterface;

    public static NotesFragment newInstance() {
        return new NotesFragment();
    }

    @Override
    public void onAttach(Context context) {
        super.onAttach(context);
        // The host Activity must implement InputPassingInterface.
        this.inpPassInterface = (InputPassingInterface) context;
    }

    @Override
    public View onCreateView(LayoutInflater inflater,
                             ViewGroup container,
                             Bundle savedInstanceState) {
        return inflater.inflate(R.layout.fragment_notes, container, false);
    }

    @Override
    public void onViewCreated(View view, @Nullable Bundle b) {
        super.onViewCreated(view, b);
        // Instead of onCreateView,
        // do all of your view updates from this method
        // for the sake of efficiency.
        // ... all your view initialization code goes here
        // (btDone and etNotes are assumed to be initialized at this point)
        btDone.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                if (inpPassInterface != null) {
                    inpPassInterface.passInput(
                        etNotes.getText().toString()
                    );
                }
            }
        });
    }
}
The output format is as follows:
+-------------+-------+-------+
| StudentID   | Right | Wrong |
+-------------+-------+-------+
| studentNo01 | a,b,c | x,y,z |
+-------------+-------+-------+
| studentNo02 | c,d   | v,w   |
+-------------+-------+-------+
Right means 1, Wrong means 0.
I want to process this data with a Spark map function or a UDF, but I don't know how to go about it. Can you help me? Thanks.
Answer 0 (score: 3)
Use split and explode twice, once per column, and union the two results:
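import org.apache.spark.sql.functions._
// toDF and the 'column syntax also need the session implicits,
// e.g. import spark.implicits._ (already in scope in spark-shell)

val df = List(
  ("studentNo01","a,b,c","x,y,z"),
  ("studentNo02","c,d","v,w")
).toDF("StudentID","Right","Wrong")
df.show()
+-----------+-----+-----+
|  StudentID|Right|Wrong|
+-----------+-----+-----+
|studentNo01|a,b,c|x,y,z|
|studentNo02|  c,d|  v,w|
+-----------+-----+-----+
// explode names its output column "col" by default
val pair = (
  df.select('StudentID, explode(split('Right, ",")))
    .select(concat_ws(",", 'StudentID, 'col).as("key"))
    .withColumn("value", lit(1))   // items from Right are tagged 1
).unionAll(
  df.select('StudentID, explode(split('Wrong, ",")))
    .select(concat_ws(",", 'StudentID, 'col).as("key"))
    .withColumn("value", lit(0))   // items from Wrong are tagged 0
)
pair.show()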
+-------------+-----+
| key|value|
+-------------+-----+
|studentNo01,a| 1|
|studentNo01,b| 1|
|studentNo01,c| 1|
|studentNo02,c| 1|
|studentNo02,d| 1|
|studentNo01,x| 0|
|studentNo01,y| 0|
|studentNo01,z| 0|
|studentNo02,v| 0|
|studentNo02,w| 0|
+-------------+-----+
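You can then convert it to an RDD as follows:
// Go through .rdd to get an RDD[Row]; in Spark 2.x and later,
// calling map directly on a DataFrame returns a Dataset instead.
val rdd = pair.rdd.map(r => (r.getString(0), r.getInt(1)))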
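Since the question asks about a map function, here is a minimal sketch of the same per-row logic written as a plain Scala function. The helper name toPairs and the sample rows are hypothetical and not part of the original answer. It uses only Scala collections, so it runs without a Spark session.

```scala
// Hypothetical helper: turns one input row into (key, value) pairs,
// tagging items from Right with 1 and items from Wrong with 0.
def toPairs(studentId: String, right: String, wrong: String): Seq[(String, Int)] = {
  def tag(csv: String, flag: Int): Seq[(String, Int)] =
    csv.split(",").toSeq
      .map(_.trim)
      .filter(_.nonEmpty)                       // skip empty items
      .map(item => (s"$studentId,$item", flag)) // same "id,item" key as concat_ws
  tag(right, 1) ++ tag(wrong, 0)
}

// Sample rows copied from the question's table.
val rows = Seq(
  ("studentNo01", "a,b,c", "x,y,z"),
  ("studentNo02", "c,d", "v,w")
)

// flatMap applies the helper to every row and flattens the results.
val pairs = rows.flatMap { case (id, right, wrong) => toPairs(id, right, wrong) }

pairs.foreach(println)
```

On a DataFrame with the three string columns shown above, the same helper could presumably be passed to df.rdd.flatMap(r => toPairs(r.getString(0), r.getString(1), r.getString(2))). The pairs then come out grouped per student rather than with all the 1s first, as in the union-based version.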