Joining a column inside a struct with a column in another table

Time: 2017-05-05 09:21:05

Tags: scala apache-spark-sql spark-dataframe

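I want to join two tables with spark-sql, where the join key is a column nested inside a struct on one side and an ordinary column on the other side. This is what I tried: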

  val prs = sc.read.json(dir+"/prs.json")
  prs.createGlobalTempView("prs")

  val st = sc.read.json(dir+"/struct.json")
  st.createGlobalTempView("struct")

  prs.join(st,"entclasses.struct").show

Input files: pers:

{"nom":"hamdane","prenom":"mounir", "entclasses":[{"struct":"526","classe":"501","mef":"20114"},{"cle":"5174","classe":"5581","mef":"201012414"}]}
{"nom":"hamdanes","prenom":"mounirs", "entclasses":[{"struct":"52614","classe":"501","mef":"20114"},{"cle":"5174","classe":"5581","mef":"201012414"}]}

STRUCT:

{"cle":"526","df":"125"}
{"cle":"5174","df":"015"}
{"cle":"5581","df":"105"}
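For reference, here is a minimal sketch of one way such a join might be expressed. It rests on several assumptions that are not stated above: that the goal is to match `entclasses.struct` against `cle`, that Spark 2.2 or later is available (for `spark.read.json` on a `Dataset[String]`), and that an inner join is wanted. The names `StructJoinSketch`, `sampleData`, and `joinOnStruct` are hypothetical, and the sample rows are inlined instead of being read from `dir`. Because `entclasses` is an array of structs, the sketch first uses `explode` to produce one row per array element and then joins on the nested field.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object StructJoinSketch {

  // Hypothetical helper: builds the two DataFrames from the sample rows above,
  // inlined as JSON strings so that the sketch does not depend on any file path.
  def sampleData(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._
    val prs = spark.read.json(Seq(
      """{"nom":"hamdane","prenom":"mounir", "entclasses":[{"struct":"526","classe":"501","mef":"20114"},{"cle":"5174","classe":"5581","mef":"201012414"}]}""",
      """{"nom":"hamdanes","prenom":"mounirs", "entclasses":[{"struct":"52614","classe":"501","mef":"20114"},{"cle":"5174","classe":"5581","mef":"201012414"}]}"""
    ).toDS())
    val st = spark.read.json(Seq(
      """{"cle":"526","df":"125"}""",
      """{"cle":"5174","df":"015"}""",
      """{"cle":"5581","df":"105"}"""
    ).toDS())
    (prs, st)
  }

  // Hypothetical helper: explodes the array so each element becomes its own row
  // (column "ec"), then joins the nested field ec.struct against st.cle.
  // Assumption: an inner join is intended, so elements without a match are dropped.
  def joinOnStruct(prs: DataFrame, st: DataFrame): DataFrame = {
    val exploded = prs.withColumn("ec", explode(col("entclasses")))
    exploded
      .join(st, col("ec.struct") === st("cle"))
      .select(
        col("nom"),
        col("prenom"),
        col("ec.struct").as("struct"),
        col("ec.classe").as("classe"),
        col("df")
      )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-join-sketch")
      .getOrCreate()
    val (prs, st) = sampleData(spark)
    joinOnStruct(prs, st).show()
    spark.stop()
  }
}
```

With the sample data, only the array element whose `struct` is "526" has a matching `cle`, so the inner join should keep just that element; switching to a `"left"` join type would keep the unmatched elements with a null `df`.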

Thanks.

0 answers:

No answers