I have joined the data from two tables, and I want to convert it into a complex data type (Map).
// creating data frame of product data file
val proDF = spark.read.format("parquet").load("path")
// creating data frame of Finance data file
val finDF = spark.read.format("parquet").load("path")
// joining both the data frames
val proFinJoinDF = proDF.joinWith(finDF, proDF("col1") === finDF("col1") && proDF("col2") === finDF("col2") && proDF("col3") === finDF("col3"))
// registering the joined data as a temporary view (registerTempTable is deprecated since Spark 2.0)
proFinJoinDF.createOrReplaceTempView("join_data")
val new1 = spark.sql("""select
_1.col1 as A,
_1.col2 as B,...
_2.col1 as P,
_2.col2 as Q,
_2.col3 as R... from join_data""" )
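// casting every column to string so the values fit Map[String, String]
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StringType
val strnew1 = new1.select(new1.columns.map(c => col(c).cast(StringType)): _*)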
// creating case class for Map type
case class Pro_Fin(A: String, B: String, ProMap: Map[String, String], FinMap: Map[String, String])
Now I want to convert the joined data into the case class above (Pro_Fin) and save it to a table.
Expected output:
A B proMap<c:value, d:value …> finMap<P:value, Q:value …>
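One possible way to get from the string-typed DataFrame to `Pro_Fin` is to build each map with the `map` function from `org.apache.spark.sql.functions`, which takes alternating key and value columns, and then call `.as[Pro_Fin]`. The following is only a minimal sketch under assumptions: the sample row, the column groupings `proCols` and `finCols`, the helper `buildProFin`, and the table name `pro_fin_table` are all hypothetical stand-ins, since the full column list is elided in the question.

```scala
import org.apache.spark.sql.{Column, DataFrame, Dataset, Encoders, SparkSession}
import org.apache.spark.sql.functions.{col, lit, map}

// Same case class as in the question; kept at top level so an encoder can be derived.
case class Pro_Fin(A: String, B: String, ProMap: Map[String, String], FinMap: Map[String, String])

object ProFinMapSketch {

  // Builds a map column from alternating (literal column name, column value) pairs.
  private def toMapColumn(names: Seq[String]): Column =
    map(names.flatMap(n => Seq(lit(n), col(n))): _*)

  // Hypothetical helper: keeps A and B as plain columns and folds the
  // remaining product / finance columns into two Map[String, String] columns.
  def buildProFin(df: DataFrame, proCols: Seq[String], finCols: Seq[String]): Dataset[Pro_Fin] =
    df.select(
      col("A"),
      col("B"),
      toMapColumn(proCols).as("ProMap"),
      toMapColumn(finCols).as("FinMap")
    ).as[Pro_Fin](Encoders.product[Pro_Fin])

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pro-fin-map-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for strnew1: every column is already a string.
    val strnew1 = Seq(("a1", "b1", "c1", "d1", "p1", "q1"))
      .toDF("A", "B", "C", "D", "P", "Q")

    // Assumed groupings; replace with the real product and finance columns.
    val proCols = Seq("C", "D")
    val finCols = Seq("P", "Q")

    val result = buildProFin(strnew1, proCols, finCols)
    result.show(false)

    // Hypothetical table name; overwrite mode is an assumption for this sketch.
    result.write.mode("overwrite").saveAsTable("pro_fin_table")

    spark.stop()
  }
}
```

Casting the columns to string beforehand matters here because `map` requires all keys to share one type and all values to share one type. Map keys must not be null, which literal column names guarantee; null values are allowed.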