How do I convert an array of Any elements to a DataFrame in Spark Scala?

Asked: 2019-06-28 06:00:29

Tags: scala apache-spark

I have an array of type Array[(Any, Any, Any)]. For example:

l1 =  [(a,b,c),(d,e,f),(x,y,z)]

I want to convert it to the following DataFrame:

c1    c2    c3
a     b     c
d     e     f
x     y     z

I built the array by collecting an existing DataFrame into a list, then tried to convert it back:

val l1 = test_df.select("c1", "c2", "c3").rdd.map(x => (x(0), x(1), x(2))).collect()
println(l1)
// This is the line that fails: l1 is an Array[(Any, Any, Any)]
val c = Seq(l1).toDF("c1", "c2", "c3")
c.show()

But it throws this error:

Exception in thread "main" java.lang.ClassNotFoundException: scala.Any
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)

1 Answer:

Answer 0: (score: 0)

In PySpark:

l1 =  [('a','b','c'),('d','e','f'),('x','y','z')]
sdf=spark.createDataFrame(l1)
sdf.show()
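
Since the question asks about Scala, here is a rough Scala sketch of the same idea. The likely cause of the error is that toDF needs an encoder for the element type, and Spark cannot derive one for Any. One way around this is to build Row objects and pass an explicit schema to createDataFrame. The object name AnyArrayToDF and the helper toDF below are hypothetical, and the sketch assumes every column can be treated as a string; adjust the schema if the real types are known.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical helper object, not part of the original question or answer.
object AnyArrayToDF {

  // Builds a three-column DataFrame from tuples of Any.
  // Assumption: all values are stored as strings.
  def toDF(spark: SparkSession, data: Array[(Any, Any, Any)]): DataFrame = {
    // Spark cannot infer a schema from Any, so declare it explicitly.
    val schema = StructType(
      Seq("c1", "c2", "c3").map(name => StructField(name, StringType, nullable = true))
    )

    // Convert each value to String (nulls stay null) so it matches the schema.
    def str(v: Any): String = if (v == null) null else v.toString

    val rows = data.toSeq.map { case (a, b, c) => Row(str(a), str(b), str(c)) }
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AnyArrayToDF")
      .master("local[*]") // local mode, for illustration only
      .getOrCreate()

    val l1: Array[(Any, Any, Any)] =
      Array(("a", "b", "c"), ("d", "e", "f"), ("x", "y", "z"))

    toDF(spark, l1).show()
    spark.stop()
  }
}
```

If the data already lives in test_df, it may be simpler to skip collect() altogether and keep working with test_df.select("c1", "c2", "c3"), which is already a DataFrame with the desired columns.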