I have a DataFrame with the schema given below.
House_No = INT
family_details = ["name" , age , "surname" , weight]
Ownership = Boolean
I want to add new columns to the DataFrame for name, age, surname, and weight, so that it has these columns:
House_No
family_details
Ownership
name
age
surname
weight
Answer 0 (score: 0)
The following solution should help:
val data = Array((2,Array("abc","23","xyz","70"),true),(3,Array("lmn","45","pqr","50"),false))
val rdd = sc.parallelize(data)
val df = rdd.toDF("house_no","family_details","ownership")
// family_details is already an array column, so each element can be read by index; no split is needed.
val res = df.select(
  $"house_no",
  $"ownership",
  $"family_details"(0).as("name"),
  $"family_details"(1).as("age"),
  $"family_details"(2).as("surname"),
  $"family_details"(3).as("weight"))
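
The snippet above relies on spark-shell, where sc and the $"..." syntax are already available, and it leaves age and weight as strings. Below is a minimal self-contained sketch of the same idea that also casts those two fields to integers. The local master setting, the app name, and the decision to cast are assumptions for illustration, not part of the original question; the sample rows are the ones from the answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session for experimentation. In spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("family-details-sketch") // hypothetical app name
  .getOrCreate()
import spark.implicits._ // enables toDF on local collections

// Same sample rows as in the answer, built directly as a DataFrame.
val df = Seq(
  (2, Seq("abc", "23", "xyz", "70"), true),
  (3, Seq("lmn", "45", "pqr", "50"), false)
).toDF("house_no", "family_details", "ownership")

// Read each array element by position, rename it, and cast the numeric fields.
// Assumption: age and weight are meant to be integers.
val typed = df.select(
  col("house_no"),
  col("ownership"),
  col("family_details").getItem(0).as("name"),
  col("family_details").getItem(1).cast("int").as("age"),
  col("family_details").getItem(2).as("surname"),
  col("family_details").getItem(3).cast("int").as("weight")
)

typed.printSchema()
typed.show()
```

If family_details should stay in the result, as the column list in the question suggests, adding col("family_details") to the select keeps it alongside the new columns.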