I have two dataframes:
dataframe1
+----------+
|     DATE1|
+----------+
|2017-01-08|
|2017-10-10|
|2017-05-01|
+----------+
dataframe2
+------+----+----------+----------+----+--------+
|  NAME| SID|     DATE1|     DATE2|ROLL|  SCHOOL|
+------+----+----------+----------+----+--------+
| Sayam|22.0|  8/1/2017|  7 1 2017|3223|  BHABHA|
|ADARSH| 2.0|10-10-2017|10.03.2017| 222|SUNSHINE|
| SADIM| 1.0|  1.5.2017|  1/2/2017| 111|     DAV|
+------+----+----------+----------+----+--------+
Expected output:
+------+----+----------+----------+----+--------+
|  NAME| SID|     DATE1|     DATE2|ROLL|  SCHOOL|
+------+----+----------+----------+----+--------+
| Sayam|22.0|2017-01-08|  7 1 2017|3223|  BHABHA|
|ADARSH| 2.0|2017-10-10|10.03.2017| 222|SUNSHINE|
| SADIM| 1.0|2017-05-01|  1/2/2017| 111|     DAV|
+------+----+----------+----------+----+--------+
I want to replace the DATE1 column of dataframe2 with the DATE1 column of dataframe1. I need a generic solution. Any help would be appreciated.
I tried the withColumn method, as shown below:
dataframe2.withColumn(newColumnTransformInfo._1, dataframe1.col("DATE1").cast(DateType))
However, I got this error:
org.apache.spark.sql.AnalysisException: resolved attribute(s)
Answer 0 (score: 2)
You cannot add a column taken from another dataframe.
What you can do is join the two dataframes and keep the columns you need; for that, both dataframes must have a common join column. If there is no common column and the rows are in the same order, you can assign an increasing id to each dataframe and join on that.
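For reference, when a shared key already exists the join itself is a one-liner. This is only a minimal sketch: the names dfA, dfB and the common column "SID" are hypothetical, not from your data.

// Hypothetical case: dfA and dfB both contain a key column "SID".
// Drop the stale DATE1 from dfB, then bring the replacement in from dfA via the join.
val updated = dfB
  .drop("DATE1")
  .join(dfA.select("SID", "DATE1"), Seq("SID"))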
Here is a simple example for your case, where there is no common column:
// Imports needed for the example
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}
import spark.implicits._

// Dummy data
val df1 = Seq(
  ("2017-01-08"),
  ("2017-10-10"),
  ("2017-05-01")
).toDF("DATE1")

val df2 = Seq(
  ("Sayam", 22.0, "2017-01-08", "7 1 2017", 3223, "BHABHA"),
  ("ADARSH", 2.0, "2017-10-10", "10.03.2017", 222, "SUNSHINE"),
  ("SADIM", 1.0, "2017-05-01", "1/2/2017", 111, "DAV")
).toDF("NAME", "SID", "DATE1", "DATE2", "ROLL", "SCHOOL")

// Create a new dataframe1 with an added "id" column
val rows1 = df1.rdd.zipWithIndex().map {
  case (r: Row, id: Long) => Row.fromSeq(id +: r.toSeq)
}
val dataframe1 = spark.createDataFrame(rows1,
  StructType(StructField("id", LongType, false) +: df1.schema.fields))

// Create a new dataframe2 with an added "id" column
val rows2 = df2.rdd.zipWithIndex().map {
  case (r: Row, id: Long) => Row.fromSeq(id +: r.toSeq)
}
val dataframe2 = spark.createDataFrame(rows2,
  StructType(StructField("id", LongType, false) +: df2.schema.fields))

// Drop the old DATE1, join on the generated id, then drop the id
dataframe2.drop("DATE1")
  .join(dataframe1, "id")
  .drop("id")
  .show()
Output:
+------+----+----------+----+--------+----------+
| NAME| SID| DATE2|ROLL| SCHOOL| DATE1|
+------+----+----------+----+--------+----------+
| Sayam|22.0| 7 1 2017|3223| BHABHA|2017-01-08|
|ADARSH| 2.0|10.03.2017| 222|SUNSHINE|2017-10-10|
| SADIM| 1.0| 1/2/2017| 111| DAV|2017-05-01|
+------+----+----------+----+--------+----------+
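Since your original attempt also cast the column to DateType, you can apply the same cast after the join. A minimal follow-up sketch, reusing dataframe1/dataframe2 from the example above; the "yyyy-MM-dd" strings cast cleanly to dates:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DateType

// Same join as above, then cast the replaced DATE1 column to a proper date
val result = dataframe2.drop("DATE1")
  .join(dataframe1, "id")
  .drop("id")
  .withColumn("DATE1", col("DATE1").cast(DateType)) // "2017-01-08" parses as a date
result.printSchema()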
Hope this helps!