My DataFrame contains the columns "CUSTOMER_MAILID", "OFFER_NAME", and "OFFER_ISAPPLIED".
Sample data:
If the "OFFER_NAME" column has a value (that is, it is not null), I want to set the "OFFER_ISAPPLIED" column to "Y".
How can I achieve this?
The output should look like this:
+--------------------+--------------------+---------------+
| CUSTOMER_MAILID| OFFER_NAME|OFFER_ISAPPLIED|
+--------------------+--------------------+---------------+
|pushpendrakaushik...|Jaipur Pink Panth...| N|
|pushpendrakaushik...|Jaipur Pink Panth...| N|
|dr.kshitijmathur@...| | N|
|spdadhichassociat...| | N|
|vinod.gogia@herom...|Jaipur Pink Panth...| N|
|prerak0401@gmail.com| | N|
| garhwalsp@gmail.com| | N|
|muditsharma1985@g...| | N|
| amit1185@gmail.com|Jaipur Pink Panth...| N|
Answer 0 (score: 4)
Use:
from pyspark.sql.functions import col, when
# withColumn returns a new DataFrame, so assign the result back
df = df.withColumn("OFFER_ISAPPLIED",
                   when(col("OFFER_NAME").isNull(), "N").otherwise("Y"))
Answer 1 (score: 0)
This could be a solution:
from pyspark.sql.functions import col, when
# select returns a new DataFrame with the flag column recomputed
df = df.select("CUSTOMER_MAILID", "OFFER_NAME",
               when(col("OFFER_NAME").isNull(), "N").otherwise("Y").alias("OFFER_ISAPPLIED"))