Original DataFrame:
+------+---------+
| name | country |
+------+---------+
| Raju | UAS     |
| Ram  | Pak.    |
| null | China   |
| null | null    |
+------+---------+
I need this:
+------+--------+
| Nwet | wetCon |
+------+--------+
| 0.2  | 0.3    |
| 0.2  | 0.3    |
| 0.0  | 0.3    |
| 0.0  | 0.0    |
+------+--------+
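I want to create a UDF that works for both columns.
Applied to the name column, it checks whether the value is not null: if so it returns 0.2, otherwise it returns 0.0.
The same UDF is applied to the country column: if the value is null it returns 0.0, and if it is not null it returns 0.3.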
Answer 0 (score: 0)
Using StringUtils from Apache Commons Lang:
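import org.apache.commons.lang3.StringUtils
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.DoubleType

// Returns 0.0 when name is null, empty, or whitespace-only; 0.2 otherwise.
val transcodificationName: UserDefinedFunction =
  udf { (name: String) =>
    if (StringUtils.isBlank(name)) 0.0
    else 0.2
  }

// Returns 0.0 when country is null, empty, or whitespace-only; 0.3 otherwise.
val transcodificationCountry: UserDefinedFunction =
  udf { (country: String) =>
    if (StringUtils.isBlank(country)) 0.0
    else 0.3
  }

// cast belongs on the Column, not on the DataFrame returned by withColumn.
dataframe
  .withColumn("Nwet", transcodificationName(col("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationCountry(col("country")).cast(DoubleType))
  .select("Nwet", "wetCon")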
Edit:
import org.apache.spark.sql.functions.lit

// One UDF for both columns: the second argument names the column being scored.
val transcodificationColumns: UserDefinedFunction =
  udf { (input: String, columnName: String) =>
    if (StringUtils.isBlank(input)) 0.0
    else if (columnName == "name") 0.2
    else if (columnName == "country") 0.3
    else 0.0
  }

// A UDF only accepts Columns, so the column name is passed as a literal via lit.
dataframe
  .withColumn("Nwet", transcodificationColumns(col("name"), lit("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationColumns(col("country"), lit("country")).cast(DoubleType))
  .select("Nwet", "wetCon")
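The check that the UDFs wrap can be tried out without a Spark session. The following is only a minimal sketch in plain Scala: weightFor is a hypothetical helper that is not part of the answer above, and it approximates StringUtils.isBlank with trim, which may treat some exotic whitespace characters differently. It is run against the four rows from the question.

```scala
// Hypothetical helper (an assumption, not from the original answer):
// returns 0.0 for a null or blank value and the given weight otherwise.
def weightFor(value: String, weight: Double): Double =
  // Option(null) is None, so forall is true for null as well as for blank strings.
  if (Option(value).forall(_.trim.isEmpty)) 0.0 else weight

// The four (name, country) rows from the question.
val rows: Seq[(String, String)] =
  Seq(("Raju", "UAS"), ("Ram", "Pak."), (null, "China"), (null, null))

// Apply 0.2 to the name column and 0.3 to the country column.
val weights: Seq[(Double, Double)] = rows.map { case (name, country) =>
  (weightFor(name, 0.2), weightFor(country, 0.3))
}

weights.foreach(println)
```

This prints (0.2,0.3), (0.2,0.3), (0.0,0.3) and (0.0,0.0), one per line, matching the requested output table.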