How do I use x != null in Scala?

Time: 2017-09-20 02:18:00

Tags: scala apache-spark spark-dataframe

The DataFrame df01 looks like this:

scala> df01.show
+--------------------+----+-----+
|          session_id|  材质|count|
+--------------------+----+-----+
|       360098626|120|  金属|    2|
|866693025201992-0...|  布艺|    2|
|        648401717|33|  其它|    1|
|b2df486d906403886...| ABS|    1|
|14962864822301789...|  金属|    2|
|        960455526|12|  金属|    1|
|14886198008411946...| PVC|    1|
|860410037295987-6...|  金属|    1|
|c267e7e20c6742e6d...| ABS|    1|
|862788039750580-1...| ABS|    2|
|85995192767403132...| ABS|    1|
|862681034959357-2...| ABS|    1|
|52f4754fe212caf9d...|  其它|    1|
| 51289594708875916|6|null|    1|
|        741995028|24|null|    1|
|        2099986503|5|  金属|    1|
|14965600686729437...|null|    1|
|15098023912712771...| ABS|    2|
|a28fe88a99e3983c6...|  金属|    2|
|         703270023|2|null|    1|
+--------------------+----+-----+
only showing top 20 rows

scala> df01.schema
res58: org.apache.spark.sql.types.StructType = StructType(StructField(session_id,StringType,true), StructField(材质,StringType,true), StructField(count,LongType,false))

What I want is to set count to 1 whenever the column 材质 is null. My code is as follows:

val e = "材质"

Type 1: attr != null

 val df02 = df01.map{x=>
        val session_id = x(0).toString()
        val attr = x(1).toString()
        var cnt = 1
        if(attr!=null){cnt = x(2).toString().toInt}        
        (session_id,attr,cnt)
       }.toDF("session_id",e,"cnt")

Type 2: attr != "null"

val df02 = df01.map{x=>
    val session_id = x(0).toString()
    val attr = x(1).toString()
    var cnt = 1
    if(attr!="null"){cnt = x(2).toString().toInt}        
    (session_id,attr,cnt)
   }.toDF("session_id",e,"cnt")  

Type 3: x(1) != null

val df02 = df01.map{x=>
    val session_id = x(0).toString()
    val attr = x(1).toString()
    var cnt = 1
    if(x(1)!=null){cnt = x(2).toString().toInt}        
    (session_id,attr,cnt)
   }.toDF("session_id",e,"cnt")

Type 4: x(1) != "null"

val df02 = df01.map{x=>
    val session_id = x(0).toString()
    val attr = x(1).toString()
    var cnt = 1
    if(x(1)!="null"){cnt = x(2).toString().toInt}        
    (session_id,attr,cnt)
   }.toDF("session_id",e,"cnt")

All four variants fail with "Caused by: java.lang.NullPointerException". What is the correct way to do this?

2 Answers:

Answer 0: (score: 0)

The answer given in @Psidom's comment is correct:

import org.apache.spark.sql.functions.{col, when}

df01.withColumn("count", when(col(e).isNull, 1).otherwise(col("count")))
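For anyone who wants to try this outside the shell, here is a minimal end-to-end sketch. It assumes a local SparkSession and two made-up sample rows; the object name `NullCountExample` and the helper `fillNullCount` are hypothetical and not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object NullCountExample {
  // Hypothetical helper: set "count" to 1 wherever attrCol is null,
  // otherwise keep the existing count.
  def fillNullCount(df: DataFrame, attrCol: String): DataFrame =
    df.withColumn("count", when(col(attrCol).isNull, 1).otherwise(col("count")))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("null-count-example")
      .getOrCreate()
    import spark.implicits._

    val e = "材质"
    // Made-up sample data; None becomes a null in the 材质 column.
    val df01 = Seq[(String, Option[String], Long)](
      ("360098626|120", Some("金属"), 2L),
      ("703270023|2", None, 5L)
    ).toDF("session_id", e, "count")

    fillNullCount(df01, e).show()
    spark.stop()
  }
}
```

The null test is evaluated by Spark as a column expression, so no user code ever dereferences the null value.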

Answer 1: (score: 0)

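When the 材质 column contains a null value, x(1).toString throws a NullPointerException. That line runs before any of the null checks, which is why all four variants fail the same way.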
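The failure can be reproduced in plain Scala without Spark. The sketch below uses a `Seq[Any]` as a hypothetical stand-in for a Row (the names `NullSafeToString`, `row`, and `countFor` are illustrative only) and wraps the value in `Option` so the null is never dereferenced.

```scala
object NullSafeToString {
  // Hypothetical stand-in for a Spark Row: positional values, one of them null.
  val row: Seq[Any] = Seq("703270023|2", null, 1L)

  // Mirrors the question's intent: use 1 when the attribute is null,
  // otherwise keep the existing count.
  def countFor(r: Seq[Any]): Int =
    Option(r(1)) match {
      case Some(_) => r(2).toString.toInt
      case None    => 1
    }

  def main(args: Array[String]): Unit = {
    // Calling toString on a null reference is what throws.
    val failed = scala.util.Try(row(1).toString).isFailure
    println(s"row(1).toString throws: $failed")

    // Option(null) is None, so map(_.toString) is simply skipped.
    val attr: Option[String] = Option(row(1)).map(_.toString)
    println(s"attr = $attr, count = ${countFor(row)}")
  }
}
```

The same idea applies inside a `map` over rows: test for null first, and convert to a string only on the non-null branch.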

I think the answer in @Psidom's comment is correct.