I am trying to test some code that converts a DataFrame into a Dataset[String].
I use the following specs2 code to test that my module throws an exception under a specific condition. The problem is that specs2 does not catch the exception, and the test fails even though the correct exception is thrown. Is there a way to make specs2 intercept Spark's exception?
My test code:
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct}
import org.apache.spark.sql.types._
import org.specs2.mutable.Specification

class DataframeUtilTest extends Specification {

  val spark: SparkSession = SparkSession
    .builder()
    .master("local[2]")
    .getOrCreate()

  "dataframe utils" should {
    "throw an exception on struct type field" in {
      import spark.implicits._
      val valuesSequesnceRDD = spark.sparkContext.parallelize(
        Seq((1L, 3.0, "a", "2017-01-15 15:34:12", Seq(1, 2, 3)))
          .map(x => Row(x._1, x._2, x._3, x._4, x._5)))
      val schema = StructType(Seq(
        StructField("user_id", LongType),
        StructField("filed2", DoubleType),
        StructField("name", StringType),
        StructField("date", StringType),
        StructField("array", ArrayType(IntegerType))))
      lazy val df = spark.createDataFrame(valuesSequesnceRDD, schema)
        .withColumn("datetime", col("date").cast(TimestampType))
        .withColumn("date", col("date").cast(DateType))
        .withColumn("struct_all", struct("user_id", "filed2", "name", "date"))
      df.map(jsonUtils.rowToJsonString) must throwA[Exception]
    }
  }
}
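A note on why the matcher sees nothing (a sketch, assuming the same `spark`, `df`, and `jsonUtils` as above, not a verified fix): Dataset transformations such as `map` are lazy, so `rowToJsonString` never actually runs when the expectation is evaluated. `throwA` takes its argument by name, so forcing an action inside the expectation makes the mapping function execute and lets specs2 observe the thrown exception. Spark may rethrow the error wrapped in `org.apache.spark.SparkException`, which still satisfies `throwA[Exception]`:

```scala
// map only records a transformation; the mapping function runs when an
// action (collect, count, ...) executes. Forcing the action inside the
// by-name matcher argument lets specs2 catch whatever is thrown.
df.map(jsonUtils.rowToJsonString).collect() must throwA[Exception]
```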
The code that is supposed to produce the error:
import java.text.SimpleDateFormat

import com.fasterxml.jackson.databind.{DeserializationFeature, ObjectMapper, SerializationFeature}
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import org.apache.spark.sql.Row

object jsonUtils {

  def rowToMap(row: Row): Map[String, Any] = {
    val new_map = (row.schema.fieldNames zip row.toSeq).toMap
    // finding if there is a struct type field - we currently do not support this
    if (new_map.exists(x => x._2.getClass.getName == "org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema")) {
      throw new Exception("structType deserialization is not supported")
    } else {
      new_map
    }
  }

  def rowToJsonString(row: Row): String = {
    val mapper = new ObjectMapper() with ScalaObjectMapper
    mapper.registerModule(DefaultScalaModule)
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
    val dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss") // HH: 24-hour clock
    mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
    mapper.setDateFormat(dateFormat)
    mapper.writeValueAsString(rowToMap(row))
  }
}
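The struct detection above compares runtime class names against a Catalyst-internal class. A minimal, Spark-free sketch of the same idea (the helper name `containsStructLike` and the sample map are illustrative, not part of the original code):

```scala
// Illustrative helper: scan field values and flag any whose runtime
// class name matches the disallowed Catalyst row class.
def containsStructLike(fields: Map[String, Any]): Boolean =
  fields.exists { case (_, v) =>
    v.getClass.getName == "org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema"
  }

// Plain scalar values: no struct-like field is detected.
val flat: Map[String, Any] = Map("user_id" -> 1L, "name" -> "a")
println(containsStructLike(flat)) // false
```

In the real method, checking `v.isInstanceOf[org.apache.spark.sql.Row]` would be more robust than matching a class-name string, since `GenericRowWithSchema` extends `Row`.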
The failing test message:
Expected: java.lang.Exception. Got nothing
java.lang.Exception: Expected: java.lang.Exception. Got nothing