Cannot catch NumberFormatException in a unit test

Asked: 2017-01-04 09:41:21

Tags: scala unit-testing apache-spark

I have a unit test that must fail on purpose, but I cannot catch the expected exception, which is strange.

This is what the CSV file looks like:

curva;clase;divisa;rw
AED_FXDEP;OIS;AED;240,1000
ARS :Std;6m;ARS;240
AUD_CALMNY_DISC;OIS;AUD;169.7056275
AUD_DEPO_BBSW;6m;AUD;169.7056275
AUD_DEPO_BBSW;6m;AUD;

I think it is self-explanatory: the last line of the CSV has an empty field, which is not allowed, and the exception is clear, a NumberFormatException, because you cannot create a number from an empty value. I want to catch that exception in the unit test, so why can I not reach it?
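(For reference, a minimal standalone sketch, not taken from the original post, of the failure the test expects: parsing an empty field as a number throws a NumberFormatException.)

object EmptyFieldRepro {
  def main(args: Array[String]): Unit = {
    try {
      // An empty CSV field has no digits to parse, so this throws.
      val parsed = new java.math.BigDecimal("")
      println(parsed)
    } catch {
      case e: NumberFormatException =>
        println("Caught as expected: " + e)
    }
  }
}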

This is the content of the JSON schema file:

{"type" : "struct","fields" : [ {"name" : "curve","type" : "string","nullable" : false}, {"name":"class", "type":"string", "nullable":false}, {"name":"currency", "type":"string", "nullable":false}, {"name":"rw", "type":"string","nullable":false} ]

And this is the unit test code that calls validateGenericCSV and should catch the exception:

try{
    val validateGenericFile : Boolean = CSVtoParquet.validateGenericCSV(pathCSVWithHeaderWithErrors,
                                                                      pathCurvasJsonSchemaWithDecimal,
                                                                      _nullValue,
                                                                      _delimiter,
                                                                      sc,
                                                                      sqlContext)
    // never reached!
    Assert.assertTrue(validateGenericFile)
} catch {
    case e:NumberFormatException => Assert.assertTrue("ERROR! " + e.getLocalizedMessage,false)
    case ex:Exception => Assert.assertTrue("ERROR! " + ex.getLocalizedMessage,false)
} finally {
    println("Done testValidateInputFilesFRTBSTDES436_WithErrors!")
}

Why is the NumberFormatException thrown inside validateGenericCSV never caught?

UPDATE

I modified these lines:

val myDfWithCustomSchema = _sqlContext.read.format("com.databricks.spark.csv").
                                            option("header", "true").
                                            option("delimiter", _delimiter).
                                            option("nullValue", _nullValue).
                                            option("mode","FAILFAST").
                                            schema(mySchemaStructType).
                                            load(fileToReview)

var finallyCorrect : Boolean = true

var contLinesProcessed = 1

try{
  //this line provokes the exception!
  val myArray = myDfWithCustomSchema.collect


  var contElementosJson = 0

  var isTheLineCorrect: Boolean = true


  myArray.foreach { elem =>
    println("Processing line with content: " + elem)
    for (myElem <- myList) {
      val actualType = myElem.`type`
      val actualName = myElem.name
      val actualNullable = myElem.nullable

      if (contElementosJson == myList.size) {
        contElementosJson = 0
      }

      if (actualType == "string") {
        val theField = elem.getString(contElementosJson)
        val validatingField: Boolean = theField.isInstanceOf[String]

        isTheLineCorrect = validatingField && !((theField == "" || theField == null) && !actualNullable)
        contElementosJson += 1
        if (!isTheLineCorrect){
          finallyCorrect=false
          println("ATTENTION! an empty string chain. " + "Check this field " + actualName + " in the csv file, which should be a " + actualType + " according with the json schema file, can be nullable? " + actualNullable + " isTheLineCorrect? " + isTheLineCorrect)
        }
      } else if (actualType == "integer") {
        val theField = elem.get(contElementosJson)
        val validatingField: Boolean = theField.isInstanceOf[Integer]
        isTheLineCorrect = validatingField && !((theField == "" || theField == null) && !actualNullable)
        contElementosJson += 1
        if (!isTheLineCorrect){
          finallyCorrect=false
          println("ATTENTION! an empty string chain. " + "Check this field " + actualName + " in the csv file, which should be a " + actualType + " according with the json schema file, can be nullable? " + actualNullable + " isTheLineCorrect? " + isTheLineCorrect)
        }
      } else if (actualType.startsWith("decimal")) {
        val theField = elem.get(contElementosJson)
        val validatingField: Boolean = theField.isInstanceOf[java.math.BigDecimal]
        isTheLineCorrect = validatingField && !((theField == "" || theField == null) && !actualNullable)
        contElementosJson += 1
        if (!isTheLineCorrect){
          finallyCorrect=false
          println("ATTENTION! an empty string chain. " + "Check this field " + actualName + " in the csv file, which should be a " + actualType + " according with the json schema file, can be nullable? " + actualNullable + " isTheLineCorrect? " + isTheLineCorrect)
        }
      } else {
        println("Attention! se está intentando procesar una columna del tipo " + actualType + " que no está prevista procesar. Comprobar.")
      }
    } //for
    contLinesProcessed += 1
  } //foreach

} catch {
  //NEVER REACHED! why????
  case e:NumberFormatException => throw e
  case ex:Exception => throw ex
}

and also these lines:

case e:NumberFormatException => Assert.assertTrue("ERROR! " + e.getLocalizedMessage,true)
case ex:Exception => Assert.assertTrue("ERROR! " + ex.getLocalizedMessage,true)

Same error; my problem is that when the exception happens I never reach the catch clause!

Thank you.

2 Answers:

Answer 0 (score: 1)

When we inspect the stack trace, we can see the following:


ERROR! Job aborted due to stage failure: Task 1 in stage 1.0 failed 1 times, most recent failure: Lost task 1.0 in stage 1.0 (TID 2, localhost): java.lang.NumberFormatException

Spark is a distributed computing framework. While the task is being processed, the NumberFormatException is thrown remotely on one of the executors. Spark receives the TaskFailure from that executor and propagates the exception, wrapped in an org.apache.spark.SparkException, up to the action that triggered the materialization of the computation: the .collect() method in this particular case.

If we want to understand the reason behind the failure, we can use ex.getCause. In practice we would end up with a snippet like this:

catch {
  case ex:Exception if ex.getCause.getClass == classOf[NumberFormatException] => Assert.fail("Number parsing failed: " + ex.getLocalizedMessage)
  case ex:Exception => Assert.fail(...)
}
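
Since the original error can be nested more than one level deep, a slightly more defensive variant walks the whole cause chain instead of checking only the immediate cause. This is a sketch only: it reuses myDfWithCustomSchema from the question, assumes JUnit's org.junit.Assert is on the classpath, and hasCause is a hypothetical helper, not a library call:

import org.junit.Assert

// Returns true if any throwable in the cause chain is of the given type.
def hasCause(t: Throwable, target: Class[_ <: Throwable]): Boolean = {
  var current: Throwable = t
  while (current != null) {
    if (target.isInstance(current)) return true
    current = current.getCause
  }
  false
}

try {
  // collect() forces the computation, so the executor-side failure surfaces here.
  myDfWithCustomSchema.collect()
  Assert.fail("Expected a NumberFormatException wrapped in a SparkException")
} catch {
  case ex: Exception if hasCause(ex, classOf[NumberFormatException]) =>
    println("Number parsing failed, as the test expects: " + ex.getMessage)
  case ex: Exception =>
    Assert.fail("Unexpected failure: " + ex.getLocalizedMessage)
}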

Answer 1 (score: 0)

The test does not fail because Assert.assertTrue(..., true) never fails. assertTrue fails when its second argument is false, not when it is true.
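
In other words, to make reaching the catch block count as a test failure, pass false or use Assert.fail. A sketch of the intended pattern (not code from the original post):

catch {
  case e: NumberFormatException =>
    // assertTrue(msg, true) always passes; fail explicitly instead.
    Assert.fail("ERROR! " + e.getLocalizedMessage)
  case ex: Exception =>
    Assert.fail("ERROR! " + ex.getLocalizedMessage)
}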