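I am trying to compare an actual DataFrame with the expected data in a Scala test. Both DataFrames have a count column of type Int, but when I compare the two frames I get the precision error shown below: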
Can anyone suggest where I am going wrong?
Here is what the comparison reports for the actual DataFrame:
Row
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   30-39|    1|20190906|
+--------+-----+--------+
was considered not equal to
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   30-39|    1|20190905|
+--------+-----+--------+
Row
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   80-89|    2|20190906|
+--------+-----+--------+
was considered not equal to
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   80-89|    2|20190905|
+--------+-----+--------+
Row
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   90-99|    1|20190906|
+--------+-----+--------+
was considered not equal to
+--------+-----+--------+
|agerange|count|datadate|
+--------+-----+--------+
|   90-99|    1|20190905|
+--------+-----+--------+
schema tolerance:
* precisions by column
[*, 1.0E-6]
* ignore nullable flag for each column
* ignore column order
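As a sanity check, the row pairs in the report above can be compared column by column in plain Scala, without Spark. This is only a sketch: Rec, ReportDiff and diffColumns are hypothetical names, not part of Spark or of the test library, and the values are copied from the report. The report does not say which row of each pair is the actual one, so they are simply called first and second.

```scala
// Hypothetical record mirroring the three columns shown in the report.
final case class Rec(agerange: String, count: Int, datadate: String)

object ReportDiff {
  // Returns the names of the columns whose values differ between two rows.
  def diffColumns(first: Rec, second: Rec): List[String] =
    List(
      "agerange" -> (first.agerange != second.agerange),
      "count"    -> (first.count != second.count),
      "datadate" -> (first.datadate != second.datadate)
    ).collect { case (name, true) => name }

  // Row pairs copied from the report, in the order they are printed.
  val pairs: List[(Rec, Rec)] = List(
    Rec("30-39", 1, "20190906") -> Rec("30-39", 1, "20190905"),
    Rec("80-89", 2, "20190906") -> Rec("80-89", 2, "20190905"),
    Rec("90-99", 1, "20190906") -> Rec("90-99", 1, "20190905")
  )

  def main(args: Array[String]): Unit =
    pairs.foreach { case (first, second) =>
      val diff = diffColumns(first, second).mkString(", ")
      println(s"${first.agerange}: differing columns = $diff")
    }
}
```

For each of the three pairs this lists datadate as the only differing column. The count values are identical in every pair.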
Code for the expected DataFrame:
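import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{count, lit}

// One window per age range, used to count the rows in each range.
val windowSpec = Window.partitionBy("agerange")

rangeDf
  .select("agerange")
  // Per-range row count, cast from Long to Int.
  .withColumn("count", count("agerange").over(windowSpec).cast("Int"))
  .distinct()
  // Stamp every row with the running date.
  .withColumn("datadate", lit(runningDate.runningDateString))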