我面临以下问题:在打印执行的计划时,我无法查看所有推送的过滤器。
执行的代码是
println(df.queryExecution.executedPlan.treeString(true))
所有计划都已打印,并且在“推送过滤器”字段中如下所示
PushedFilters: [IsNotNull(X1), IsNotNull(X2), IsNotNull(X2), IsNotNull(X3..., ReadSchema:
您可能会注意到,它不能完全打印出来。此外,为解决此问题,我在spark-default.conf
中修改了以下属性spark.debug.maxToStringFields 120000
不幸的是,以前没有解决问题。
关于如何克服这一点的任何建议?
答案 0 :(得分:1)
答案 1 :(得分:0)
您可以执行df.explain(true)
,它将输出整个计划:
== Parsed Logical Plan ==
'SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- 'MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
+- 'DeserializeToObject unresolveddeserializer(createexternalrow(getcolumnbyordinal(0, IntegerType), getcolumnbyordinal(1, IntegerType), getcolumnbyordinal(2, IntegerType), StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false))), obj#65: org.apache.spark.sql.Row
+- Filter isnull(y#9)
+- Filter (x#8 = 0)
+- Project [_1#4 AS x#8, _2#5 AS y#9, _3#6 AS z#10]
+- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#4, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2 AS _2#5, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3 AS _3#6]
+- ExternalRDD [obj#3]
== Analyzed Logical Plan ==
x: int, y: int, z: int
SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
+- DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
+- Filter isnull(y#9)
+- Filter (x#8 = 0)
+- Project [_1#4 AS x#8, _2#5 AS y#9, _3#6 AS z#10]
+- SerializeFromObject [assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._1 AS _1#4, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._2 AS _2#5, assertnotnull(assertnotnull(input[0, scala.Tuple3, true]))._3 AS _3#6]
+- ExternalRDD [obj#3]
== Optimized Logical Plan ==
SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- MapElements <function1>, interface org.apache.spark.sql.Row, [StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)], obj#66: org.apache.spark.sql.Row
+- DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
+- LocalRelation <empty>, [x#8, y#9, z#10]
== Physical Plan ==
*(1) SerializeFromObject [validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, x), IntegerType) AS x#67, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, y), IntegerType) AS y#68, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, z), IntegerType) AS z#69]
+- *(1) MapElements <function1>, obj#66: org.apache.spark.sql.Row
+- *(1) DeserializeToObject createexternalrow(x#8, y#9, z#10, StructField(x,IntegerType,false), StructField(y,IntegerType,false), StructField(z,IntegerType,false)), obj#65: org.apache.spark.sql.Row
+- LocalTableScan <empty>, [x#8, y#9, z#10]