Spark source code: how to understand the withScope method

Time: 2016-06-08 00:11:40

Tags: scala apache-spark

I can't understand what the withScope method does (in fact, I don't really know what the RDDOperationScope class is for either).

In particular, what does (body: => T) mean in the parameter list of withScope?

private[spark] def withScope[T](
    sc: SparkContext,
    name: String,
    allowNesting: Boolean,
    ignoreParent: Boolean)(body: => T): T = {
  // Save the old scope to restore it later
  val scopeKey = SparkContext.RDD_SCOPE_KEY
  val noOverrideKey = SparkContext.RDD_SCOPE_NO_OVERRIDE_KEY
  val oldScopeJson = sc.getLocalProperty(scopeKey)
  val oldScope = Option(oldScopeJson).map(RDDOperationScope.fromJson)
  val oldNoOverride = sc.getLocalProperty(noOverrideKey)
  try {
    if (ignoreParent) {
      // Ignore all parent settings and scopes and start afresh with our own root scope
      sc.setLocalProperty(scopeKey, new RDDOperationScope(name).toJson)
    } else if (sc.getLocalProperty(noOverrideKey) == null) {
      // Otherwise, set the scope only if the higher level caller allows us to do so
      sc.setLocalProperty(scopeKey, new RDDOperationScope(name, oldScope).toJson)
    }
    // Optionally disallow the child body to override our scope
    if (!allowNesting) {
      sc.setLocalProperty(noOverrideKey, "true")
    }
    body
  } finally {
    // Remember to restore any state that was modified before exiting
    sc.setLocalProperty(scopeKey, oldScopeJson)
    sc.setLocalProperty(noOverrideKey, oldNoOverride)
  }
}

You can find the source code at the following link: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDDOperationScope.scala

Can anyone help me? Thanks, I have been confused by this for a long time.

2 Answers:

Answer 0 (score: 5)

The following code may help you:

object TestWithScope {
    // func is a by-name parameter: the block is evaluated only when
    // func is referenced inside this method, not at the call site
    def withScope(func: => String) = {
        println("withscope")
        func
    }

    def bar(foo: String) = withScope {
        println("Bar: " + foo)
        "BBBB"
    }

    def main(args: Array[String]): Unit = {
        println(bar("AAAA"))
    }
}

Output:

withscope
Bar: AAAA
BBBB
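
To make the evaluation order explicit: func: => String above is a by-name parameter, so the block you pass is evaluated only when (and each time) func is referenced inside the method body, not at the call site. Here is a minimal sketch of the difference from an ordinary by-value parameter (ByNameDemo and its method names are made up for illustration):

object ByNameDemo {
    // By-value: the argument expression runs once, before the method body
    def byValue(x: String): String = {
        println("entered byValue")
        x
    }

    // By-name: the argument expression runs each time x is referenced
    def byName(x: => String): String = {
        println("entered byName")
        x
    }

    def main(args: Array[String]): Unit = {
        byValue { println("evaluating argument"); "v" } // "evaluating argument" prints first
        byName  { println("evaluating argument"); "n" } // "entered byName" prints first
    }
}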

Answer 1 (score: 1)

You need to understand how withScope is called. Here is an example from RDD.scala:

  /**
   *  Return a new RDD by first applying a function to all elements of this
   *  RDD, and then flattening the results.
   */
  def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U] = withScope {
    val cleanF = sc.clean(f)
    new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.flatMap(cleanF))
  }

Basically, it creates a new scope (a code block) so that variables from the enclosing function do not mix with those of the current one. The body of the scope is whatever is passed after withScope, which in this case is:

  val cleanF = sc.clean(f)
  new MapPartitionsRDD[U, T](this, (context, pid, iter) => iter.flatMap(cleanF))

I haven't yet gotten to the point where the old scope is restored.
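
On that last point, the save/restore in the finally block is what makes nesting work: each call installs a child scope and puts the parent back on the way out. Below is a minimal sketch of that pattern, using a plain var in place of SparkContext's local properties; Scope, current, and this simplified withScope are illustrative stand-ins, not Spark's actual API:

object ScopeDemo {
    case class Scope(name: String, parent: Option[Scope])

    // Stand-in for sc.getLocalProperty / sc.setLocalProperty
    private var current: Option[Scope] = None

    def withScope[T](name: String)(body: => T): T = {
        val oldScope = current                    // save the old scope
        try {
            current = Some(Scope(name, oldScope)) // install a child of it
            body                                  // evaluate the by-name block under it
        } finally {
            current = oldScope                    // restore on the way out
        }
    }

    def main(args: Array[String]): Unit = {
        withScope("flatMap") {
            println(current) // Some(Scope(flatMap,None))
            withScope("map") {
                println(current) // Some(Scope(map,Some(Scope(flatMap,None))))
            }
            println(current) // restored to Some(Scope(flatMap,None))
        }
    }
}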