Question

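I am running a Spark application written in Scala 2.9.3. Below is the function network, which builds a tree of nodes. Each node has a unique set of neighbors, which are the children of that node. The problem I am facing is that the object current is a different object inside the for loop (evident from its different address). How can I prevent this and have the for loop operate on the same object that I declared outside it?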

    def network (root: Node) : Tree =
    {
        var tree = new Tree(root)
        var queue = ListBuffer[Node](root)

        while (!queue.isEmpty &&  queue(0).level<maxlen)
        {
            var current: Node = queue.remove(0)
            println(">>>>>>>>>>>>>>>>>> Current1: "+current)
            var neigh = findNeighbor(current.userID)
            for (n <- neigh)
            {
                    if(tree.search(n._1) == null)
                    {
                            var c = new Node(n._1, current.level+1, n._2, n._3)
                            current.addChild(c)
                            println(">>>>>>>>>>>>>>>>>> Current2: "+current)
                    }
            }
            println(">>>>>>>>>>>>>>>>>> Current3: "+current)
            queue ++= current.neighbors
        }
        return tree
    }

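Below is the output of the code. Check the value of current at the three positions marked Current1, Current2 and Current3. We observe that Current1 == Current3, while Current2 is different.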

    [vijaygkd@is-joshbloom-hadoop network]$ sbt run
    Loading /usr/local/sbt/bin/sbt-launch-lib.bash
    [info] Set current project to Network (in build file:/usr/local/spark/test/vijay/network/)
    [info] Compiling 1 Scala source to /usr/local/spark/test/vijay/network/target/scala-2.9.3/classes...
    [info] Running Network
    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    >>>>>>>>>>>>>>>>>> Current1: Node@76ab909a
    13/10/13 14:23:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    13/10/13 14:23:50 WARN snappy.LoadSnappy: Snappy native library not loaded
    13/10/13 14:23:50 INFO mapred.FileInputFormat: Total input paths to process : 1
    >>>>>>>>>>>>>>>>>> Current2: Node@4f9e2851
    >>>>>>>>>>>>>>>>>> Current2: Node@4f9e2851
    >>>>>>>>>>>>>>>>>> Current2: Node@4f9e2851
    >>>>>>>>>>>>>>>>>> Current2: Node@4f9e2851
    >>>>>>>>>>>>>>>>>> Current3: Node@76ab909a
    [success] Total time: 11 s, completed Oct 13, 2013 2:23:51 PM

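Additional information: findNeighbor returns an RDD of tuples containing the neighbors of the node userID. The tree.search function checks whether n already exists in the tree. A node is added as a child only if it is not already in the tree. All functions work as expected.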

Answer 1

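I think there may be several problems with your code. Do you reassign current somewhere? If not, maybe it should be a val? Also, why do you assume that the toString implementation prints the "address" of the object? Does it just print Node followed by its hashCode? In that case, the hashCode may change because the object's internal fields change. You might also have a look at this page: https://meta.stackexchange.com/questions/22754/sscce-how-to-provide-examples-for-programming-questions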
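To illustrate the point about toString, here is a minimal sketch. It relies on the fact that the default java.lang.Object.toString prints the class name, an "@", and the hex hashCode, so the suffix is only a stable identity marker when hashCode is not overridden. PlainNode and FieldHashNode are hypothetical classes made up for this example; the asker's Node is not shown, so whether its hashCode depends on its fields is unknown.

```scala
// Hypothetical classes for illustration only; they are not the asker's Node.
class PlainNode(var level: Int)            // inherits Object.hashCode and Object.toString

class FieldHashNode(var level: Int) {
  // Hash code derived from a mutable field, as a case class would do.
  override def hashCode: Int = level
  override def equals(other: Any): Boolean = other match {
    case that: FieldHashNode => that.level == level
    case _                   => false
  }
}

object ToStringDemo {
  def main(args: Array[String]): Unit = {
    val plain = new PlainNode(0)
    val plainBefore = plain.toString
    plain.level = 5
    // The identity hash does not depend on fields, so the printed form is unchanged.
    println(plainBefore == plain.toString)

    val hashed = new FieldHashNode(0)
    val hashedBefore = hashed.toString
    hashed.level = 5
    // Same object, but the "@..." suffix follows the overridden hashCode.
    println(hashedBefore == hashed.toString)

    // `eq` compares references and is a more direct identity check than toString.
    val alias = hashed
    println(alias eq hashed)
  }
}
```

If the printed suffix is in doubt, comparing references with eq (or printing System.identityHashCode(obj)) tells you whether two values are really the same object.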

Answer 2

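The underlying Spark framework seems to manipulate the objects in some way. It must be doing so when I access the RDD neigh inside the for loop. Converting the RDD to a list (feasible because the number of objects is small) solved the problem.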
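A likely explanation, offered as an assumption rather than something confirmed in this thread: a for loop over an RDD is run by Spark as a closure, and Spark serializes closures together with the variables they capture, so the loop body may be working on a deserialized copy of current. The sketch below imitates that effect with plain Java serialization and no Spark dependency. DemoNode, roundTrip and neighborIds are hypothetical names invented for the example.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the asker's Node class (the real one is not shown).
class DemoNode(val userID: Int) extends Serializable {
  val neighbors = ListBuffer[DemoNode]()
  def addChild(c: DemoNode): Unit = neighbors += c
}

object ClosureCopyDemo {
  // Serialize and deserialize a value, imitating what may happen to a
  // captured variable when a closure is shipped to a task (assumption).
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Returns (copy is the same object?, children on original after the
  // "shipped" loop, children on original after the local loop).
  def demo(): (Boolean, Int, Int) = {
    val current = new DemoNode(1)
    val neighborIds = Seq(2, 3, 4) // plays the role of the neighbors brought to the driver

    // Distributed-style loop: the body mutates a deserialized copy.
    val shipped = roundTrip(current)
    neighborIds.foreach(id => shipped.addChild(new DemoNode(id)))
    val sameObject = shipped eq current
    val afterShipped = current.neighbors.size

    // Local loop over an ordinary collection: the body mutates `current` itself.
    neighborIds.foreach(id => current.addChild(new DemoNode(id)))
    (sameObject, afterShipped, current.neighbors.size)
  }

  def main(args: Array[String]): Unit = println(demo())
}
```

In the question's code, the equivalent of the local loop would be to bring the neighbors to the driver first, for example with the RDD's collect() method, and iterate over the resulting array. That is only reasonable when the data is small, as the answer notes.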

Scala: scope of an object
