Coursera progfun1: Scala union performance

Date: 2016-08-05 15:22:51

Tags: performance scala functional-programming time-complexity

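While working on the week 3 assignment of the course "Functional Programming Principles in Scala" @ Coursera, I found that when I implement the function union as shown in the video lecture: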

  override def union(that: TweetSet): TweetSet = {
    left union(right) union(that) incl(elem)
  }

it takes a very long time to run, but when I implement it this way:

  override def union(that: TweetSet): TweetSet = {
    right.union(left.union(that)).incl(elem)
  }

it takes much less time to run and I get full marks.

The problem is that I cannot work out what the difference between these two implementations is. Why is one faster than the other?

The code given for the assignment (the implementation of the data structures used) is:

package objsets

import TweetReader._

/**
 * A class to represent tweets.
 */
class Tweet(val user: String, val text: String, val retweets: Int) {
  override def toString: String =
    "User: " + user + "\n" +
    "Text: " + text + " [" + retweets + "]"
}

/**
 * This represents a set of objects of type `Tweet` in the form of a binary search
 * tree. Every branch in the tree has two children (two `TweetSet`s). There is an
 * invariant which always holds: for every branch `b`, all elements in the left
 * subtree are smaller than the tweet at `b`. The elements in the right subtree are
 * larger.
 *
 * Note that the above structure requires us to be able to compare two tweets (we
 * need to be able to say which of two tweets is larger, or if they are equal). In
 * this implementation, the equality / order of tweets is based on the tweet's text
 * (see `def incl`). Hence, a `TweetSet` could not contain two tweets with the same
 * text from different users.
 *
 *
 * The advantage of representing sets as binary search trees is that the elements
 * of the set can be found quickly. If you want to learn more you can take a look
 * at the Wikipedia page [1], but this is not necessary in order to solve this
 * assignment.
 *
 * [1] http://en.wikipedia.org/wiki/Binary_search_tree
 */
abstract class TweetSet {

  /**
   * This method takes a predicate and returns a subset of all the elements
   * in the original set for which the predicate is true.
   *
   * Question: Can we implement this method here, or should it remain abstract
   * and be implemented in the subclasses?
   */
  def filter(p: Tweet => Boolean): TweetSet = ???

  /**
   * This is a helper method for `filter` that propagates the accumulated tweets.
   */
  def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet

  /**
   * Returns a new `TweetSet` that is the union of `TweetSet`s `this` and `that`.
   *
   * Question: Should we implement this method here, or should it remain abstract
   * and be implemented in the subclasses?
   */
  def union(that: TweetSet): TweetSet = ???

  /**
   * Returns the tweet from this set which has the greatest retweet count.
   *
   * Calling `mostRetweeted` on an empty set should throw an exception of
   * type `java.util.NoSuchElementException`.
   *
   * Question: Should we implement this method here, or should it remain abstract
   * and be implemented in the subclasses?
   */
  def mostRetweeted: Tweet = ???

  /**
   * Returns a list containing all tweets of this set, sorted by retweet count
   * in descending order. In other words, the head of the resulting list should
   * have the highest retweet count.
   *
   * Hint: the method `remove` on TweetSet will be very useful.
   * Question: Should we implement this method here, or should it remain abstract
   * and be implemented in the subclasses?
   */
  def descendingByRetweet: TweetList = ???

  /**
   * The following methods are already implemented
   */

  /**
   * Returns a new `TweetSet` which contains all elements of this set, and
   * the new element `tweet` in case it does not already exist in this set.
   *
   * If `this.contains(tweet)`, the current set is returned.
   */
  def incl(tweet: Tweet): TweetSet

  /**
   * Returns a new `TweetSet` which excludes `tweet`.
   */
  def remove(tweet: Tweet): TweetSet

  /**
   * Tests if `tweet` exists in this `TweetSet`.
   */
  def contains(tweet: Tweet): Boolean

  /**
   * This method takes a function and applies it to every element in the set.
   */
  def foreach(f: Tweet => Unit): Unit
}

class Empty extends TweetSet {
  def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet = ???

  /**
   * The following methods are already implemented
   */

  def contains(tweet: Tweet): Boolean = false

  def incl(tweet: Tweet): TweetSet = new NonEmpty(tweet, new Empty, new Empty)

  def remove(tweet: Tweet): TweetSet = this

  def foreach(f: Tweet => Unit): Unit = ()
}

class NonEmpty(elem: Tweet, left: TweetSet, right: TweetSet) extends TweetSet {

  def filterAcc(p: Tweet => Boolean, acc: TweetSet): TweetSet = ???

  /**
   * The following methods are already implemented
   */

  def contains(x: Tweet): Boolean =
    if (x.text < elem.text) left.contains(x)
    else if (elem.text < x.text) right.contains(x)
    else true

  def incl(x: Tweet): TweetSet = {
    if (x.text < elem.text) new NonEmpty(elem, left.incl(x), right)
    else if (elem.text < x.text) new NonEmpty(elem, left, right.incl(x))
    else this
  }

  def remove(tw: Tweet): TweetSet =
    if (tw.text < elem.text) new NonEmpty(elem, left.remove(tw), right)
    else if (elem.text < tw.text) new NonEmpty(elem, left, right.remove(tw))
    else left.union(right)

  def foreach(f: Tweet => Unit): Unit = {
    f(elem)
    left.foreach(f)
    right.foreach(f)
  }
}

trait TweetList {
  def head: Tweet
  def tail: TweetList
  def isEmpty: Boolean
  def foreach(f: Tweet => Unit): Unit =
    if (!isEmpty) {
      f(head)
      tail.foreach(f)
    }
}

object Nil extends TweetList {
  def head = throw new java.util.NoSuchElementException("head of EmptyList")
  def tail = throw new java.util.NoSuchElementException("tail of EmptyList")
  def isEmpty = true
}

class Cons(val head: Tweet, val tail: TweetList) extends TweetList {
  def isEmpty = false
}


object GoogleVsApple {
  val google = List("android", "Android", "galaxy", "Galaxy", "nexus", "Nexus")
  val apple = List("ios", "iOS", "iphone", "iPhone", "ipad", "iPad")

  lazy val googleTweets: TweetSet = ???
  lazy val appleTweets: TweetSet = ???

  /**
   * A list of all tweets mentioning a keyword from either apple or google,
   * sorted by the number of retweets.
   */
  lazy val trending: TweetList = ???
}

object Main extends App {
  // Print the trending tweets
  GoogleVsApple.trending foreach println
}

2 answers:

Answer 0 (score: 2)

I found an explanation here.

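Basically, when we write

 left union(right) union(that) incl(elem)

left union(right) is evaluated first, and union(that) is then called on its result. So we make the tree on the left-hand side of the second union larger, and the recursion takes longer to finish, because it only ends when the left operand of union (the set it is called on) is empty (see the implementation of union in class Empty).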
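To make the difference concrete, here is a minimal sketch that counts how often union is invoked for each ordering. It is an illustration under assumptions, not the assignment's code: `IntSet`, `unionSlow`, `unionFast`, `unionCalls`, `build`, `count` and `toList` are hypothetical names, integers stand in for tweets, and `Empty.union(that) = that` is assumed as the base case mentioned above.

```scala
object UnionCount {
  // Counts every invocation of a union method; for illustration only.
  var unionCalls = 0L

  abstract class IntSet {
    def incl(x: Int): IntSet
    def unionSlow(that: IntSet): IntSet
    def unionFast(that: IntSet): IntSet
    def toList: List[Int]
  }

  object Empty extends IntSet {
    def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
    // Assumed base case: the recursion stops when the receiver is empty.
    def unionSlow(that: IntSet): IntSet = { unionCalls += 1; that }
    def unionFast(that: IntSet): IntSet = { unionCalls += 1; that }
    def toList: List[Int] = List()
  }

  class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    def incl(x: Int): IntSet =
      if (x < elem) new NonEmpty(elem, left.incl(x), right)
      else if (elem < x) new NonEmpty(elem, left, right.incl(x))
      else this

    // Ordering from the lecture: the result of `left union right`
    // becomes the receiver of the second union and is walked again.
    def unionSlow(that: IntSet): IntSet = {
      unionCalls += 1
      left.unionSlow(right).unionSlow(that).incl(elem)
    }

    // Ordering from the question: receivers are always original subtrees.
    def unionFast(that: IntSet): IntSet = {
      unionCalls += 1
      right.unionFast(left.unionFast(that)).incl(elem)
    }

    def toList: List[Int] = left.toList ++ (elem :: right.toList)
  }

  def build(xs: Seq[Int]): IntSet =
    xs.foldLeft(Empty: IntSet)((acc, x) => acc.incl(x))

  // Resets the counter, evaluates `body`, returns (number of calls, result).
  def count(body: => IntSet): (Long, IntSet) = {
    unionCalls = 0
    val result = body
    (unionCalls, result)
  }

  def main(args: Array[String]): Unit = {
    val a = build(Seq(4, 2, 6, 1, 3, 5, 7))
    val b = build(Seq(10, 9, 11))
    val (slow, _) = count(a.unionSlow(b))
    val (fast, _) = count(a.unionFast(b))
    println(s"unionSlow: $slow calls, unionFast: $fast calls")
  }
}
```

In the second ordering union is only ever invoked on subtrees of the original receiver, so a set with n elements costs 2n + 1 calls (n branches plus n + 1 empty leaves). In the first ordering the freshly built result of left union(right) becomes a receiver itself, so its elements are traversed a second time, and this repeats at every level.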

Answer 1 (score: 2)

I posted an explanation here.

Here is its content:

Some notation:

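Root: the root element of a tree. Left/Right: the left/right subtree when we talk about a union, or the root element of that subtree when we say "include left".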

A. The meaning of (left union (right union (other incl elem)))

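First: you include the currently visited node in other (you are exploring the tree, going down to the right, and adding your item to other; there is no need to call union for that).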

Second: you repeat that step with the right subtree.

Third: you repeat that step with the left subtree.

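Overall meaning: each time, you add the current element to other, then try to go right. If you can, you add the right element to other and go right again. Then you try to go left... Can you? Go again! You can't? Can't go left either? Backtrack.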

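You can think of it as "movement with priorities". Each time you add your item, then by preference you go right, then left, then back up and repeat! By doing this, you simply explore the whole tree once and add each node to other!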
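The traversal order described for form A can be checked with a small sketch. The assumptions: `IntSet` is a hypothetical integer stand-in for `TweetSet`, `visited` is a log added purely for illustration, and `union` is written in form A.

```scala
import scala.collection.mutable.ListBuffer

object UnionOrderA {
  // Order in which elements are handed to `other`; illustration only.
  val visited = ListBuffer.empty[Int]

  sealed trait IntSet {
    def incl(x: Int): IntSet
    def union(other: IntSet): IntSet
  }

  case object Empty extends IntSet {
    def incl(x: Int): IntSet = NonEmpty(x, Empty, Empty)
    def union(other: IntSet): IntSet = other
  }

  case class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    def incl(x: Int): IntSet =
      if (x < elem) NonEmpty(elem, left.incl(x), right)
      else if (elem < x) NonEmpty(elem, left, right.incl(x))
      else this

    // Form A: include the current element first, then go right, then left.
    def union(other: IntSet): IntSet = {
      visited += elem
      left.union(right.union(other.incl(elem)))
    }
  }

  def build(xs: Seq[Int]): IntSet =
    xs.foldLeft(Empty: IntSet)((acc, x) => acc.incl(x))

  def main(args: Array[String]): Unit = {
    visited.clear()
    build(Seq(4, 2, 6, 1, 3, 5, 7)).union(Empty)
    println(visited.mkString(", ")) // 4, 6, 7, 5, 2, 3, 1
  }
}
```

Every element shows up in the log exactly once: root first, then the right subtree, then the left one.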

B. The meaning of ((left union right) union other) incl elem (or left union right union other incl elem)

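Ha. In short: you add the current item, which you could add NOW, at the last possible step. But that is not the worst part. When you call (left union right), you add the items of left to the right subtree, in the same inefficient way as before. That means: you have not yet included elem in other, but you first have to include left.item in right. Then, because you will call (left.left union left.right), left.left.item has to be included in left.right, and so on. Every time you do A.union(B), you remove A's item by copying A completely (not a smart copy like the immutable set returned by the incl method) and then add it to B. But because removing A's item requires calling A.left.union(A.right), you first have to copy A.left / A.right, and so on. If you picture a tree, it is like pouring every left sibling into its right sibling, every time you only want to add a single item to other.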
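One way to see the repeated work in form B is to count how often union hands each element to incl. This is a sketch under assumptions, not the assignment's code: `IntSet`, `unionA`, `unionB`, `inclCount` and `record` are hypothetical names, and integers stand in for tweets.

```scala
import scala.collection.mutable

object InclCount {
  // element -> number of times a union passed it to incl; illustration only.
  val inclCount = mutable.Map.empty[Int, Int]

  private def record(x: Int): Unit =
    inclCount(x) = inclCount.getOrElse(x, 0) + 1

  abstract class IntSet {
    def incl(x: Int): IntSet
    def unionA(other: IntSet): IntSet
    def unionB(other: IntSet): IntSet
  }

  object Empty extends IntSet {
    def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
    def unionA(other: IntSet): IntSet = other
    def unionB(other: IntSet): IntSet = other
  }

  class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    def incl(x: Int): IntSet =
      if (x < elem) new NonEmpty(elem, left.incl(x), right)
      else if (elem < x) new NonEmpty(elem, left, right.incl(x))
      else this

    // Form A: elem goes into `other` right away.
    def unionA(other: IntSet): IntSet = {
      record(elem)
      left.unionA(right.unionA(other.incl(elem)))
    }

    // Form B: left is merged into right first, the merged set is then
    // merged into `other`, and elem is only included at the very end.
    def unionB(other: IntSet): IntSet = {
      record(elem)
      left.unionB(right).unionB(other).incl(elem)
    }
  }

  def build(xs: Seq[Int]): IntSet =
    xs.foldLeft(Empty: IntSet)((acc, x) => acc.incl(x))

  def main(args: Array[String]): Unit = {
    val t = build(Seq(4, 2, 6, 1, 3, 5, 7))
    inclCount.clear(); t.unionA(Empty)
    println(s"form A: ${inclCount.toList.sorted}")
    inclCount.clear(); t.unionB(Empty)
    println(s"form B: ${inclCount.toList.sorted}")
  }
}
```

With form A every element is included exactly once. With form B the elements of a left subtree are first moved into the right subtree and then moved again when the merged set is unioned with other, so at least some elements are included more than once.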

Some notes:

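If you can say that Empty.union(that) = that, you can also say that NonEmpty.union(that: TweetSet) = if that is empty then this else (((union ...) ..) other incl elem). This is the problem with methods and the Empty/NonEmpty pattern: you cannot gather the two base cases in a single method, and here many of us implement the first one in Empty but forget the other one in NonEmpty. Always make sure that, if A.f(B) is symmetric (= B.f(A)), you have implemented both base cases.

Identify the base case and go straight to it. Then build the recursion from it up to your global solution. For "left union right union other incl elem", the base case is other incl elem, because you do not want substitution to end in "Empty incl n1 incl n2 incl ...". So focus directly on it: (other incl elem).

Last, but more important: intuition! Use very simple cases. For example, if you have trouble explaining this one, imagine a "copy" method that you could write either as (left copy right) incl elem or as (left copy (right incl elem)). With such a simple example you can apply substitution more easily and quickly see why some solutions are worse than others!

Hope it helps! Let me know if you have comments!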
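The first note, about implementing both base cases, could look like the sketch below. It rests on assumptions: `isEmpty` is a hypothetical helper that the assignment's `TweetSet` does not declare, `IntSet` and `unionCalls` are illustrative names, and the recursive branch uses the faster ordering from the question.

```scala
object UnionBaseCases {
  // Counts invocations of union; for illustration only.
  var unionCalls = 0

  abstract class IntSet {
    def isEmpty: Boolean // hypothetical helper, not part of the assignment
    def incl(x: Int): IntSet
    def contains(x: Int): Boolean
    def union(that: IntSet): IntSet
  }

  object Empty extends IntSet {
    def isEmpty: Boolean = true
    def incl(x: Int): IntSet = new NonEmpty(x, Empty, Empty)
    def contains(x: Int): Boolean = false
    // First base case: Empty union that = that.
    def union(that: IntSet): IntSet = { unionCalls += 1; that }
  }

  class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
    def isEmpty: Boolean = false

    def incl(x: Int): IntSet =
      if (x < elem) new NonEmpty(elem, left.incl(x), right)
      else if (elem < x) new NonEmpty(elem, left, right.incl(x))
      else this

    def contains(x: Int): Boolean =
      if (x < elem) left.contains(x)
      else if (elem < x) right.contains(x)
      else true

    def union(that: IntSet): IntSet = {
      unionCalls += 1
      if (that.isEmpty) this // second base case: this union Empty = this
      else right.union(left.union(that)).incl(elem)
    }
  }

  def build(xs: Seq[Int]): IntSet =
    xs.foldLeft(Empty: IntSet)((acc, x) => acc.incl(x))

  def main(args: Array[String]): Unit = {
    val s = build(Seq(2, 1, 3))
    unionCalls = 0
    val r = s.union(Empty)
    println(s"same instance: ${r eq s}, union calls: $unionCalls")
  }
}
```

With the second base case in place, a union with an empty argument returns the receiver unchanged after a single call instead of rebuilding it element by element.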