Removing a mutable sum var

Time: 2016-08-25 20:08:41

Tags: scala entropy

The following entropy calculation is based on Jeff Atwood's answer to "How to calculate the entropy of a file?", which in turn is based on http://en.wikipedia.org/wiki/Entropy_(information_theory):

object MeasureEntropy extends App {

  val s = "measure measure here measure measure measure"

  def entropyValue(s: String) = {

    val m = s.split(" ").toList.groupBy((word: String) => word).mapValues(_.length.toDouble)
    var result: Double = 0.0;
    val len = s.split(" ").length;

    m map {
      case (key, value: Double) =>
        {
          var frequency: Double = value / len;
          result -= frequency * (scala.math.log(frequency) / scala.math.log(2));
        }
    }

    result;
  }

  println(entropyValue(s))
}
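
For reference, and assuming the code above is meant to compute the Shannon entropy of the word-frequency distribution described in the linked Wikipedia article, the quantity can be written as follows. The symbols n_w (occurrences of word w) and N (total number of words) are introduced here for illustration only and do not appear in the code.

```latex
H = -\sum_{w} p_w \log_2 p_w, \qquad p_w = \frac{n_w}{N}
```

For the sample string, "measure" occurs 5 times and "here" once among 6 words, so p = 5/6 and 1/6, which gives H of roughly 0.65 bits.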

I would like to improve this by removing the mutable state associated with:

var result: Double = 0.0;

How can I fold result into a single computation as part of the map call?

3 answers:

Answer 0: (score: 1)

Use foldLeft, or in this case /:, which is syntactic sugar for it:

(0d /: m) {case (result, (key,value)) => 
  val frequency = value / len
  result - frequency * (scala.math.log(frequency) / scala.math.log(2))
}

Documentation: http://www.scala-lang.org/files/archive/api/current/index.html#scala.collection.immutable.Map@/:[B](z:B)(op:(B,A)=>B):B

Answer 1: (score: 1)

A simple sum does the trick:

m.map {
  case (key, value: Double) =>
    val frequency = value / len
    -frequency * (scala.math.log(frequency) / scala.math.log(2))
}.sum

Answer 2: (score: 1)

It can be written using foldLeft as shown below.

  def entropyValue(s: String) = {
    val m = s.split(" ").toList.groupBy((word: String) => word).mapValues(_.length.toDouble)
    val len = s.split(" ").length
    m.foldLeft(0.0)((r, t) => r - ((t._2 / len) * (scala.math.log(t._2 / len) / scala.math.log(2))))
  }
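
For comparison, here is a minimal self-contained sketch that combines the ideas from the answers: no var, the input is split only once, and the per-word terms are added up with sum (a foldLeft starting from 0.0 would give the same total). The object name EntropySketch and the local log2 helper are illustrative choices, not part of the original question or answers. The newer /: syntax is avoided here because it is deprecated as of Scala 2.13.

```scala
object EntropySketch {
  // Base-2 logarithm, defined locally as a small helper (hypothetical name).
  private def log2(x: Double): Double = math.log(x) / math.log(2)

  // Shannon entropy, in bits, of the word-frequency distribution of `s`.
  def entropyValue(s: String): Double = {
    val words = s.split(" ")            // split once and reuse the result
    val len = words.length.toDouble
    words
      .groupBy(identity)                // word -> array of its occurrences
      .values                           // only the counts matter from here on
      .map { occurrences =>
        val frequency = occurrences.length / len
        -frequency * log2(frequency)    // this word's contribution
      }
      .sum                              // replaces the mutable accumulator
  }

  def main(args: Array[String]): Unit =
    println(entropyValue("measure measure here measure measure measure"))
}
```

Running main on the sample string from the question should print a value of roughly 0.65.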