mapValues on the innermost map of nested maps

Time: 2011-05-01 03:23:33

Tags: scala

This question came to me while I was trying to answer this one.

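Suppose you have a sequence of data (for example, rows read from a CSV file). groupBy can be used to analyze some aspect of the data by grouping on a column or a combination of columns. For example: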

// assuming seq: Seq[Array[String]], one Array[String] per row
val groups0: Map[String, Seq[Array[String]]] = 
  seq.groupBy(row => row(0) + "-" + row(4))

If I want to create subgroups within those groups, I can do:

// assuming each group holds rows of type Array[String]
val groups1: Map[String, Map[String, Seq[Array[String]]]] = 
  groups0.mapValues(rows => rows.groupBy(_(1)))

If I want to do this one more time, it gets really cumbersome:

val groups2 = 
  groups1.mapValues(groups => groups.mapValues(rows => rows.groupBy(_(2))))
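
To make the three grouping levels concrete, here is a minimal, self-contained sketch. The sample rows and the object name NestedGroupingExample are made up for illustration and are not part of the original question. It uses plain map over key/value pairs instead of mapValues so that it behaves the same on Scala versions where mapValues returns a lazy view.

```scala
object NestedGroupingExample {
  // Hypothetical rows standing in for parsed CSV lines (assumption, not from the post).
  val seq: Seq[Array[String]] = Seq(
    Array("a", "x", "p", "1", "k"),
    Array("a", "y", "q", "2", "k"),
    Array("a", "x", "q", "3", "k"),
    Array("b", "x", "p", "4", "m")
  )

  // Level 0: group rows by the combination of columns 0 and 4.
  val groups0: Map[String, Seq[Array[String]]] =
    seq.groupBy(row => row(0) + "-" + row(4))

  // Level 1: split every group into subgroups by column 1.
  val groups1: Map[String, Map[String, Seq[Array[String]]]] =
    groups0.map { case (k0, rows) => k0 -> rows.groupBy(_(1)) }

  // Level 2: split every subgroup again by column 2.
  // Each extra level needs one more nested map call, which is the pain point.
  val groups2: Map[String, Map[String, Map[String, Seq[Array[String]]]]] =
    groups1.map { case (k0, sub) =>
      k0 -> sub.map { case (k1, rows) => k1 -> rows.groupBy(_(2)) }
    }
}
```

Every additional level repeats the same traversal one layer deeper, which is what the question asks to abstract away.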

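So here is my question: given an arbitrary nesting Map[K0, Map[K1, ..., Map[Kn, V]]], how do you write a mapValues function that takes f: (V) => B, applies it to the innermost V, and returns a Map[K0, Map[K1, ..., Map[Kn, B]]]?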
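
For a fixed, known depth the desired function is easy to write by hand, which may help pin down what is being asked. The following is only a sketch; the names FixedDepth, mapValues2 and mapValues3 are hypothetical helpers, not library functions.

```scala
object FixedDepth {
  // Depth 2: apply f to the values of the inner maps.
  def mapValues2[K0, K1, V, B](m: Map[K0, Map[K1, V]])(f: V => B): Map[K0, Map[K1, B]] =
    m.map { case (k0, inner) => k0 -> inner.map { case (k1, v) => k1 -> f(v) } }

  // Depth 3: peel off one layer and delegate to the depth-2 helper.
  def mapValues3[K0, K1, K2, V, B](m: Map[K0, Map[K1, Map[K2, V]]])(f: V => B): Map[K0, Map[K1, Map[K2, B]]] =
    m.map { case (k0, inner) => k0 -> mapValues2(inner)(f) }
}
```

Each depth needs its own helper with its own signature; the question is how to get a single function that works for any n while keeping the precise result type.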

2 answers:

Answer 0: (score: 5)

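My first instinct was that handling arbitrary nesting in a type-safe way would be impossible, but it does seem possible if you define a few implicits that tell the compiler how to do it.
Essentially, the "simple" mapper tells it how to handle the plain, non-nested case, while "wrappedMapper" tells it how to drill down through one Map layer: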

  // trait to tell us how to map inside of a container.
  trait CanMapInner[WrappedV, WrappedB,V,B] {
    def mapInner(in: WrappedV, f: V => B): WrappedB
  }

  // simple base case (no nesting involved).
  implicit def getSimpleMapper[V,B] = new CanMapInner[V,B,V,B] {
    def mapInner(in: V, f: (V) => B): B = f(in)
  }

  // drill down one level of "Map".
  implicit def wrappedMapper[K,V,B,InnerV,InnerB]
    (implicit innerMapper: CanMapInner[InnerV,InnerB,V,B]) =
    new CanMapInner[Map[K,InnerV], Map[K,InnerB],V,B] {
      def mapInner(in: Map[K, InnerV], f: (V) => B): Map[K, InnerB] =
        in.mapValues(innerMapper.mapInner(_, f))
    }

  // the actual implementation.
  def deepMapValues[K,V,B,WrappedV,WrappedB](map: Map[K,WrappedV], f: V => B)
      (implicit mapper: CanMapInner[WrappedV,WrappedB,V,B]) = {
    map.mapValues(inner => mapper.mapInner(inner, f))
  }

  // testing with a simple map
  {
    val initMap = Map(1 -> "Hello", 2 -> "Goodbye")
    val newMap = deepMapValues(initMap, (s: String) => s.length)
    println(newMap) // Map(1 -> 5, 2 -> 7)
  }

  // testing with a nested map
  {
    val initMap = Map(1 -> Map("Hi" -> "Hello"), 2 -> Map("Bye" -> "Goodbye"))
    val newMap = deepMapValues(initMap, (s: String) => s.length)
    println(newMap) // Map(1 -> Map(Hi -> 5), 2 -> Map(Bye -> 7))
  }

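Of course, in real code a dynamic solution based on pattern matching is very tempting because of its simplicity. Type safety isn't everything :)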
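
The dynamic alternative mentioned above is not spelled out in this answer. One possible sketch, with the hypothetical names DynamicDeepMap and deepMapValuesDynamic, looks like the following. It assumes that the leaf values are never Maps themselves, and it gives up static types: everything comes back as Any.

```scala
object DynamicDeepMap {
  // Walks down through nested Maps at runtime and applies f to every
  // value that is not itself a Map. The cast is unchecked (type erasure),
  // so nothing here is verified by the compiler.
  def deepMapValuesDynamic(m: Map[_, _])(f: Any => Any): Map[Any, Any] =
    m.asInstanceOf[Map[Any, Any]].map {
      case (k, inner: Map[_, _]) => k -> deepMapValuesDynamic(inner)(f)
      case (k, leaf)             => k -> f(leaf)
    }
}
```

A call such as DynamicDeepMap.deepMapValuesDynamic(initMap) { case s: String => s.length } handles any nesting depth, but a leaf of an unexpected type only fails at runtime with a MatchError, which is the trade-off against the implicit-based version.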

Answer 1: (score: 3)

I'm sure there is a better way using Manifest, but pattern matching seems to be able to tell Seq and Map apart, so here it is:

object Foo {
  def mapValues[A <: Map[_, _], C, D](map: A)(f: C => D): Map[_, _] = map.mapValues {
    case seq: Seq[C] => seq.groupBy(f)
    case innerMap: Map[_, _] => mapValues(innerMap)(f)
  }
}

scala> val group0 = List("fooo", "bar", "foo") groupBy (_(0))
group0: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map((f,List(fooo, foo)), (b,List(bar)))

scala> val group1 = Foo.mapValues(group0)((x: String) => x(1))
group1: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> List(fooo, foo))), (b,Map(a -> List(bar))))

scala> val group2 = Foo.mapValues(group1)((x: String) => x(2))
group2: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> Map(o -> List(fooo, foo)))), (b,Map(a -> Map(r -> List(bar)))))

Edit: Here is a typed version using higher-kinded types.

trait NestedMapValue[Z] {
  type Next[X] <: NestedMapValue[Z]
  def nextValues[D](f: Z => D): Next[D]
}

trait NestedMap[Z, A, B <: NestedMapValue[Z]] extends NestedMapValue[Z] { self =>
  type Next[D] = NestedMap[Z, A, B#Next[D]]

  val map: Map[A, B]
  def nextValues[D](f: Z => D): Next[D] = self.mapValues(f)

  def mapValues[D](f: Z => D): NestedMap[Z, A, B#Next[D]] = new NestedMap[Z, A, B#Next[D]] { val map = self.map.mapValues {
    case x: B => x.nextValues[D](f)
  }}

  override def toString = "NestedMap(%s)" format (map.toString)
}

trait Bottom[A] extends NestedMapValue[A] {
  type Next[D] = NestedMap[A, D, Bottom[A]]

  val seq: Seq[A]
  def nextValues[D](f: A => D): Next[D] = seq match {
    case seq: Seq[A] => groupBy[D](f)
  }

  def groupBy[D](f: A => D): Next[D] = seq match {
    case seq: Seq[A] => 
      new NestedMap[A, D, Bottom[A]] { val map = seq.groupBy(f).map { case (key, value) => (key, new Bottom[A] { val seq = value })} }  
  }

  override def toString = "Bottom(%s)" format (seq.toString) 
}

object Bottom {
  def apply[A](aSeq: Seq[A]) = new Bottom[A] { val seq = aSeq }
}

scala> val group0 = Bottom(List("fooo", "bar", "foo")).groupBy(x => x(0))
group0: NestedMap[java.lang.String,Char,Bottom[java.lang.String]] = NestedMap(Map(f -> Bottom(List(fooo, foo)), b -> Bottom(List(bar))))

scala> val group1 = group0.mapValues(x => x(1))
group1: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]] = NestedMap(Map(f -> NestedMap(Map(o -> Bottom(List(fooo, foo)))), b -> NestedMap(Map(a -> Bottom(List(bar))))))

scala> val group2 = group1.mapValues(x => x.size)
group2: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]#Next[Int]] = NestedMap(Map(f -> NestedMap(Map(o -> NestedMap(Map(4 -> Bottom(List(fooo)), 3 -> Bottom(List(foo)))))), b -> NestedMap(Map(a -> NestedMap(Map(3 -> Bottom(List(bar))))))))