Merging two nested maps in Scala

Asked: 2015-12-21 10:52:03

Tags: scala parsing hashmap

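I have a nested map of the form key -> Map(key1 -> Map(), key2 -> Map()), which represents the path structure of a particular HTTP request.

root/twiki/bin/edit/Main/Double_bounce_sender
root/twiki/bin/rdiff/TWiki/NewUserTemplate

I store these paths in a map of maps, which gives me the hierarchy of the paths. Using a parser, I read the data from the server logs, extract the parts I need, and then index them into a sorted map.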

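import scala.collection.immutable.SortedMap
// Split each request on "?", split every part on "/", drop segments containing "=",
// and keep the remaining segments in positional order.
val mainList: RDD[List[String]] = requesturl flatMap { r =>
  r.toString.split("\\?") map { x =>
    parser(x.split("/").filter(s => !s.contains("=")).toList).valuesIterator.toList
  }
}

// Index each segment by its position, then replace index 0 with "root".
def parser(list: List[String]): Map[Int, String] = {
  val m = list.zipWithIndex.map(_.swap).toMap
  val sM = SortedMap(m.toSeq: _*)
  sM + (0 -> "root")
}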
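To make the indexing step concrete, here is a minimal sketch of the same parsing applied to a single URL, without Spark. The object name ParserSketch and the helper segments are hypothetical, and unlike the code above the sketch only looks at the part before the "?" as a simplification.

```scala
import scala.collection.immutable.SortedMap

object ParserSketch {
  // Same idea as parser above: index each segment by position, then
  // overwrite index 0 (the empty string before the leading "/") with "root".
  def parser(list: List[String]): Map[Int, String] = {
    val indexed = list.zipWithIndex.map(_.swap).toMap
    SortedMap(indexed.toSeq: _*).updated(0, "root")
  }

  // Hypothetical helper: keep only the part before "?", split it on "/",
  // drop segments containing "=", and return the segments in index order.
  def segments(url: String): List[String] = {
    val path = url.split("\\?").head
    parser(path.split("/").filter(s => !s.contains("=")).toList).valuesIterator.toList
  }
}
```

Under these assumptions, ParserSketch.segments("/twiki/bin/edit/Main/Double_bounce_sender") yields the list root, twiki, bin, edit, Main, Double_bounce_sender.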

Once the data is in the required shape, I iterate over the whole collection and arrange it into a path tree that looks like this:

root
  - twiki
    - bin
      - edit
        - Main
          - Double_bounce_sender
      - rdiff
        - TWiki
          - NewUserTemplate
      - oops
        - etc
          - local
            - getInterface

type innerMap = mutable.HashMap[String, Any]

def getData(input: RDD[List[String]]): mutable.HashMap[String, innerMap] = {
  var mainMap = new mutable.HashMap[String, innerMap]
  for (x <- input) {
    val z: mutable.HashMap[String, innerMap] = storeData(x.toIterator, mainMap, x(0).toString)
    mainMap = mainMap ++ z
  }
  mainMap
}

def storeData(list: Iterator[String], map: mutable.HashMap[String, innerMap], root: String): mutable.HashMap[String, innerMap] = {
  list.hasNext match {
    case true =>
      val v = list.next()
      val y = map contains (root) match {
        case true =>
          println("Adding when exists: " + v)
          val childMap = map.get(v).get match {
            case _: HashMap[String, Any] => asInstanceOf[mutable.HashMap[String, innerMap]]
            case _ => new mutable.HashMap[String, innerMap]
          }
          val x = map + (v -> storeData(list, childMap, v))
          x
        case false =>
          val x = map + (v -> storeData(list, new mutable.HashMap[String, innerMap], v))
          x
      }
      y.asInstanceOf[mutable.HashMap[String, innerMap]]
    case false =>
      new mutable.HashMap[String, innerMap]
  }
}

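The getData method takes each input list and passes it to the storeData method, which builds the map.

I am stuck in two places.

  • The mainMap (HashMap[String, innerMap]) that is passed recursively to storeData arrives as a new, empty map every time.
  • The second problem is that I am trying to find a way to merge two nested maps of arbitrary depth, for example the two maps below.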

Map(root -> Map(twiki -> Map(bin -> Map(edit -> Map(Main -> Map(Double -> Map()))))))
Map(root -> Map(twiki -> Map(bin -> Map(rdiff -> Map(TWiki -> Map(NewUser -> Map()))))))

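I am looking for suggestions on how to implement this, so that I end up with a single map containing every path that occurs in the server log file.

2 Answers:

Answer 0: (score: 2)

To merge the two maps you can use scalaz and its |+| operator (semigroup append), which is available after import scalaz._, Scalaz._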

@ Map("root" ->
    Map("twiki" ->
      Map("bin" ->
        Map("rdiff" ->
          Map("TWiki" ->
            Map("NewUser" ->
              Map.empty[String, String]))))))
res2: Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, String]]]]]]] =
  Map("root" ->
    Map("twiki" ->
      Map("bin" ->
        Map("rdiff" ->
          Map("TWiki" ->
            Map("NewUser" -> Map()))))))

@ Map("root" ->
    Map("twiki" ->
      Map("bin" ->
        Map("edit" ->
          Map("Main" ->
            Map("Double" ->  Map.empty[String, String]))))))
res3: Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, String]]]]]]] =
  Map("root" ->
    Map("twiki" ->
      Map("bin" ->
        Map("edit" ->
          Map("Main" ->
            Map("Double" -> Map()))))))

@ res2 |+| res3
res4: Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, Map[String, String]]]]]]] =
  Map("root" ->
    Map("twiki" ->
      Map("bin" ->
        Map(
          "edit" ->
            Map("Main" ->
              Map("Double" -> Map())),
          "rdiff" ->
            Map("TWiki" ->
              Map("NewUser" -> Map()))))))
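
For readers who want to see why |+| reaches every nesting level, the following is a rough sketch of the idea using a hand-rolled type class. It is not scalaz's actual implementation: the names SemigroupSketch, Combine and combine are hypothetical stand-ins, and only String and Map instances are provided.

```scala
object SemigroupSketch {
  // Hypothetical minimal type class standing in for a semigroup.
  trait Combine[A] { def combine(x: A, y: A): A }

  object Combine {
    // Leaf values: strings are combined by concatenation.
    implicit val stringCombine: Combine[String] = new Combine[String] {
      def combine(x: String, y: String): String = x + y
    }

    // Maps: take the union of the keys; values under a shared key are
    // combined with the value type's own instance, so the merge recurses
    // through as many levels as the static type has.
    implicit def mapCombine[K, V](implicit ev: Combine[V]): Combine[Map[K, V]] =
      new Combine[Map[K, V]] {
        def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
          y.foldLeft(x) { case (acc, (k, v)) =>
            acc.updated(k, acc.get(k).fold(v)(ev.combine(_, v)))
          }
      }
  }

  def combine[A](x: A, y: A)(implicit ev: Combine[A]): A = ev.combine(x, y)

  // Shortened versions of the two paths from the question.
  val edit  = Map("root" -> Map("bin" -> Map("edit" -> Map.empty[String, String])))
  val rdiff = Map("root" -> Map("bin" -> Map("rdiff" -> Map.empty[String, String])))
  val merged = combine(edit, rdiff)
}
```

Here merged holds a single root and bin with both edit and rdiff below it. As with the scalaz version, the nesting depth is fixed by the static type, so both maps need the same type.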

Answer 1: (score: 0)

Maybe something like this?

scala>   type Node = Map[String, Any];
defined type alias Node

scala>   def merge( me : Node, you : Node ) : Node = {
     |     val keySet = me.keySet ++ you.keySet;
     |     def nodeForKey( parent : Node, key : String ) : Node = parent.getOrElse( key, Map.empty ).asInstanceOf[Node]
     |     keySet.map( key => (key -> merge( nodeForKey( me, key ), nodeForKey( you, key ) ) ) ).toMap
     |   }
merge: (me: Node, you: Node)Node

scala> val path1 = Map( "root" -> Map("bin" -> Map("sh" -> Map.empty) ) )
path1: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,scala.collection.immutable.Map[Nothing,Nothing]]]] = Map(root -> Map(bin -> Map(sh -> Map())))

scala> val path2 = Map( "root" -> Map( "bin" -> Map("csh" -> Map.empty), "usr" -> Map.empty ) )
path2: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,scala.collection.immutable.Map[_ <: String, scala.collection.immutable.Map[Nothing,Nothing]]]] = Map(root -> Map(bin -> Map(csh -> Map()), usr -> Map()))

scala> merge( path1, path2 )
res8: Node = Map(root -> Map(bin -> Map(sh -> Map(), csh -> Map()), usr -> Map()))
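
A typed variant of the same idea might avoid Any and asInstanceOf altogether. The sketch below is only one possible shape: the names PathTree, Tree, insert, merge and fromPaths are hypothetical and do not come from either answer. Because insert reuses an existing child instead of starting from a fresh empty map, it also touches on the first problem in the question.

```scala
object PathTree {
  // Hypothetical typed replacement for Map[String, Any]: a node only
  // holds its children, so no casts are needed.
  final case class Tree(children: Map[String, Tree] = Map.empty) {

    // Insert one path, reusing the existing child when there is one.
    def insert(path: List[String]): Tree = path match {
      case Nil => this
      case head :: tail =>
        val child = children.getOrElse(head, Tree())
        Tree(children.updated(head, child.insert(tail)))
    }

    // Merge two trees of any depth: union of keys, recursing on shared keys.
    def merge(other: Tree): Tree =
      Tree(other.children.foldLeft(children) { case (acc, (key, sub)) =>
        acc.updated(key, acc.get(key).fold(sub)(_.merge(sub)))
      })
  }

  // Fold every path into one tree, starting from an empty root.
  def fromPaths(paths: Seq[List[String]]): Tree =
    paths.foldLeft(Tree())(_ insert _)
}
```

With an RDD[List[String]] as in the question, insert and merge could plausibly serve as the seqOp and combOp of RDD.aggregate, with Tree() as the zero value; that pairing is a suggestion, not something tested against the original data.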