Scala: groupBy based on a boolean applied to the next element

Time: 2015-10-03 19:27:14

Tags: scala scala-collections scalaz

If I have a list that looks like this:

List("abdera.apache.org lists:", "commits", "dev", "user",
"accumulo.apache.org lists:", "commits", "dev", "notifications", "user")

I would like to end up with:

Map("abdera.apache.org lists:" -> Seq("commits", "dev", "user"), 
"accumulo.apache.org lists:" -> Seq("commits", "dev", "notifications", "user"))

How can I do this?

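I have been experimenting with groupBy, but I am not sure how to apply a predicate to first pick out the key (i.e. string.contains("lists:")) and then use the same predicate to test whether each following element does not contain "lists:", so that it is added as a value.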

3 Answers:

Answer 0: (score: 2)

Assuming the structure of your list is

List(key, item, item, item, 
     key, item ..., item, 
     key, item, ...)

you can use foldLeft to build such a map:

val list = List("abdera.apache.org lists:", "commits", "dev", "user",
  "accumulo.apache.org lists:", "commits", "dev", "notifications", "user")

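val map: Map[String, List[String]] =
  list.foldLeft(List.empty[(String, List[String])]) {

    case (acc, curr) if curr.endsWith("lists:") =>
      // identified a list key: start a new (key, items) pair
      curr -> List.empty[String] :: acc

    case ((headListKey, headList) :: tail, curr) =>
      // prepend the current string to the items of the most recent key
      (headListKey, curr :: headList) :: tail

    case (Nil, _) =>
      // items that appear before the first key have no key to attach to; skip them
      Nil

    // items were prepended, so reverse each list to restore the original order;
    // a plain map is used because mapValues returns a lazy view on Scala 2.13+
  }.map { case (key, items) => key -> items.reverse }.toMap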

If the key strings do not always end the same way, you may want to use a regular expression to identify the key strings in the list.
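
As a minimal sketch of that regular-expression idea: the pattern below and the names RegexKeys, KeyPattern, isKey and group are illustrative assumptions, not part of the answer above. The pattern treats any string ending in "list:" or "lists:" (optionally followed by whitespace) as a key, and would need adjusting to the real data.

```scala
object RegexKeys {
  // Hypothetical key pattern: the whole string must end in "list:" or "lists:",
  // optionally followed by trailing whitespace.
  private val KeyPattern = """.*\blists?:\s*""".r

  // True when the entire string matches the key pattern.
  def isKey(s: String): Boolean = KeyPattern.pattern.matcher(s).matches()

  // Same fold as in the answer, with the endsWith test swapped for isKey.
  def group(items: List[String]): Map[String, List[String]] =
    items.foldLeft(List.empty[(String, List[String])]) {
      case (acc, curr) if isKey(curr) =>
        (curr -> List.empty[String]) :: acc   // start a new group
      case ((key, values) :: tail, curr) =>
        (key, curr :: values) :: tail         // prepend to the current group
      case (Nil, _) =>
        Nil                                   // ignore items before the first key
    }.map { case (k, vs) => k -> vs.reverse }.toMap
}
```

Calling RegexKeys.group(list) on the sample list should give the same Map as the endsWith version, since both sample keys end in "lists:".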

Answer 1: (score: 1)

Using multiSpan as defined in https://stackoverflow.com/a/21803339/3189923, given

val xs = List("abdera.apache.org lists:", "commits", "dev", "user",
              "accumulo.apache.org lists:", "commits", "dev",
                                            "notifications", "user")

we have that

xs.multiSpan(_.contains("lists:"))

gives a list of lists:

List(List(abdera.apache.org lists:, commits, dev, user),
     List(accumulo.apache.org lists:, commits, dev, notifications, user))

We can convert the resulting nested list into the desired Map, for example as follows:

xs.multiSpan(_.contains("lists:")).map( ys => ys.head -> ys.tail ).toMap
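
Since multiSpan is not part of the standard library, here is a rough sketch of what such a helper might look like. This is a hypothetical stand-in written for illustration, not the code from the linked answer; the names MultiSpanSketch, MultiSpanOps and toKeyedMap are assumptions. It starts a new sublist at every element that satisfies the predicate.

```scala
object MultiSpanSketch {
  // Hypothetical extension method; the linked answer may define multiSpan differently.
  implicit class MultiSpanOps[A](xs: List[A]) {
    // Splits xs into consecutive sublists, opening a new one at each
    // element for which isStart holds.
    def multiSpan(isStart: A => Boolean): List[List[A]] =
      xs.foldLeft(List.empty[List[A]]) {
        case (current :: done, x) if !isStart(x) =>
          (x :: current) :: done   // extend the group that is currently open
        case (acc, x) =>
          List(x) :: acc           // open a new group (also covers an empty accumulator)
      }.reverse.map(_.reverse)     // groups and their items were built back to front
  }

  // The conversion from the answer: head of each group is the key, tail the values.
  def toKeyedMap(xs: List[String]): Map[String, List[String]] =
    xs.multiSpan(_.contains("lists:")).map(ys => ys.head -> ys.tail).toMap
}
```

With import MultiSpanSketch._ in scope, xs.multiSpan(_.contains("lists:")) can be called as in the answer. Every group holds at least one element, so ys.head is safe here.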

Answer 2: (score: 0)

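Again assuming the structure is always as above, and additionally that every key is followed by exactly three items (grouped(4) cuts the list into fixed chunks of four, so a group of any other size, such as the second group of this sample, is split in the wrong place):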

val list = List("abdera.apache.org lists:", "commits", "dev", "user",
  "accumulo.apache.org lists:", "commits", "dev", "notifications", "user")

Map(list.grouped(4).map(l => (l.head -> l.tail)).toList : _*)

If you insist on a Seq, you can use l.tail.toSeq instead.
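
Because grouped works on a fixed chunk size, it may help to check how it behaves on both uniform and non-uniform input. The sketch below is illustrative only; GroupedCheck, byFixedSize and the uniform list are made-up names and data, while sample is the list from the question.

```scala
object GroupedCheck {
  // Chunk the list into fixed-size groups; first element is the key, the rest the values.
  def byFixedSize(xs: List[String], size: Int): Map[String, List[String]] =
    xs.grouped(size).map(l => l.head -> l.tail).toMap

  // Made-up input where every key is followed by exactly three items.
  val uniform = List("a lists:", "x", "y", "z", "b lists:", "p", "q", "r")

  // The list from the question: three items after the first key, four after the second.
  val sample = List("abdera.apache.org lists:", "commits", "dev", "user",
    "accumulo.apache.org lists:", "commits", "dev", "notifications", "user")

  // byFixedSize(uniform, 4) ==
  //   Map("a lists:" -> List("x", "y", "z"), "b lists:" -> List("p", "q", "r"))
  // byFixedSize(sample, 4) ==
  //   Map("abdera.apache.org lists:" -> List("commits", "dev", "user"),
  //       "accumulo.apache.org lists:" -> List("commits", "dev", "notifications"),
  //       "user" -> List())
}
```

On the question's list the trailing "user" ends up as a key of its own, so for groups of varying length a predicate-based split (such as the foldLeft or multiSpan approaches in the other answers) is the safer choice.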