How to find occurrences of IDs in a Scala string

Time: 2019-07-19 15:34:50

Tags: string algorithm scala grouping

Hi everyone, I'm new to Scala and I've run into this problem.

Given the following string:

The ID 5d27e5d082c272591e25b8d5 is the only valid Field, 
The ID 5d27e5d06a77457139395318 is the only valid Field,
The ID 5d27e5d0431e726aeb5ab84f is the only valid Field,
The ID 5d27e5d282c27256cc24b6a2 is the only valid Field,
The ID 5d27e5d282c27256cc24b6a2 is the only valid Field,
The ID 5d27e5d282c2727ad524c567 is the only valid Field,
The ID 5d27e5d2431e724af25a1bd6 is the only valid Field,
The ID 5d27e5d36a774507723a7ea2 is the only valid Field, 
The ID 5d27e5d36a774507723a7ea2 is the only valid Field, 
The ID 5d27e5d482c2727ad524c576 is the only valid Field, 
The ID 5d27e5d482c272591e25b8ee is the only valid Field, 
The ID 5d27e5d482c2727ad524c576 is the only valid Field, 
The ID 5d27e5d482c2727ad524c576 is the only valid Field

I have a set of IDs that went through a validation process.

How can I group these IDs and build a new string like this:

The id 5d27e5d282c27256cc24b6a2 has 4 errors
The id 5d27e5d482c2727ad524c576 has 2 errors
....

I tried the following, but I think there is a better way to achieve this:

val replaced = input.replaceAll("The ID","").replaceAll("is the only valid Field","").trim.split(",").map(_.trim).groupBy(l => l).map(t => (t._1, t._2.length))

var newMessage = ""
replaced.foreach(s => {
  newMessage += s"The ID ${s._1} the only valid field on ${s._2.toString} rows, "
})

Thanks!

4 answers:

Answer 0: (score: 1)

Here is a simple solution:

(for {
  line <- all // each element of the list
  _ :: _ :: id :: _ = line.split(" ").toList // split the line so you have the 'words'
} yield id) // return the ids
  .groupBy(identity) // group it
  .map { case (id, list) => s"The id $id has ${list.size} errors" } // return the new Strings

The pattern _ :: _ :: id :: _ matches against the list of words. The elements are separated by ::, and the last _ refers to the rest of the list. _ is used for the parts you don't need.
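For a self-contained version of the same idea, here is a minimal sketch. It assumes `all` is obtained by splitting the input on newlines (the answer does not show this step), and it uses `collect` so that lines with fewer than three words are skipped rather than failing the pattern match. The names `ListPatternSketch` and `countIds` are hypothetical, chosen only for illustration.

```scala
object ListPatternSketch {
  // Hypothetical helper: counts how often each ID occurs in the input.
  def countIds(input: String): Map[String, Int] = {
    // Assumption: one message per line; blank lines are dropped.
    val all: List[String] =
      input.split("\n").toList.map(_.trim).filter(_.nonEmpty)

    all
      .map(_.split(" ").toList)                  // split each line into words
      .collect { case _ :: _ :: id :: _ => id }  // third word is the ID; shorter lines are skipped
      .groupBy(identity)                         // group equal IDs together
      .map { case (id, occurrences) => id -> occurrences.size }
  }

  def main(args: Array[String]): Unit = {
    val sample =
      """The ID a1 is the only valid Field,
        |The ID b2 is the only valid Field,
        |The ID a1 is the only valid Field""".stripMargin

    countIds(sample).foreach { case (id, n) => println(s"The id $id has $n errors") }
  }
}
```

Note that this relies on the ID always being the third space-separated word, which holds for the sample input but is otherwise an assumption.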


Answer 1: (score: 1)

Here is a rough outline of one approach.

str.split("\n")
   .groupBy(s => "ID ([^ ]+)".r.findFirstMatchIn(s).fold("none")(_.group(1)))
   .map{case (k,v) => s"ID $k has ${v.length} hits"}
   .mkString("\n")
//res0: String =
//ID 5d27e5d282c27256cc24b6a2 has 2 hits
//ID 5d27e5d06a77457139395318 has 1 hits
//ID 5d27e5d482c2727ad524c576 has 3 hits
//ID 5d27e5d082c272591e25b8d5 has 1 hits
//ID 5d27e5d482c272591e25b8ee has 1 hits
//ID 5d27e5d0431e726aeb5ab84f has 1 hits
//ID 5d27e5d2431e724af25a1bd6 has 1 hits
//ID 5d27e5d282c2727ad524c567 has 1 hits
//ID 5d27e5d36a774507723a7ea2 has 2 hits

Answer 2: (score: 1)

Here is an alternative that requires Scala 2.13 or later, because it uses the new String Interpolator Extractor
(where input is a string containing the sample input).

def getId(line: String): String = line match {
  case s"${_}The ID ${id} is the only valid Field${_}"=> id
}

val lines = input.split("\n")

val idsGrouped = 
  lines
    .filter(line => line.trim.nonEmpty)
    .groupBy(getId)
    .map {
      case (id, group) => id -> group.size
    }

val newMessage = idsGrouped.map {
  case (id, count) => s"The id ${id} has ${count} errors"
}.mkString("\n")

println(newMessage)
  

The id 5d27e5d282c27256cc24b6a2 has 2 errors
The id 5d27e5d06a77457139395318 has 1 errors
The id 5d27e5d482c2727ad524c576 has 3 errors
The id 5d27e5d082c272591e25b8d5 has 1 errors
The id 5d27e5d482c272591e25b8ee has 1 errors
The id 5d27e5d0431e726aeb5ab84f has 1 errors
The id 5d27e5d2431e724af25a1bd6 has 1 errors
The id 5d27e5d282c2727ad524c567 has 1 errors
The id 5d27e5d36a774507723a7ea2 has 2 errors


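Note that if you try this in the REPL at the time of writing, you will get an exception. This is a known bug that has already been fixed.
In compiled code, however, it runs without problems.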
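One caveat: a `match` with a single `case` raises a `MatchError` on any line that does not fit the pattern. A minimal sketch of a total variant, assuming Scala 2.13 or later, could return an `Option` instead. The names `SafeInterpolatorSketch`, `getIdOption` and `countIds` are hypothetical and not part of the answer above.

```scala
object SafeInterpolatorSketch {
  // Hypothetical total variant of getId: None instead of a MatchError.
  def getIdOption(line: String): Option[String] = line match {
    case s"${_}The ID ${id} is the only valid Field${_}" => Some(id)
    case _                                               => None
  }

  // Counts the IDs, silently skipping blank or unrelated lines.
  def countIds(input: String): Map[String, Int] =
    input
      .split("\n")
      .toList
      .flatMap(line => getIdOption(line)) // keeps only lines that carry an ID
      .groupBy(identity)
      .map { case (id, group) => id -> group.size }
}
```

With this shape the explicit filter on empty lines is no longer needed, since such lines simply yield `None`.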

Answer 3: (score: 1)

This is somewhat similar to the other answers, but with safe pattern matching:

  val LineRegEx = "The ID (.+) is the only valid Field,?".r

  val output = 
    input
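      .split('\n')
      .map(_.trim) // a regex extractor must match the whole line, so strip stray surrounding whitespace first
      .collect {
        case LineRegEx(id) => id
      }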
      .groupBy(identity)
      .map { case (id, rows) => 
        s"The ID $id the only valid field on ${rows.length} rows"
      }
      .mkString("\n")
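On Scala 2.13 or later, the group-then-count step can also be expressed with `groupMapReduce`, which avoids building the intermediate groups. The sketch below is one possible variant, not the answer author's code: it trims each line (on the assumption that the input may carry trailing spaces) and sorts the result so the output order is deterministic. The names `GroupMapReduceSketch` and `countIds` are hypothetical.

```scala
object GroupMapReduceSketch {
  private val LineRegEx = "The ID (.+) is the only valid Field,?".r

  // Hypothetical helper: (id, count) pairs, most frequent first, ties broken by id.
  def countIds(input: String): List[(String, Int)] =
    input
      .split('\n')
      .toList
      .map(_.trim)                              // assumption: lines may have trailing whitespace
      .collect { case LineRegEx(id) => id }     // non-matching lines are dropped
      .groupMapReduce(identity)(_ => 1)(_ + _)  // id -> number of occurrences
      .toList
      .sortBy { case (id, count) => (-count, id) }

  def main(args: Array[String]): Unit = {
    val sample =
      "The ID b2 is the only valid Field, \nThe ID a1 is the only valid Field,\nThe ID b2 is the only valid Field"

    println(
      countIds(sample)
        .map { case (id, count) => s"The ID $id the only valid field on $count rows" }
        .mkString("\n")
    )
  }
}
```

Sorting is optional; it is included here only because iteration order of the resulting `Map` is not guaranteed.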