How do I exclude strings from a parsed text file using Scala?

Asked: 2018-01-26 17:34:50

Tags: scala

The sample text file looks like this:

Date:  Nov 12, 2004
Support_Addresses:  Support@microsoft.com, suport@yahoo.com, 
                    google@gmail.com, 
                    support@comcast.net
Notes:  Need to renew support contracts for software and services. 

The expected output is:

Nov 12, 2004
Support@microsoft.com, suport@yahoo.com, google@gmail.com, support@comcast.net 
Need to renew support contracts for software and services. 

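Basically, I need to strip the field headers from the lines, so that things like "Date:", "Support_Addresses:" and "Notes:" are removed from each line before it is saved to a CSV file. I have this code from another project: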

val support_agreements = lines
  .dropWhile(line => !line.startsWith("Support_Addresses: "))
  .takeWhile(line => !line.startsWith("Notes: "))
  .flatMap(_.split(","))
  .map(_.trim())
  .filter(_.nonEmpty)
  .mkString(", ")

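But it doesn't remove the field headers/names. I use startsWith, but the result still includes the field name. How can I exclude the field names from the lines?

2 answers:

Answer 0 (score: 1)

This should do it: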

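// linesIterator instead of lines: on JDK 11+ text.lines resolves to
// java.lang.String.lines, which returns a Java Stream without mkString.
text.linesIterator.map { line =>
  line.indexOf(':') match {
    // Colon found after at least one character: keep only what follows it.
    case x if x > 0 =>
      line.substring(x + 1).trim
    // No field name on this line (e.g. a continuation line): just trim it.
    case _ => line.trim
  }
}.mkString("\n")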

It iterates over the lines and, when a line contains a colon, takes the substring after it.
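For reference, here is that approach as a self-contained sketch run against the sample file from the question. The names `StripFieldNames`, `stripFieldName`, `sample` and `cleaned` are made up for illustration, and the input is assumed to be available as a single string.

```scala
object StripFieldNames {
  // Hypothetical helper: drops everything up to and including the first ':'.
  // Lines without a colon (continuation lines) are only trimmed.
  def stripFieldName(line: String): String =
    line.indexOf(':') match {
      case i if i > 0 => line.substring(i + 1).trim
      case _          => line.trim
    }

  // The sample file from the question, inlined so the sketch runs on its own.
  val sample: String =
    """Date:  Nov 12, 2004
      |Support_Addresses:  Support@microsoft.com, suport@yahoo.com,
      |                    google@gmail.com,
      |                    support@comcast.net
      |Notes:  Need to renew support contracts for software and services.""".stripMargin

  // One cleaned string per input line.
  val cleaned: List[String] = sample.linesIterator.map(stripFieldName).toList

  def main(args: Array[String]): Unit = cleaned.foreach(println)
}
```

Note that this works line by line, so the three address lines stay on three separate output lines rather than being joined into one as in the expected output. A continuation line that happens to contain a colon would also be cut at that colon.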

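Answer 1 (score: 1)

Here is what I came up with. It builds a map, m, of the data that you can work with efficiently, and then prints it in the form you want.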

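// Split a raw value on commas, trimming each piece and dropping empty ones.
def processValue(s: String): List[String] =
  s.split(",").toList.map(_.trim).filterNot(_.isEmpty)

// Fold over the lines, collecting (field name, values) pairs, newest first.
val retros = lines.foldLeft(List.empty[(String, List[String])]) {
  case (acc, l) =>
    l.indexOf(':') match {
      case -1 =>
        // No colon: a continuation line, so append to the most recent field.
        acc match {
          case Nil => acc // no field seen yet, nothing to attach the line to
          case h :: t => (h._1, h._2 ++ processValue(l)) :: t
        }
      case n =>
        // Colon found: the text before it is the field name, the rest its value.
        val key = l.substring(0, n).trim
        val value = processValue(l.substring(n+1))
        (key, value) :: acc
    }
}

// Restore file order before building the map. A plain Map does not guarantee
// iteration order in general; use a ListMap if the order matters.
val m = retros.reverse.toMap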

m.values.map(_.mkString(", ")).foreach(println)
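
To try this end to end, here is a minimal runnable sketch of the same fold applied to the sample file from the question. The wrapper names `FieldMap`, `parse`, `sample` and `output` are hypothetical, and it swaps the plain `Map` for a `ListMap`, on the assumption that the fields should come out in file order however many there are.

```scala
import scala.collection.immutable.ListMap

object FieldMap {
  // Split a raw value on commas, trimming pieces and dropping empty ones.
  def processValue(s: String): List[String] =
    s.split(",").toList.map(_.trim).filterNot(_.isEmpty)

  // Hypothetical wrapper around the fold: field name -> values, in file order.
  def parse(lines: Iterator[String]): ListMap[String, List[String]] = {
    val retros = lines.foldLeft(List.empty[(String, List[String])]) { (acc, l) =>
      l.indexOf(':') match {
        case -1 =>
          // Continuation line: append its values to the most recent field.
          acc match {
            case Nil         => acc
            case (k, v) :: t => (k, v ++ processValue(l)) :: t
          }
        case n =>
          // New field: name before the colon, values after it.
          (l.substring(0, n).trim, processValue(l.substring(n + 1))) :: acc
      }
    }
    ListMap(retros.reverse: _*)
  }

  // The sample file from the question, inlined so the sketch runs on its own.
  val sample: String =
    """Date:  Nov 12, 2004
      |Support_Addresses:  Support@microsoft.com, suport@yahoo.com,
      |                    google@gmail.com,
      |                    support@comcast.net
      |Notes:  Need to renew support contracts for software and services.""".stripMargin

  // One output line per field, with the field name removed.
  val output: List[String] =
    parse(sample.linesIterator).values.map(_.mkString(", ")).toList

  def main(args: Array[String]): Unit = output.foreach(println)
}
```

Because every value is split on commas and rejoined with ", ", the date survives only because it already uses that spacing. A free-text field with irregular comma spacing would be normalized, so a real CSV export may want to treat such fields separately.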