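How to parse a file, split it as needed, and store it in a list in Scala

Time: 2015-02-12 10:47:39

Tags: arrays list file scala

I'm a beginner in Scala. What I'm trying to do right now is parse a file whose content looks something like this: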

  You can't legislate morality... but it seems that morons can be  legislators.

Lottery: A tax on people who are bad at math.

These are questions for action, not speculation, which is idle.
-- Noam Chomsky

If you think education is expensive try Ignorance.
-- Derek Bok, president of Harvard

Photons have neither morals nor visas.
-- Dave Farber

Maturity is not a factor of the games we play but the occasions we play them!

Design a system an idiot can use and only an idiot will want to use it.

It is much more rewarding to do more with less.
-- Donald Knuth

This is what I have so far:

import scala.io._

object parseFile {
  var sample: Array[String] = new Array[String](20)
  var anyName = List[Map[String, String]]()

  def main(args: Array[String]): Unit = {
    println("Hello, Scala !! ")
    for (line <- Source.fromFile("myFile.txt").getLines()) {
      //sample = line.split("--")
      anyName = Map("quote" -> line) :: anyName
    }
    println(anyName)
  }
}

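This puts every line into the list as a separate "quote" entry, including the author names, which sit on their own lines. What I want instead is a separate "author" entry in the list for each line that starts with "--", with that prefix split off.

Basically, I want to separate the quotes from the authors and save both in a list.

Thanks in advance.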

2 Answers:

Answer 0: (score: 0)

I don't think you need a list of maps; a list of (quote, ...) and (author, ...) tuples is probably enough.

This can be achieved as follows:

import scala.io._

object Quotes {

    def main(args:Array[String]):Unit = {

       val result = Source.fromFile("myFile.txt")     // read file
                          .getLines()                 // line by line
                          .map(_.trim)                // trim spaces on ends
                          .filter(! _.isEmpty)        // ignore empty lines
                          .map { line =>              

              if (line.startsWith("--")) 
                 "author" -> line.drop(2).trim
              else 
                 "quote" -> line
       }

       // result is an iterator of (String, String) tuple.

       println (result.mkString("\n"))

       // if you want a list of such tuples: result.toList
    }
}

/* output looks like this:
(quote,You can't legislate morality... but it seems that morons can be  legislators.)
(quote,Lottery: A tax on people who are bad at math.)
(quote,These are questions for action, not speculation, which is idle.)
(author,Noam Chomsky)
(quote,If you think education is expensive try Ignorance.)
(author,Derek Bok, president of Harvard)
(quote,Photons have neither morals nor visas.)
(author,Dave Farber)
(quote,Maturity is not a factor of the games we play but the occasions we play them!)
(quote,Design a system an idiot can use and only an idiot will want to use it.)
(quote,It is much more rewarding to do more with less.)
(author,Donald Knuth)
*/
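
The tuples above keep quotes and authors as separate entries. If each quote should carry its own author instead, the same trimmed lines can be folded into records. The following is only a sketch under that assumption: the names QuoteRecords, Quote and parse are hypothetical and do not appear in the original answer, and a "--" line is attached to the quote immediately before it.

```scala
object QuoteRecords {
  // Hypothetical record type: a quote with an optional author.
  final case class Quote(text: String, author: Option[String])

  // Fold trimmed, non-empty lines into records, attaching each "--" line
  // to the quote directly before it.
  def parse(lines: Iterator[String]): List[Quote] =
    lines
      .map(_.trim)
      .filter(_.nonEmpty)
      .foldLeft(List.empty[Quote]) {
        case (Quote(text, None) :: rest, line) if line.startsWith("--") =>
          Quote(text, Some(line.drop(2).trim)) :: rest
        case (acc, line) if line.startsWith("--") =>
          acc // no unattributed quote to attach to: ignored in this sketch
        case (acc, line) =>
          Quote(line, None) :: acc
      }
      .reverse // records were prepended, so restore file order

  def main(args: Array[String]): Unit = {
    val sample = List(
      "Lottery: A tax on people who are bad at math.",
      "",
      "Photons have neither morals nor visas.",
      "-- Dave Farber")
    parse(sample.iterator).foreach(println)
  }
}
```

With a real file, `scala.io.Source.fromFile("myFile.txt").getLines()` could be passed to `parse` in place of `sample.iterator`. Using `Option` for the author makes unattributed quotes explicit rather than encoding them as an empty string.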

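Answer 1: (score: 0)

Use multiSpan to split a collection at multiple points (here, the lines of the text file as a List). Note that multiSpan is not part of the Scala standard library, so it has to be defined separately. What is needed is a criterion applied to each line that distinguishes quotes from blank lines and author lines: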

def p(line: String) = {
  val tline = line.trim
  tline.nonEmpty && !tline.startsWith("--")
}

so that

// multiSpan is not a standard List method; a helper providing it must be in scope
val lines = io.Source.fromFile("myFile.txt")
                     .getLines()
                     .toList
                     .multiSpan(p)
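
Since multiSpan is not a method on the standard List, the snippet above only compiles with a helper in scope. Here is a minimal sketch of one possible helper, assuming it should start a new group at every element that satisfies the predicate. The names MultiSpanSketch, MultiSpanOps and startsGroup are hypothetical, and the answer's actual helper may differ in detail.

```scala
import scala.annotation.tailrec

object MultiSpanSketch {
  // Assumed helper: adds multiSpan to List via an implicit class.
  implicit class MultiSpanOps[A](xs: List[A]) {
    // Starts a new group at every element for which startsGroup holds.
    // The first element always opens the first group.
    def multiSpan(startsGroup: A => Boolean): List[List[A]] = {
      @tailrec
      def loop(rest: List[A], acc: List[List[A]]): List[List[A]] = rest match {
        case Nil => acc.reverse
        case head :: tail =>
          // Collect everything up to the next group-starting element.
          val (group, remaining) = tail.span(x => !startsGroup(x))
          loop(remaining, (head :: group) :: acc)
      }
      loop(xs, Nil)
    }
  }

  // The criterion from the answer: a quote is a non-empty line
  // that does not start with "--".
  def p(line: String): Boolean = {
    val tline = line.trim
    tline.nonEmpty && !tline.startsWith("--")
  }

  def main(args: Array[String]): Unit = {
    val sample = List(
      "Lottery: A tax on people who are bad at math.",
      "",
      "It is much more rewarding to do more with less.",
      "-- Donald Knuth")
    sample.multiSpan(p).foreach(println)
  }
}
```

With `import MultiSpanSketch._` in scope, the `multiSpan(p)` call on the file's lines resolves to this helper. Each resulting group begins with a quote and is followed by whatever blank or author lines come before the next quote.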

which yields a List[List[String]] containing quotes with an unknown author (empty second item), for example

List("Lottery: A tax on people who are bad at math.", ""), 

and quotes with an attributed author, for example

List("It is much more rewarding to do more with less.", "-- Donald Knuth")

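Note that the condition fits the (guessed) format of the text file, but p can be adjusted for other formats.

To get a Map from quotes to authors, consider this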

(for ( List(a,b,_*) <- lines ) yield a -> b).toMap
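
The for-comprehension keeps the author string exactly as it appears in the file, including the leading "--". As a rough sketch of one variation, assuming the prefix should be removed and that groups have the shape shown earlier, the pairs can be cleaned up while building the Map. The names QuoteAuthorMap and toQuoteAuthorMap are hypothetical.

```scala
object QuoteAuthorMap {
  // Builds quote -> author from groups whose first two items are the quote
  // and the line after it. Groups with fewer than two items are skipped.
  def toQuoteAuthorMap(groups: List[List[String]]): Map[String, String] =
    groups.collect {
      case quote :: attribution :: _ =>
        // An empty author string marks a quote with no attribution.
        quote.trim -> attribution.trim.stripPrefix("--").trim
    }.toMap

  def main(args: Array[String]): Unit = {
    val groups = List(
      List("Lottery: A tax on people who are bad at math.", ""),
      List("It is much more rewarding to do more with less.", "-- Donald Knuth"))
    toQuoteAuthorMap(groups).foreach { case (q, a) => println(s"$q => $a") }
  }
}
```

Because the quote text is the key, two identical quotes collapse into a single entry. If that matters, a List of pairs is the safer result type.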