How to read a CSV file into a Map of key-value pairs

Time: 2018-03-20 15:48:19

Tags: scala

I have data in a CSV file, for example:

value,key
A,Name
B,Name
C,Name
24,Age
25,Age
20,Age
M,Gender
F,Gender

I want to parse it to produce the following Map:

Map(Name -> List(A, B, C), Age -> List(24,25,20), Gender -> List(M,F))

5 answers:

Answer 0 (score: 2)

Here is one possibility:

import scala.io.Source

Source.fromFile("my/path")
  .getLines()
  .drop(1) // Drop the header (first line)
  .map(_.split(",")) // Split by ",": Array(A, Name), Array(B, Name), Array(C, Name), ...
  .toList // Iterator has no groupBy, so materialize the rows first
  .groupBy(_(1)) // Group by key (second column): Map(Age -> List(Array(24, Age), Array(25, Age), Array(20, Age)), ...
  .map { case (key, values) => (key, values.map(_(0))) } // Final format: Map(Age -> List(24, 25, 20), ...

This gives:

Map(Age -> List(24, 25, 20), Name -> List(A, B, C), Gender -> List(M, F))
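The snippet above never closes the Source it opens. A minimal sketch of the same groupBy pipeline wrapped in `scala.util.Using` follows; it assumes Scala 2.13 or later, and the helper name `readGrouped` and the temporary sample file are made up for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.Source
import scala.util.{Try, Using}

// Hypothetical helper: reads "value,key" rows from `path` and groups values by key.
// Using closes the Source once the block finishes, even if it throws.
def readGrouped(path: Path): Try[Map[String, List[String]]] =
  Using(Source.fromFile(path.toFile, "UTF-8")) { source =>
    source.getLines()
      .drop(1)                    // skip the header row
      .map(_.split(","))
      .toList                     // materialize before the Source is closed
      .groupBy(_(1))              // second column is the key
      .map { case (key, rows) => key -> rows.map(_(0)) }
  }

// Usage with a temporary file holding a few rows from the question
val tmp = Files.createTempFile("sample", ".csv")
Files.write(tmp, "value,key\nA,Name\nB,Name\n24,Age\n".getBytes(StandardCharsets.UTF_8))
val result = readGrouped(tmp)
Files.delete(tmp)
```

Returning a `Try` leaves it to the caller to decide what to do when the file is missing or unreadable.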

Answer 1 (score: 1)

A more functional approach:

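import scala.io.Source

Source.fromFile("file.csv").getLines().drop(1).foldLeft(Map.empty[String, List[String]]) {
    (acc, line) =>
      // Each line is expected to be exactly "value,key"; anything else throws a MatchError
      val value :: key :: Nil = line.split(",").toList
      acc + (key -> (acc.getOrElse(key, List.empty) :+ value))
  }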

This gives:

Map(Name -> List(A, B, C), Age -> List(24, 25, 20), Gender -> List(M, F))
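The destructuring pattern in the fold throws a `MatchError` on any line that does not split into exactly two fields, for example a blank trailing line. Here is a sketch of a more forgiving variant that skips such lines; the helper name `foldLines` and the inline sample data are hypothetical.

```scala
// Hypothetical helper: folds "value,key" lines into key -> values,
// ignoring any line that does not split into exactly two fields.
def foldLines(lines: Iterator[String]): Map[String, List[String]] =
  lines.foldLeft(Map.empty[String, List[String]]) { (acc, line) =>
    line.split(",").toList match {
      case value :: key :: Nil =>
        acc + (key -> (acc.getOrElse(key, List.empty[String]) :+ value))
      case _ =>
        acc // blank or malformed line: leave the accumulator unchanged
    }
  }

// The empty string and "oops" are skipped
val folded = foldLines(Iterator("A,Name", "", "B,Name", "oops", "24,Age"))
```

Whether silently skipping bad lines is acceptable depends on the data; collecting them for reporting would be a reasonable alternative.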

Answer 2 (score: 0)

This code will produce the desired output:

import scala.io.Source

Source.fromFile("C:\\src\\data.txt").getLines()
            .drop(1).map(_.split(",").toList) // gives each list like this -- List(A, Name)
            .map(x => (x.tail.head -> x.head)).toList // swap key and value places  -- (Name,A)
            .groupBy(_._1) // group by key -- (Age,List((Age,24), (Age,25), (Age,20)))
            .map(x => x._1 -> x._2.map(v => v._2)).toMap // extracting only values part -- Map(Age -> List(24, 25, 20), Name -> List(A, B, C), Gender -> List(M, F))
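On Scala 2.13 or later (an assumption, since the thread does not state a version), the group-then-extract steps can be collapsed into a single `groupMap` call. The sketch below uses in-memory rows from the question instead of a file.

```scala
// Assumes Scala 2.13 or later, where groupMap is available on collections.
val rows = List("A,Name", "B,Name", "C,Name", "24,Age", "25,Age", "20,Age", "M,Gender", "F,Gender")

val byKey: Map[String, List[String]] =
  rows
    .map(_.split(","))
    .groupMap(_(1))(_(0)) // group on the key column, keep only the value column
```

The first argument list picks the grouping key and the second maps each row to the element stored in the group, so no intermediate list of pairs is needed.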

Answer 3 (score: 0)

If you would rather not iterate over the dataset multiple times, here is a single-pass solution:

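import scala.collection.mutable
import scala.io.Source

val m = mutable.Map[String, List[String]]().withDefaultValue(List.empty)

Source.fromFile("my/path")
    .getLines()
    .drop(1) // skip the header
    .map(_.split(","))
    // append (rather than prepend) so each list keeps the file order
    .foreach { x => m.put(x(1), m(x(1)) :+ x(0)) }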
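If the order in which keys first appear also matters (Name, Age, Gender in the question), a single pass over a `LinkedHashMap` of `ListBuffer`s is one option. This is only a sketch: the helper name `groupInOrder` is hypothetical, and it returns a list of pairs rather than a `Map` so that the key order stays visible.

```scala
import scala.collection.mutable

// Hypothetical helper: one pass over the lines, keeping both the order in which
// keys first appear and the order of values within each key.
def groupInOrder(lines: Iterator[String]): List[(String, List[String])] = {
  val acc = mutable.LinkedHashMap.empty[String, mutable.ListBuffer[String]]
  lines.foreach { line =>
    val fields = line.split(",")
    // getOrElseUpdate creates the buffer the first time a key is seen
    acc.getOrElseUpdate(fields(1), mutable.ListBuffer.empty[String]) += fields(0)
  }
  acc.iterator.map { case (key, values) => key -> values.toList }.toList
}

val ordered = groupInOrder(Iterator("A,Name", "B,Name", "24,Age", "M,Gender", "25,Age"))
```

Appending to a `ListBuffer` is constant time, which avoids the repeated list copying that `:+` on an immutable `List` incurs.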

Answer 4 (score: 0)

Playing around in the REPL:

scala> val doc = """A,Name
 | B,Name
 | C,Name
 | 24,Age
 | 25,Age
 | 20,Age
 | M,Gender
 | F,Gender""".stripMargin
doc: String =
A,Name
B,Name
C,Name
24,Age
25,Age
20,Age
M,Gender
F,Gender

scala> doc.split("\\n")
res0: Array[String] = Array(A,Name, B,Name, C,Name, 24,Age, 25,Age, 20,Age, M,Gender, F,Gender)

scala> res0.toList.map{ x => val line = x.split(","); line(1) -> line(0)}
res1: List[(String, String)] = List((Name,A), (Name,B), (Name,C), (Age,24), (Age,25), (Age,20), (Gender,M), (Gender,F))

scala> res1.groupBy(e => e._1)
res4: scala.collection.immutable.Map[String,List[(String, String)]] = Map(Age -> List((Age,24), (Age,25), (Age,20)), Name -> List((Name,A), (Name,B), (Name,C)), Gender -> List((Gender,M), (Gender,F)))

scala> res4.mapValues{x => x.map{case (k,v) => v}} 
res6: scala.collection.immutable.Map[String,List[String]] = Map(Age -> List(24, 25, 20), Name -> List(A, B, C), Gender -> List(M, F))
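One caveat for newer compilers: on Scala 2.13, calling `mapValues` directly on a `Map` is deprecated, and the lazy view form is preferred. A small sketch of the last step written that way, assuming Scala 2.13 or later and using a shortened list of pairs:

```scala
// Assumes Scala 2.13 or later.
val pairs = List("Name" -> "A", "Name" -> "B", "Age" -> "24")

val strictMap: Map[String, List[String]] =
  pairs
    .groupBy(_._1)            // Map(Name -> List((Name,A), (Name,B)), Age -> List((Age,24)))
    .view
    .mapValues(_.map(_._2))   // lazy MapView: drop the key from each pair
    .toMap                    // force it into a strict immutable Map
```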