Reading a CSV file into an ordered Map

Time: 2017-06-12 16:38:02

Tags: scala

I am reading a CSV file and adding its data to a Map in Scala.

import java.io.{BufferedReader, File, FileInputStream, InputStreamReader}
import scala.collection.JavaConverters._
import org.apache.commons.csv.{CSVFormat, CSVParser}

// fileName and delimiter are defined elsewhere
val br = new BufferedReader(new InputStreamReader(new FileInputStream(new File(fileName)), "UTF-8"))
val inputFormat = CSVFormat.newFormat(delimiter.charAt(0)).withHeader().withQuote('"')

val csvRecords = new CSVParser(br, inputFormat).getRecords.asScala
// toMap returns a plain java.util.Map; its iteration order need not match the column order
val buffer = for (csvRecord <- csvRecords) yield csvRecord.toMap.asScala
buffer.toList

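But because the Map is not ordered, I cannot read the columns in order. Is there a way to read csvRecords in order?
The CSV file contains comma-separated values along with a header. The output should be of type List[mutable.LinkedHashMap[String, String]].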

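The code above works, but it does not preserve the column order. For example, if the CSV file has its columns in the order fname, lname, the output map has lname first and fname last.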
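To illustrate the behavior described above, here is a minimal sketch using only the Scala standard library. The object name InsertionOrderDemo and the sample values are made up for illustration. It shows that a LinkedHashMap iterates in insertion order, while a hash-based map makes no such promise.

```scala
import scala.collection.mutable

// Hypothetical demo object; names and sample data are illustrative only.
object InsertionOrderDemo {
  // Column/value pairs in the order they appear in the CSV header
  val columns: Seq[(String, String)] =
    Seq("fname" -> "John", "lname" -> "Doe", "grade" -> "A")

  // LinkedHashMap remembers insertion order, so keys come back as fname, lname, grade
  val linked: mutable.LinkedHashMap[String, String] =
    mutable.LinkedHashMap(columns: _*)

  // HashMap holds the same entries, but its iteration order is unspecified
  val hashed: mutable.HashMap[String, String] =
    mutable.HashMap(columns: _*)

  def main(args: Array[String]): Unit = {
    println(linked.keys.mkString(","))
    println(hashed.keys.mkString(",")) // may or may not match the header order
  }
}
```

Both maps hold the same entries; only the iteration order differs, which is why the fix is to choose an insertion-ordered map rather than to change how the file is parsed.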

1 Answer:

Answer 0: (score: 1)

If I understand your question correctly, here is one way to create a list of LinkedHashMaps whose elements are in column order:

// Assuming your CSV File has the following content:
fname,lname,grade
John,Doe,A
Ann,Cole,B
David,Jones,C
Mike,Duke,D
Jenn,Rivers,E

import collection.mutable.LinkedHashMap

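// Read the header line and pair each column name with its position.
// Note: split(",") assumes a comma delimiter and no quoted header fields.
val source = scala.io.Source.fromFile("/path/to/csvfile")
val indexedHeader =
  try source.getLines().next().split(",").zipWithIndex
  finally source.close()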

indexedHeader: Array[(String, Int)] = Array((fname,0), (lname,1), (grade,2))

// Build one LinkedHashMap per record, inserting the columns in header order
val ListOfLHM = for (csvRecord <- csvRecords) yield
  indexedHeader.foldLeft(LinkedHashMap[String, String]()) {
    case (acc, (name, idx)) => acc += (name -> csvRecord.get(idx))
  }

ListOfLHM: scala.collection.mutable.Buffer[scala.collection.mutable.LinkedHashMap[String,String]] = ArrayBuffer(
  Map(fname -> John, lname -> Doe, grade -> A),
  Map(fname -> Ann, lname -> Cole, grade -> B),
  Map(fname -> David, lname -> Jones, grade -> C),
  Map(fname -> Mike, lname -> Duke, grade -> D),
  Map(fname -> Jenn, lname -> Rivers, grade -> E)
)
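
As a possible variation on the approach above, the column positions can also be taken from the parser itself instead of re-reading and splitting the first line of the file. The following is only a sketch under stated assumptions: the object OrderedCsvSketch and the method readOrdered are hypothetical names, the input is passed as a string for self-containment, and CSVFormat.DEFAULT.withHeader() stands in for whatever format the real file needs. It relies on CSVParser.getHeaderMap, which maps each column name to its index, and sorts by that index so the result does not depend on the iteration order of the header map.

```scala
import java.io.StringReader
import scala.collection.JavaConverters._
import scala.collection.mutable.LinkedHashMap
import org.apache.commons.csv.{CSVFormat, CSVParser}

// Hypothetical helper; not part of the original answer.
object OrderedCsvSketch {
  // Parses CSV text whose first line is the header and returns one
  // insertion-ordered map per record, with keys in header order.
  def readOrdered(csvText: String): List[LinkedHashMap[String, String]] = {
    val parser = new CSVParser(new StringReader(csvText), CSVFormat.DEFAULT.withHeader())
    try {
      // (column name, column index) pairs sorted by position in the header line
      val columns = parser.getHeaderMap.asScala.toSeq.sortBy(_._2.intValue)
      parser.getRecords.asScala.toList.map { record =>
        val row = LinkedHashMap.empty[String, String]
        for ((name, idx) <- columns) row += (name -> record.get(idx.intValue))
        row
      }
    } finally parser.close()
  }
}
```

Called with the sample content shown earlier, readOrdered would return maps whose keys iterate as fname, lname, grade. Sorting by index keeps the sketch independent of how a particular Commons CSV version orders its header map.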