I am following the code from Pascal Bugnion's book Scala for Data Science. The first class represents a transaction:
case class Transaction(
id:Option[Int], // unique identifier
candidate:String, // candidate receiving the donation
contributor:String, // name of the contributor
contributorState:String, // contributor state
contributorOccupation:Option[String], // contributor job
amount:Long, // amount in cents
date:Date // date of the donation
)
defined class Transaction
Then I loaded the data with the help of the FECData singleton object:
scala> val ohioData = FECData.loadOhio
ohioData: FECData = FECData@7e83a375
The FECData object has a transactions attribute:
scala> val ohioTransactions = ohioData.transactions
ohioTransactions: Iterator[Transaction] = non-empty iterator
When I try to print the first 5 transactions, I get an exception:
scala> ohioTransactions.take(5).foreach(println)
java.text.ParseException: Unparseable date: "06-DEC-11"
at java.text.DateFormat.parse(DateFormat.java:366)
at FECData$$anonfun$1.apply(FECData.scala:26)
at FECData$$anonfun$1.apply(FECData.scala:16)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:370)
Let's look at the first 5 lines of the csv file:
candidate_id,candidate,contributor_name,contributor_state,contributor_occupation,amount,date
P80000748,"Paul, Ron","BROWN, TODD W MR.",OH,ENGINEER,50.0,06-DEC-11
P80000748,"Paul, Ron","DIEHL, MARGO SONJA",OH,RETIRED,25.0,06-DEC-11
P80000748,"Paul, Ron","KIRCHMEYER, BENJAMIN",OH,COMPUTER PROGRAMMER,201.2,06-DEC-11
P80003338,"Obama, Barack","KEYES, STEPHEN",OH,HR EXECUTIVE / ATTORNEY,100.0,30-SEP-11
P80003338,"Obama, Barack","MURPHY, MIKE W",OH,MANAGER,50.0,26-SEP-11
Why does this happen?
Answer 0 (score: 3)
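OK, the problem is that in FECData, dateParser is defined as new SimpleDateFormat("DD-MMM-YY").
According to https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#SimpleDateFormat(java.lang.String), this constructor builds a SimpleDateFormat using the given pattern and the default date format symbols for the default locale.
The problem is that your (JVM's) default locale is not Locale.ENGLISH, so the DEC part of "06-DEC-11" is not recognised as a month name and the parse fails.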
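One way to check this on your own machine is to print the JVM's default locale and the short month names it supplies. The following is a minimal sketch that uses only java.util.Locale and java.text.DateFormatSymbols. The object name LocaleCheck and its helper methods are hypothetical and are not part of the book's code, and the exact-match test is only a rough approximation of how SimpleDateFormat matches month text.

```scala
import java.text.DateFormatSymbols
import java.util.Locale

// Hypothetical helper, not part of FECData: inspects locale month names.
object LocaleCheck {
  // Short month names supplied by `locale` (the trailing empty slot is dropped).
  def shortMonths(locale: Locale): Seq[String] =
    new DateFormatSymbols(locale).getShortMonths.toSeq.filter(_.nonEmpty)

  // True if some short month name of `locale` equals `token`, ignoring case.
  def knowsMonth(locale: Locale, token: String): Boolean =
    shortMonths(locale).exists(_.equalsIgnoreCase(token))

  def main(args: Array[String]): Unit = {
    val default = Locale.getDefault
    println(s"default locale: $default")
    println(s"short months:   ${shortMonths(default).mkString(", ")}")
    println(s"recognises DEC: ${knowsMonth(default, "DEC")}")
  }
}
```

If the last line prints false, the default locale's month abbreviations differ from the English ones used in the csv file.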
You just need to patch FECData: replace
private val dateParser = new SimpleDateFormat("DD-MMM-YY")
with
private val dateParser = new SimpleDateFormat("DD-MMM-YY", java.util.Locale.ENGLISH)
Ref.
https://docs.oracle.com/javase/7/docs/api/java/util/Locale.html
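
To see the effect of the locale argument in isolation, here is a minimal sketch that parses the same string with an explicit English locale and with an explicit non-English one. The object name LocaleDateSketch, its methods, and the choice of Locale.GERMAN as the non-English example are assumptions for illustration; the asker's actual default locale is not stated. The sketch also uses the pattern "dd-MMM-yy" rather than the "DD-MMM-YY" quoted above, because in SimpleDateFormat an uppercase D means day in year and an uppercase Y means week year; that is a deliberate departure from the book's code, not a description of it.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale}
import scala.util.Try

// Hypothetical stand-alone sketch, not the book's FECData object.
object LocaleDateSketch {
  // Assumed pattern: day of month, abbreviated month name, two-digit year.
  private val pattern = "dd-MMM-yy"

  // Parser pinned to English month names, independent of the JVM default locale.
  def englishParser: SimpleDateFormat = new SimpleDateFormat(pattern, Locale.ENGLISH)

  // Parser that takes its month names from the given locale.
  def parserFor(locale: Locale): SimpleDateFormat = new SimpleDateFormat(pattern, locale)

  // Wraps parsing in Try so a ParseException becomes a Failure value.
  def tryParse(parser: SimpleDateFormat, raw: String): Try[Date] = Try(parser.parse(raw))

  def main(args: Array[String]): Unit = {
    val raw = "06-DEC-11"
    println(tryParse(englishParser, raw))            // English month names
    println(tryParse(parserFor(Locale.GERMAN), raw)) // German month names
  }
}
```

A new SimpleDateFormat is created on each call because instances are not thread-safe; for a single-threaded loader a single private val, as in the patch above, works just as well.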