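I'm trying to read a JSON file in order to compute some metrics in Scala. I managed to read the file and do some top-level filtering, but I'm having trouble understanding how to filter nested lists and maps.
Here is the sample code (the real JSON is much longer):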
val rawData = """[
  {
    "technology": "C",
    "users": [
      {
        "rating": 5,
        "completed": false,
        "user": {
          "id": 11111,
          "paid": true
        }
      },
      {
        "rating": 4,
        "completed": false,
        "user": {
          "id": 22222,
          "paid": false
        }
      }
    ],
    "title": "CS50"
  },
  {
    "technology": "C++",
    "users": [
      {
        "rating": 3,
        "completed": true,
        "user": {
          "id": 33333,
          "paid": false
        }
      },
      {
        "rating": 5,
        "completed": true,
        "user": {
          "id": 44444,
          "paid": false
        }
      }
    ],
    "title": "Introduction to C++"
  },
  {
    "technology": "Haskell",
    "users": [
      {
        "rating": 5,
        "completed": false,
        "user": {
          "id": 55555,
          "paid": false
        }
      },
      {
        "rating": null,
        "completed": true,
        "user": {
          "id": 66666,
          "paid": false
        }
      }
    ],
    "title": "Course on Haskell"
  }
]"""
// Strip indentation and blank lines so the JSON becomes a single line
val data = rawData.split("\n").map(_.trim).filter(_.nonEmpty).mkString("")
I managed to get a list containing the 3 titles:
import scala.util.parsing.json._
val parsedData = JSON.parseFull(data)
val listTitles = parsedData.get.asInstanceOf[List[Map[String, Any]]].map( { case e: Map[String, Any] => e("title").toString } )
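Here are my three questions:
- Is this a good way to get the list of the 3 titles?
- How can I get a list with the number of paid users for each of these 3 titles?
- How can I get a list with the number of users who completed the course for each of these 3 titles?
Thanks in advance for your help.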
Answer 0 (score: 1)
As another answer suggests, you should use the play-json library. It is powerful and covers object mapping, parsing, and error handling.
import play.api.libs.json._

// id is a JSON number in the data, so it is modelled as Int rather than String
case class User(id: Int, paid: Boolean)
object User {
  implicit val format: OFormat[User] = Json.format[User]
}
// rating can be null in the data, hence Option[Int]
case class UserCourseStat(rating: Option[Int], completed: Boolean, user: User)
object UserCourseStat {
  implicit val format: OFormat[UserCourseStat] = Json.format[UserCourseStat]
}
case class Data(technology: String, title: String, users: List[UserCourseStat])
object Data {
  implicit val format: OFormat[Data] = Json.format[Data]
}
val jsString = """[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]"""
val rowData: JsValue = Json.parse(jsString)
rowData.validate[List[Data]] match {
  case JsSuccess(dataList, _) =>
    val chosenTitles = List("Course on Haskell", "Introduction to C++", "CS50")
    // map of each chosen title to the set of its users
    val chosenTitleToUsersMap = chosenTitles.map { title =>
      title -> dataList.filter(_.title == title)
        .flatMap(_.users.map(_.user))
        .toSet
    }.toMap
    // map of each chosen title to the set of its paid users
    val chosenTitleToPaidUsersMap = chosenTitleToUsersMap.map { case (title, users) =>
      title -> users.filter(_.paid)
    }
    // users who have completed every one of the chosen titles
    val allUsers = dataList.flatMap(_.users.map(_.user)).toSet
    val usersWhoCompletedAllChosenTitles = allUsers.filter { user =>
      chosenTitles.forall { title =>
        // the user must appear under this title with completed == true
        dataList.exists(d => d.title == title && d.users.exists(s => s.completed && s.user == user))
      }
    }
  case JsError(errors) =>
    // handle the error case
    ???
}
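Regarding your 3 questions:
- Is this a good way to get the list of the 3 titles?
I see two unsafe operations there: asInstanceOf and e("title"). The latter is unsafe because it does not use Map's .get(key) method, so it throws an exception if the key is not found.
- How can I get a list with the number of paid users for each of these 3 titles?
This is computed above in the value named chosenTitleToPaidUsersMap; the size of each set is the count.
- How can I get a list with the number of users who completed the course for each of these 3 titles?
This is computed above in the value named usersWhoCompletedAllChosenTitles.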
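The two unsafe operations mentioned for the first question can also be avoided without any library. The following is only a minimal sketch: SafeTitles and titles are hypothetical names, and the input is assumed to have the Option[Any] shape that JSON.parseFull returns. It pattern-matches on the untyped value instead of casting, and scans the map entries instead of calling e("title"), so a missing key or an unexpected shape simply yields no title.

```scala
object SafeTitles {
  // Hypothetical helper, not part of any library:
  // collects every "title" entry whose value is a String.
  def titles(parsed: Option[Any]): List[String] = parsed match {
    case Some(courses: List[_]) =>
      courses.collect { case course: Map[_, _] =>
        // Scan the entries rather than calling course("title"),
        // so an absent key gives None instead of an exception
        course.collectFirst { case (key, title: String) if key == "title" => title }
      }.flatten
    case _ =>
      Nil // parse failure, or the top level is not an array
  }
}
```

With the question's code this would be called as SafeTitles.titles(parsedData). Elements that are not JSON objects are skipped silently, which may or may not be what you want for real data.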
Answer 1 (score: 0)
You can use the play-json library to parse the JSON and retrieve the fields you need. For example:
import play.api.libs.json.Json
val rawData1 = Json.parse("""[{"technology":"C","users":[{"rating":5,"completed":false,"user":{"id":11111,"paid":true}},{"rating":4,"completed":false,"user":{"id":22222,"paid":false}}],"title":"CS50"},{"technology":"C++","users":[{"rating":3,"completed":true,"user":{"id":33333,"paid":false}},{"rating":5,"completed":true,"user":{"id":44444,"paid":false}}],"title":"Introduction to C++"},{"technology":"Haskell","users":[{"rating":5,"completed":false,"user":{"id":55555,"paid":false}},{"rating":null,"completed":true,"user":{"id":66666,"paid":false}}],"title":"Course on Haskell"}]""")
val resultedList = (rawData1 \\ "title").toList.map(_.as[String])
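The same lookup operators can reach the nested fields too. Below is a minimal sketch, assuming play-json is on the classpath. PaidCounts, paidPerTitle and sampleJson are hypothetical names, and sampleJson is a shortened stand-in for the data in the question. Note that as throws if a field is missing or has the wrong type; asOpt or validate are the safer variants.

```scala
import play.api.libs.json._

object PaidCounts {
  // Shortened, hypothetical stand-in for the JSON in the question
  val sampleJson: String =
    """[{"title":"CS50","users":[{"completed":false,"user":{"id":11111,"paid":true}},{"completed":false,"user":{"id":22222,"paid":false}}]}]"""

  // For each course, pair its title with the number of users whose nested "paid" flag is true
  def paidPerTitle(json: String): Seq[(String, Int)] =
    Json.parse(json).as[Seq[JsValue]].map { course =>
      val title = (course \ "title").as[String]
      val paid = (course \ "users").as[Seq[JsValue]]
        .count(u => (u \ "user" \ "paid").as[Boolean])
      title -> paid
    }
}
```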
Answer 2 (score: 0)
I suggest using the json4s library. It lets you extract the data into case classes:
import org.json4s._
import org.json4s.jackson.JsonMethods.parseOpt
implicit val formats: Formats = DefaultFormats
case class Tech(technology: String, users: Seq[TechUser], title: String)
case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
case class UserInfo(id: Int, paid: Boolean)
val rawData = """..."""
val Some(json) = parseOpt(rawData)
val Some(data) = json.extractOpt[List[Tech]]
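Once this is done, data is a regular Scala data structure that you can manipulate however you like. For example, to find the title of the first course that has a user whose ID is divisible by 5, you can do this: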
data.find(_.users.exists(_.user.id % 5 == 0)).map(_.title)
// Result: Some("Course on Haskell")
The answers to your three questions look much the same, one line each, but I'll leave them to you as an exercise.
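As a hint at the shape such one-liners might take, here is a minimal sketch that uses hand-built values in place of parsed JSON, so it runs without json4s. The case classes mirror the ones above. CourseMetrics, sample, paidPerTitle and completedPerTitle are hypothetical names chosen for illustration.

```scala
object CourseMetrics {
  case class UserInfo(id: Int, paid: Boolean)
  case class TechUser(rating: Option[Int], completed: Boolean, user: UserInfo)
  case class Tech(technology: String, users: Seq[TechUser], title: String)

  // Hand-built stand-in for the result of extracting the question's sample JSON
  val sample: List[Tech] = List(
    Tech("C", Seq(
      TechUser(Some(5), completed = false, UserInfo(11111, paid = true)),
      TechUser(Some(4), completed = false, UserInfo(22222, paid = false))), "CS50"),
    Tech("C++", Seq(
      TechUser(Some(3), completed = true, UserInfo(33333, paid = false)),
      TechUser(Some(5), completed = true, UserInfo(44444, paid = false))), "Introduction to C++"),
    Tech("Haskell", Seq(
      TechUser(Some(5), completed = false, UserInfo(55555, paid = false)),
      TechUser(None, completed = true, UserInfo(66666, paid = false))), "Course on Haskell")
  )

  // Number of paid users per title, in input order
  def paidPerTitle(data: Seq[Tech]): Seq[(String, Int)] =
    data.map(t => t.title -> t.users.count(_.user.paid))

  // Number of users who completed the course, per title
  def completedPerTitle(data: Seq[Tech]): Seq[(String, Int)] =
    data.map(t => t.title -> t.users.count(_.completed))
}
```

Both helpers nest a count over the inner users inside a map over the outer list, which is the general pattern for this kind of nested filtering.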