I have a Scala array, "visitedArray", with the following values:
Array(
(Map(url -> http://www.tumblr.com/tagged/abc), Map(visited -> true)),
(Map(url -> http://www.tumblr.com/tagged/random-blog), Map(visited -> true)),
(Map(url -> http://www.livestream.com/forum/1),Map(visited -> false))
....
However, I want to convert it to (String, Map[String, Any]) and would like the result to look like this:
(
(http://www.tumblr.com/tagged/kate-beckett, Map(visited -> true)),
(http://www.tumblr.com/tagged/random-blog, Map(visited -> true))
....
I tried:
val testRdd = sc.parallelize(visitedArray)
val formatedRdd = testRdd.map(t => (t._1("url"), t._2))
However, it does not return the desired format. It returns:
Array(
(http://www.tumblr.com/tagged/kate-beckett, Map(visited -> true)),
(http://www.tumblr.com/tagged/random-blog, Map(visited -> true))
....
How can I get the result I want (converted to (String, Map[String, Any])) without the Array()?
Answer 0 (score: 0)
If I understand correctly, you want this:
val a = Array(
(Map("url" -> "http://www.tumblr.com/tagged/abc"), Map("visited" -> true)),
(Map("url" -> "http://www.tumblr.com/tagged/random-blog"), Map("visited" -> true)),
(Map("url" -> "http://www.livestream.com/forum/1"),Map("visited" -> false)))
a.map {
case (m1: Map[String, String], m2: Map[String, Boolean]) =>
(m1("url"), m2)
}
which results in:
Array(
("http://www.tumblr.com/tagged/abc", Map("visited" -> true)),
("http://www.tumblr.com/tagged/random-blog", Map("visited" -> true)),
("http://www.livestream.com/forum/1", Map("visited" -> false))
): Array[(String, Map[String, Boolean])]
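You can then pass that result to sc.parallelize.
You only see Array at the beginning because that is how Scala prints the object. It is not actually part of the data.
For example, with a List: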
a.map {
case (m1: Map[String, String], m2: Map[String, Boolean]) =>
(m1("url"), m2)
}.toList
List(
("http://www.tumblr.com/tagged/abc", Map("visited" -> true)),
("http://www.tumblr.com/tagged/random-blog", Map("visited" -> true)),
("http://www.livestream.com/forum/1", Map("visited" -> false))
): scala.package.List[(String, Map[String, Boolean])]
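To illustrate that the leading Array( or List( is only part of how the collection is printed, here is a minimal sketch that renders the elements one by one instead of printing the collection itself. The object name VisitedPairs, the shortened sample data, and the choice of mkString separators are assumptions made for illustration, not part of the original question.

```scala
object VisitedPairs {
  // Hypothetical sample data shaped like the array in the question.
  val visited: Array[(Map[String, String], Map[String, Boolean])] = Array(
    (Map("url" -> "http://www.tumblr.com/tagged/abc"), Map("visited" -> true)),
    (Map("url" -> "http://www.livestream.com/forum/1"), Map("visited" -> false))
  )

  // Pull the URL out of the first map and keep the second map unchanged.
  val pairs: Array[(String, Map[String, Boolean])] =
    visited.map { case (m1, m2) => (m1("url"), m2) }

  // Join the elements ourselves, so the collection's own toString
  // (and therefore its "Array(" prefix) is never used.
  val rendered: String = pairs.mkString("(\n", ",\n", "\n)")

  def main(args: Array[String]): Unit =
    println(rendered)
}
```

The data in pairs is the same either way; only the text produced for display differs. With an RDD, the same idea would apply after collecting the results to the driver.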