Spark Scala 26035582 revisited

Date: 2016-11-10 16:44:45

Tags: scala apache-spark

Although I understand the outcome here, I can't see how the highlighted part works. Please enlighten me.

def isHeavy(inp: String) = inp.split(",").map(weights(_)).sum > 12
val input = List("a,b,c,d", "b,c,e", "a,c,d", "e,g") 
val splitSize = 10000 // specify some number of elements that fit in memory.

val numSplits = (input.size / splitSize) + 1 // has to be > 0.
val groups = sc.parallelize(input, numSplits) // specify the # of splits.

val weights = Array(("a", 3), ("b", 2), ("c", 5), ("d", 1), ("e", 9), ("f", 4), ("g", 6)).toMap

def isHeavy(inp: String) = inp.split(",").map(weights(_)).sum > 12
val result = groups.filter(isHeavy)

1 answer:

Answer 0 (score: 1)

weights is a Map keyed by strings:
scala> weights
res13: scala.collection.immutable.Map[String,Int] = Map(e -> 9, f -> 4, a -> 3, b -> 2, g -> 6, c -> 5, d -> 1)

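inp.split(",") splits the string into an array of keys. The map function then iterates over those keys and replaces each one with the value stored under that key in the weights map.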

The underscore is a Scala shorthand for the function's argument; the same expression can be written as

inp.split(",").map(x => weights(x))
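
To make the shorthand concrete, here is a minimal sketch in plain Scala (no Spark required). The weights map is copied from the question; the object name PlaceholderDemo and the names viaPlaceholder, viaLambda and viaMap are made up for this illustration. It shows the placeholder form, the explicit lambda, and a third variant that passes the map itself, which works because an immutable Map[String, Int] can be used wherever a String => Int function is expected.

```scala
object PlaceholderDemo {
  // Weights copied from the question.
  val weights: Map[String, Int] =
    Map("a" -> 3, "b" -> 2, "c" -> 5, "d" -> 1, "e" -> 9, "f" -> 4, "g" -> 6)

  // Splitting on the comma yields Array("a", "b", "c", "d").
  val keys: Array[String] = "a,b,c,d".split(",")

  // Placeholder syntax: the underscore stands for the single argument.
  val viaPlaceholder: List[Int] = keys.map(weights(_)).toList

  // The same function written as an explicit lambda.
  val viaLambda: List[Int] = keys.map(x => weights(x)).toList

  // A Map is itself a function from key to value, so it can be passed directly.
  val viaMap: List[Int] = keys.map(weights).toList

  def main(args: Array[String]): Unit = {
    println(viaPlaceholder) // List(3, 2, 5, 1)
    println(viaLambda)      // List(3, 2, 5, 1)
    println(viaMap)         // List(3, 2, 5, 1)
  }
}
```

All three forms look each key up with the map's apply method, so a key that is missing from weights would throw a NoSuchElementException in each of them.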

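In other words, an element such as "a,b,c,d" becomes a sequence of numbers (3, 2, 5, 1), which are then summed. The filter keeps only the elements whose sum is greater than 12, so "a,b,c,d" (sum 11) is dropped.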

For example:

scala> input.foreach(x => println(x.split(",").mkString))
abcd
bce
acd
eg

scala> input.foreach(x => println(x.split(",").map(weights(_)).mkString(",")))
3,2,5,1
2,5,9
3,5,1
9,6

scala> input.foreach(x => println(x.split(",").map(weights(_)).sum))
11
16
9
15

scala> input.foreach(x => {
     |     val sum = x.split(",").map(weights(_)).sum
     |     if (sum > 12) println(sum)
     | })
16
15
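
Note that the foreach calls above print the sums, whereas filter keeps the original strings. The following is a minimal sketch of that difference on a local List instead of an RDD, so it runs without a SparkContext; the object name LocalFilterDemo and the value heavy are made up for this illustration, while weights, input and isHeavy are copied from the question.

```scala
object LocalFilterDemo {
  // Weights and input copied from the question.
  val weights: Map[String, Int] =
    Map("a" -> 3, "b" -> 2, "c" -> 5, "d" -> 1, "e" -> 9, "f" -> 4, "g" -> 6)
  val input: List[String] = List("a,b,c,d", "b,c,e", "a,c,d", "e,g")

  // True when the weights of the comma-separated keys add up to more than 12.
  def isHeavy(inp: String): Boolean = inp.split(",").map(weights(_)).sum > 12

  // filter returns the elements that pass the test, not the computed sums.
  val heavy: List[String] = input.filter(isHeavy)

  def main(args: Array[String]): Unit = {
    println(heavy) // List(b,c,e, e,g)
  }
}
```

On the RDD from the question, groups.filter(isHeavy) is a lazy transformation, so an action such as groups.filter(isHeavy).collect() would be needed to actually see the two surviving strings.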