I am working on code that needs to use the reduceByKey function to aggregate values by key.
// mapToPair code
JavaPairRDD<String,Integer> taxiPair = taxiData.mapToPair(
x->{
if(!x.isEmpty())
{
String [] split = x.split(",");
x=split[9]; //Extracting Index Value 9
}
return new Tuple2<String,Integer>("Payment:"+x,1);
}
);
List<Tuple2<String,Integer>> sample = taxiPair.take(10);
for(Tuple2<String,Integer> t: sample)
{
System.out.println(t._1+","+t._2);
}
The code above produces the expected result. Here is a sample of 10 printed values:
Payment:1,1
Payment:2,1
Payment:1,1
Payment:1,1
Payment:1,1
Payment:1,1
Payment:1,1
Payment:1,1
Payment:1,1
Payment:1,1
As I understand it, once reduceByKey has run on the sample above, it should give this result:
Payment:1,9
Payment:2,1
However:
// reduceByKey code
JavaPairRDD<String,Integer> taxiReduce = taxiPair.reduceByKey(
(x,y)-> (y+y)
);
List<Tuple2<String,Integer>> sample2 = taxiReduce.collect();
for(Tuple2<String,Integer> t: sample2)
{
System.out.println(t._1+","+t._2);
}
// Output: these are the aggregated values from the full dataset, but they do not match the expected values.
Payment:3,2
Payment:2,2
Payment:,2
Payment:4,2
Payment:1,2
Answer 0: (score: 0)
There is a mistake in the lambda. It needs to be "x + y", not "y + y":
(x,y)-> (y+y)
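To see why "y + y" yields 2 for a key, it can help to reproduce the fold without Spark. The sketch below is an illustration only: `ReduceByKeySketch` and its `reduceByKey` helper are hypothetical names, not Spark APIs, and the helper assumes the values of a key are folded one after another in a single partition, with each incoming value equal to 1 as in the mapToPair step. Under that assumption the accumulated value `x` is discarded on every call, so the result for any key seen more than once is always 1 + 1.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class ReduceByKeySketch {

    // Hypothetical helper, not a Spark API: folds the values of each key
    // pairwise with f, similar in spirit to reduceByKey on one partition.
    static Map<String, Integer> reduceByKey(List<String> keys, BinaryOperator<Integer> f) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String key : keys) {
            // Every record contributes the value 1. On a repeated key,
            // merge calls f(accumulatedValue, 1).
            result.merge(key, 1, f);
        }
        return result;
    }

    public static void main(String[] args) {
        // The same 10 keys as the sample printed in the question.
        List<String> keys = Arrays.asList(
            "Payment:1", "Payment:2", "Payment:1", "Payment:1", "Payment:1",
            "Payment:1", "Payment:1", "Payment:1", "Payment:1", "Payment:1");

        // Buggy lambda: ignores the accumulator x.
        System.out.println(reduceByKey(keys, (x, y) -> y + y)); // {Payment:1=2, Payment:2=1}

        // Corrected lambda: adds the incoming value to the accumulator.
        System.out.println(reduceByKey(keys, (x, y) -> x + y)); // {Payment:1=9, Payment:2=1}
    }
}
```

With the corrected lambda the sample gives Payment:1,9 and Payment:2,1, which matches the result expected in the question.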
Answer 1: (score: 0)
It should be:
JavaPairRDD<String,Integer> taxiReduce = taxiPair.reduceByKey(
(x,y)-> (x+y) );