I want to group rows by their column values using Scala. For example, given this data:
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
overcast,cool,normal,TRUE,yes
I want the result to be:
For the 1st column:
1st group
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
2nd group
overcast,hot,high,FALSE,yes
overcast,cool,normal,TRUE,yes
3rd group
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
For the 2nd column:
1st group
hot,high,FALSE,no
hot,high,TRUE,no
hot,high,FALSE,yes
2nd group
cool,normal,FALSE,yes
cool,normal,TRUE,yes
3rd group
mild,high,FALSE,yes
And likewise for every remaining column, through the last one.
Answer 0 (score: 2)
Use the Seq.groupBy method.
// Each row is a 5-tuple of strings.
val data = Seq(
  ("sunny", "hot", "high", "FALSE", "no"),
  ("sunny", "hot", "high", "TRUE", "no"),
  ("overcast", "hot", "high", "FALSE", "yes"),
  ("rainy", "mild", "high", "FALSE", "yes"),
  ("rainy", "cool", "normal", "FALSE", "yes"),
  ("overcast", "cool", "normal", "TRUE", "yes"))

// Group the rows by the first tuple element (the 1st column).
val byFirst = data.groupBy(_._1)
Result:
Map(
overcast -> List((overcast,hot,high,FALSE,yes), (overcast,cool,normal,TRUE,yes)),
rainy -> List((rainy,mild,high,FALSE,yes), (rainy,cool,normal,FALSE,yes)),
sunny -> List((sunny,hot,high,FALSE,no), (sunny,hot,high,TRUE,no)))
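The answer above only covers the first column. To repeat the grouping for every column, as the question asks, one option is to hold each row as a List[String] rather than a tuple, so a column can be picked by index. The following is only a sketch under that assumption: groupByColumn, rows, numCols and allGroupings are names made up for illustration, and dropping the columns to the left of the grouping column is my reading of the expected output in the question.

```scala
// Rows as List[String] (an assumption; the answer above uses tuples),
// so that a column can be addressed by its index.
val rows: Seq[List[String]] = Seq(
  "sunny,hot,high,FALSE,no",
  "sunny,hot,high,TRUE,no",
  "overcast,hot,high,FALSE,yes",
  "rainy,mild,high,FALSE,yes",
  "rainy,cool,normal,FALSE,yes",
  "overcast,cool,normal,TRUE,yes"
).map(_.split(",").toList)

// Hypothetical helper: group the rows by the value in column i and keep
// only columns i onward, as in the expected output for the 2nd column.
def groupByColumn(rows: Seq[List[String]], i: Int): Map[String, Seq[List[String]]] =
  rows.groupBy(_(i)).map { case (key, group) => key -> group.map(_.drop(i)) }

// One grouping per column, from the first column to the last.
val numCols = rows.head.length
val allGroupings = (0 until numCols).map(i => groupByColumn(rows, i))
```

Here allGroupings(1)("hot") should hold the three rows listed under the "hot" group for the 2nd column. Rows inside a group keep their original order, but the order of the keys in the resulting Map is not something to rely on. Duplicate rows within a group (which can appear for later columns) are kept; add .distinct if that is not wanted.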