I have the following data:
group_id id name
---- -- ----
G1 1 apple
G1 2 orange
G1 3 apple
G1 4 banana
G1 5 apple
G2 6 orange
G2 7 apple
G2 8 apple
G3 7 banana
G3 8 orange
I want to set one randomly chosen record in each group to 1 and every other record to 0, like this:
group_id id name random_pick
---- -- ---- -------------------
G1 1 apple 0
G1 2 orange 0
G1 3 apple 0
G1 4 banana 0
G1 5 apple 1
G2 6 orange 0
G2 7 apple 1
G2 8 apple 0
G3 7 banana 0
G3 8 orange 1
My idea:
But how do I do this in Scala? :(
Thanks!
Answer 0 (score: 1)
How about...
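// groupId is a String because the sample data uses "G1", "G2", ...
case class MyRow(groupId: String, id: Int, name: String, randomPick: Boolean = false)
// myData is assumed to be a collection of MyRow holding the rows from the question
val randomPicks = myData.groupBy(_.groupId).toList.flatMap {
  case (_, rows) =>
    // Shuffle the group and flag whichever row ends up first.
    // Note that this also changes the order of the rows within the group.
    val h :: t = scala.util.Random.shuffle(rows.toList)
    h.copy(randomPick = true) :: t
}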
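The shuffle idea above can be made self-contained roughly as follows. This is only a sketch: the names `ShufflePickSketch`, `Row`, `pickOnePerGroup` and `sample` are made up for illustration, the `Random` instance is passed in so a seed can make runs repeatable, and the final sort by `(groupId, id)` is an assumption that the original row order should be restored after shuffling.

```scala
import scala.util.Random

object ShufflePickSketch {
  // Hypothetical row type; the fields follow the question's columns.
  final case class Row(groupId: String, id: Int, name: String, randomPick: Boolean = false)

  // Flags exactly one row per group by shuffling the group and marking its head.
  def pickOnePerGroup(rows: List[Row], rng: Random): List[Row] =
    rows
      .groupBy(_.groupId)
      .values
      .flatMap { group =>
        rng.shuffle(group) match {
          case head :: tail => head.copy(randomPick = true) :: tail
          case Nil          => Nil // not expected: groupBy does not produce empty groups
        }
      }
      .toList
      .sortBy(r => (r.groupId, r.id)) // undo the reordering caused by the shuffle

  // The sample rows from the question.
  val sample: List[Row] = List(
    Row("G1", 1, "apple"), Row("G1", 2, "orange"), Row("G1", 3, "apple"),
    Row("G1", 4, "banana"), Row("G1", 5, "apple"),
    Row("G2", 6, "orange"), Row("G2", 7, "apple"), Row("G2", 8, "apple"),
    Row("G3", 7, "banana"), Row("G3", 8, "orange")
  )

  def main(args: Array[String]): Unit =
    pickOnePerGroup(sample, new Random(42)).foreach(println)
}
```

Passing `new Random(seed)` gives the same picks on every run, which helps when testing; `new Random()` gives a fresh pick each time.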
Answer 1 (score: 0)
A more verbose version than @TerryDactyl's:
case class Tup(groupId: String,
               id: Int,
               name: String,
               randomPick: Boolean = false)

val ts = Seq(
  Tup("G1", 1, "apple"),
  Tup("G1", 2, "orange"),
  Tup("G1", 3, "apple"),
  Tup("G1", 4, "banana"),
  Tup("G1", 5, "apple"),
  Tup("G2", 6, "orange"),
  Tup("G2", 7, "apple"),
  Tup("G2", 8, "apple"),
  Tup("G3", 7, "banana"),
  Tup("G3", 8, "orange")
)

// Group the rows by groupId
val grouped = ts.groupBy(_.groupId)

// For each group, draw a random index and flag the row at that position.
// The result holds one Seq[Tup] per group; call .flatten on it for a flat list.
val withChosen = grouped.map { case (_, rows) =>
  val i = scala.util.Random.nextInt(rows.length)
  rows.zipWithIndex.map { case (tup, idx) =>
    if (idx == i) tup.copy(randomPick = true)
    else tup
  }
}
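To get from the Boolean flag to the flat 0/1 layout shown in the question, the index-based pick can be flattened and formatted along these lines. This is a sketch under assumptions: `IndexPickSketch`, `pickByIndex` and `render` are hypothetical names, groups are sorted by `groupId` to keep the output stable, and the three rows in `main` are just a shortened sample.

```scala
import scala.util.Random

object IndexPickSketch {
  final case class Tup(groupId: String, id: Int, name: String, randomPick: Boolean = false)

  // Draws one random index per group and flags the row at that index.
  // flatMap (rather than map) yields a single flat Seq instead of one Seq per group.
  def pickByIndex(rows: Seq[Tup], rng: Random): Seq[Tup] =
    rows.groupBy(_.groupId).toSeq.sortBy(_._1).flatMap { case (_, group) =>
      val chosen = rng.nextInt(group.length)
      group.zipWithIndex.map { case (tup, idx) => tup.copy(randomPick = idx == chosen) }
    }

  // Formats a row like the question's table, writing the flag as 1 or 0.
  def render(t: Tup): String =
    s"${t.groupId} ${t.id} ${t.name} ${if (t.randomPick) 1 else 0}"

  def main(args: Array[String]): Unit = {
    val ts = Seq(Tup("G1", 1, "apple"), Tup("G1", 2, "orange"), Tup("G2", 6, "orange"))
    println("group_id id name random_pick")
    pickByIndex(ts, new Random()).map(render).foreach(println)
  }
}
```

Unlike the shuffle approach, this keeps the rows of each group in their original order, because only the flag changes.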