I have a dataset that I tried to sort in descending order on the EXPEND column. The result looks like this:
+---------+----------+----------------+
| FACTORY | CUSTOMER | EXPEND         |
+---------+----------+----------------+
| ABC     | JOHN     | 147,883,593.00 |
| ABC     | DAVE     | 91,679,200.00  |
| ABC     | PET      | 61,424,237.00  |
| ABC     | DIN      | 18,613,473.00  |
| ABC     | INU      | 13,593,258.50  |
| DEF     | JOHN     | 8,438,527.00   |
| DEF     | DAVE     | 6,804,375.50   |
| DEF     | PET      | 2,569,754.16   |
| DEF     | DIN      | 2,540,791.00   |
| DEF     | INU      | 995,163.00     |
| DEF     | PET      | 173,020.00     |
+---------+----------+----------------+
But I want the following result:
+---------+----------+----------------+
| FACTORY | CUSTOMER | EXPEND         |
+---------+----------+----------------+
| ABC     | JOHN     | 147,883,593.00 |
| DEF     | JOHN     | 8,438,527.00   |
| ABC     | DAVE     | 91,679,200.00  |
| DEF     | DAVE     | 6,804,375.50   |
| ABC     | PET      | 61,424,237.00  |
| DEF     | PET      | 2,569,754.16   |
| DEF     | PET      | 173,020.00     |
| ABC     | DIN      | 18,613,473.00  |
| DEF     | DIN      | 2,540,791.00   |
| ABC     | INU      | 13,593,258.50  |
| DEF     | INU      | 995,163.00     |
+---------+----------+----------------+
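That is, the rows should be grouped by customer, with the customers ordered by their highest EXPEND in descending order and the rows within each customer also sorted by EXPEND descending.
How can I achieve this? This is only a sample; my real dataset can get more complex :(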
Answer 0: (score: 1)
You can extract the customers in the proper order into a new DataFrame, then join it back to the original one:
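import org.apache.spark.sql.functions.{desc, max, monotonically_increasing_id}
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell

val original = Seq(
  ("ABC", "JOHN", 147883593.00),
  ("ABC", "DAVE", 91679200.00),
  ("ABC", "PET", 61424237.00),
  ("ABC", "DIN", 18613473.00),
  ("ABC", "INU", 13593258.50),
  ("DEF", "JOHN", 8438527.00),
  ("DEF", "DAVE", 6804375.50),
  ("DEF", "PET", 2569754.16),
  ("DEF", "DIN", 2540791.00),
  ("DEF", "INU", 995163.00),
  ("DEF", "PET", 173020.00)
).toDF("FACTORY", "CUSTOMER", "EXPEND")

// Rank customers by their largest single EXPEND value, highest first.
// The generated ids are increasing but not consecutive, which is enough for sorting.
val customersInProperOrder = original
  .groupBy("CUSTOMER")
  .agg(max("EXPEND").alias("EXPEND"))
  .orderBy(desc("EXPEND"))
  .drop("EXPEND")
  .withColumn("ORDER", monotonically_increasing_id())

// Attach the rank to every row, sort by it, then by EXPEND within each customer,
// and finally drop the helper columns that came from the join.
val result = original.alias("o")
  .join(customersInProperOrder.alias("c"), $"o.CUSTOMER" === $"c.CUSTOMER")
  .orderBy($"ORDER", desc("EXPEND"))
  .drop($"c.CUSTOMER")
  .drop($"c.ORDER")

result.show(false)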
Result:
+-------+--------+------------+
|FACTORY|CUSTOMER|EXPEND |
+-------+--------+------------+
|ABC |JOHN |1.47883593E8|
|DEF |JOHN |8438527.0 |
|ABC |DAVE |9.16792E7 |
|DEF |DAVE |6804375.5 |
|ABC |PET |6.1424237E7 |
|DEF |PET |2569754.16 |
|DEF |PET |173020.0 |
|ABC |DIN |1.8613473E7 |
|DEF |DIN |2540791.0 |
|ABC |INU |1.35932585E7|
|DEF |INU |995163.0 |
+-------+--------+------------+
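For comparison, the same ordering can probably be had without the extra join by using a window function. This is a different technique from the answer above, shown only as a minimal sketch: the names `GroupedSort`, `sortWithinCustomers` and the helper column `CUSTOMER_MAX` are made up for illustration, and it assumes Spark is on the classpath and that a local `SparkSession` is acceptable for the demo.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, desc, max}

object GroupedSort {
  // Hypothetical helper: keeps each customer's rows together, puts the
  // customer with the largest EXPEND first, and sorts each customer's rows
  // by EXPEND descending.
  def sortWithinCustomers(df: DataFrame): DataFrame = {
    val byCustomer = Window.partitionBy("CUSTOMER")
    df
      // Largest EXPEND of the row's customer, repeated on every row.
      .withColumn("CUSTOMER_MAX", max(col("EXPEND")).over(byCustomer))
      // CUSTOMER is a tie-breaker so customers with equal maxima do not interleave.
      .orderBy(desc("CUSTOMER_MAX"), col("CUSTOMER"), desc("EXPEND"))
      .drop("CUSTOMER_MAX")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run, for demonstration only
      .appName("grouped-sort")
      .getOrCreate()
    import spark.implicits._

    // A few rows taken from the question's sample data.
    val original = Seq(
      ("ABC", "JOHN", 147883593.00),
      ("ABC", "DAVE", 91679200.00),
      ("DEF", "JOHN", 8438527.00),
      ("DEF", "DAVE", 6804375.50)
    ).toDF("FACTORY", "CUSTOMER", "EXPEND")

    sortWithinCustomers(original).show(false)
    spark.stop()
  }
}
```

Compared with the join-based version, this does not depend on `monotonically_increasing_id`, but it still shuffles the data by CUSTOMER before the global sort, so it is worth measuring on the real dataset before preferring one over the other.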