EmpNumber City Total Sales
----------------------------------------------------------------------------------
1811 Boston $14557260.03
1803 Chichago $18266965.58
My table looks like this. How can I find the best employee in each city, based on their sales?
The query I tried:
select employeenumber, city, max(TotalSales)
from(
select employeenumber, a.city, sum(quantityordered*priceeach) as TotalSales
from offices a, employees b, customers c, orders d, orderdetails e
where a.officeCode = b.officeCode
and b.employeenumber = c.salesrepemployeenumber
and c.customernumber = d.customernumber
and d.ordernumber = e.ordernumber
group by employeenumber, a.city
order by a.city)
group by employeenumber, city;
With this query I still get 3 employees from Boston and 3 from Chichago. All I want is one employee per city. Thanks.
Answer 0 (score: 0)
Just use the row_number() analytic function:
select employeenumber, city, TotalSales
from
(
  -- total sales per employee, ranked within each city (highest total first)
  select e.employeenumber, o.city,
         sum(nvl(od.quantityordered,0)*nvl(od.priceeach,0)) as TotalSales,
         row_number() over
         ( partition by o.city
           order by sum(nvl(od.quantityordered,0)*nvl(od.priceeach,0)) desc )
         as rn
    from offices o
    join employees e on o.officeCode = e.officeCode
    join customers c on e.employeenumber = c.salesrepemployeenumber
    join orders ord on c.customernumber = ord.customernumber
    join orderdetails od on ord.ordernumber = od.ordernumber
   group by e.employeenumber, o.city
)
where rn = 1
If ties for the highest TotalSales can occur and the tied employees should all be included in the result, replace row_number() with dense_rank(), which is another analytic function.
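To make the difference between row_number() and dense_rank() concrete, here is a minimal sketch that runs both variants against an in-memory SQLite database from Python. The table name sales, its columns, and the figures are made up for illustration (they are not the asker's data), and it assumes a SQLite build with window-function support (3.25 or later); the original answer appears to target Oracle, where the same analytic functions exist.

```python
import sqlite3

# Hypothetical per-employee totals; employees 1 and 2 tie for first place in Boston.
rows = [
    (1, "Boston", 500.0),
    (2, "Boston", 500.0),
    (3, "Boston", 120.0),
    (4, "Chicago", 900.0),
    (5, "Chicago", 300.0),
]

# The ranking function name is substituted into the query text.
QUERY = """
select employeenumber, city, totalsales
from (
    select employeenumber, city, totalsales,
           {rank_fn}() over (partition by city order by totalsales desc) as rn
    from sales
)
where rn = 1
order by city, employeenumber
"""

def top_per_city(rank_fn):
    """Return the rows ranked first in each city by the given analytic function."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table sales (employeenumber integer, city text, totalsales real)"
    )
    conn.executemany("insert into sales values (?, ?, ?)", rows)
    result = conn.execute(QUERY.format(rank_fn=rank_fn)).fetchall()
    conn.close()
    return result

with_row_number = top_per_city("row_number")  # exactly one row per city
with_dense_rank = top_per_city("dense_rank")  # keeps every tied top seller

print(with_row_number)
print(with_dense_rank)
```

With row_number(), which of the two tied Boston employees is returned is arbitrary unless a tie-breaking column is added to the order by clause; dense_rank() returns both.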
Answer 1 (score: 0)
After creating a temporary table named tbl from the first dataset shared above, this will give you the desired answer.
select b.EmpNumber, a.City, a.Max_Sales as `Max Sales` from
(select City, max(`Total Sales`) as Max_Sales
from tbl group by City) a
left join
(select City, `Total Sales` as drop_later, EmpNumber from tbl) b
-- match on City as well, so an equal total in another city cannot join here
on a.City = b.City and a.Max_Sales = b.drop_later
This is the output from Spark SQL:
EmpNumber City Max Sales
0 1811 Boston 14557260.03
1 1803 Chichago 18266965.58