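I'm building a table that shows cases where multiple records share the same btc but have different customer_names, along with the lowest cost for each customer.
This query works, but it is very inefficient: it takes over a minute to run on an 80,000-row table, so I suspect I'm doing something wrong.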
select btc,customer_name,min(cost) from table where table.btc in
(select btc from table group by 1 having count(distinct customer_name) > 1)
group by 1,2
This produces a table like the following:
+---------+---------------+---------+
| btc | customer_name | cost |
+---------+---------------+---------+
| asd32 | Sony | 1.45863 |
| asd32 | Nintendo | 1.84839 |
| bf33940 | Sony | 2.49188 |
| bf33940 | Nintendo | 2.49188 |
| a43c3f | Sony | 2.84142 |
| a43c3f | Nintendo | 2.45 |
| a43c3f | Sega | 2.689 |
+---------+---------------+---------+
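I'd like to take this a step further and exclude any results where the cost is the same for two customer_name values (so btc bf33940 would be removed from the table above, because Sony and Nintendo have the same cost).
I'd also like to know whether there is a more efficient way to do what I'm doing.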
+------------------+--------------+------+-----+---------+
| field | type | null | key | default |
+------------------+--------------+------+-----+---------+
| btc | varchar(100) | NO | MUL | NULL |
| mpn | varchar(100) | YES | | NULL |
| supplier | varchar(100) | YES | | NULL |
| invoice | varchar(100) | YES | | NULL |
| invoice_date | datetime | YES | | NULL |
| qtr | varchar(5) | YES | | NULL |
| qty | double(10,0) | YES | | NULL |
| resale | double(15,5) | YES | | NULL |
| ext_resale | double(15,5) | YES | | NULL |
| cost | double(15,5) | YES | | NULL |
| ext_cost | double(15,5) | YES | | NULL |
| gpp | double(15,5) | YES | | NULL |
| project | varchar(100) | YES | | NULL |
| team | double(15,5) | YES | | NULL |
| build_type | varchar(50) | YES | | NULL |
| customer_name | varchar(100) | YES | | NULL |
| customer_address | varchar(100) | YES | | NULL |
| customer_type | varchar(100) | YES | | NULL |
| customer_group | varchar(100) | YES | | NULL |
| sps | varchar(100) | YES | | NULL |
| fps | varchar(100) | YES | | NULL |
| gps | varchar(100) | YES | | NULL |
| hps | varchar(100) | YES | | NULL |
+------------------+--------------+------+-----+---------+
Sample CSV file here: https://ufile.io/os0as
Answer 0 (score: 1)
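You could try replacing the where ... in with a join, but it's hard to say how much more efficient that will be without testing it.
Something like this: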
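select t1.btc, t1.customer_name, min(t1.cost) as min_cost
from xxx t1
join (
    -- keep only btc values that appear under more than one customer
    select btc
    from xxx
    group by btc
    having count(distinct customer_name) > 1
) t2 on t1.btc = t2.btc
group by t1.btc, t1.customer_name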
For your second question, you can group once more by btc and cost to collapse the duplicates:
select t3.btc, group_concat(t3.customer_name) as customer_names, t3.min_cost
from (
    select t1.btc, t1.customer_name, min(t1.cost) as min_cost
    from xxx t1
    join (
        select btc
        from xxx
        group by btc
        having count(distinct customer_name) > 1
    ) t2 on t1.btc = t2.btc
    group by t1.btc, t1.customer_name
) t3
group by t3.btc, t3.min_cost
Again, it's hard to say whether this will work as-is without testing it, but hopefully you get the idea.
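If the goal is to drop a btc entirely rather than merge its customers into one row, one possible variation is sketched below. It is only a sketch under several assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL, the sample rows are made up to resemble the question's output, and "same cost" is read as "any two customers of that btc share the same minimum cost". The function names are hypothetical.

```python
import sqlite3

# Hypothetical sample rows modeled on the question's output table.
SAMPLE_ROWS = [
    ("asd32", "Sony", 1.45863),
    ("asd32", "Sony", 1.9),          # higher cost, ignored by min()
    ("asd32", "Nintendo", 1.84839),
    ("bf33940", "Sony", 2.49188),
    ("bf33940", "Nintendo", 2.49188),
    ("a43c3f", "Sony", 2.84142),
    ("a43c3f", "Nintendo", 2.45),
    ("a43c3f", "Sega", 2.689),
    ("zz999", "Sony", 3.0),          # single customer, never reported
]


def make_demo_db():
    """Create an in-memory SQLite database with a cut-down xxx table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table xxx (btc text not null, customer_name text, cost real)"
    )
    conn.executemany("insert into xxx values (?, ?, ?)", SAMPLE_ROWS)
    return conn


# The inner derived table is repeated instead of using a CTE so that the
# statement does not depend on CTE support in the target database.
DISTINCT_COST_QUERY = """
select m.btc, m.customer_name, m.min_cost
from (
    select btc, customer_name, min(cost) as min_cost
    from xxx
    group by btc, customer_name
) m
join (
    -- one row per btc that has several customers, all with different minimums
    select btc
    from (
        select btc, customer_name, min(cost) as min_cost
        from xxx
        group by btc, customer_name
    ) per_customer
    group by btc
    having count(*) > 1 and count(distinct min_cost) = count(*)
) k on k.btc = m.btc
order by m.btc, m.customer_name
"""


def rows_with_distinct_costs(conn):
    """Return (btc, customer_name, min_cost) rows for the btc values kept."""
    return conn.execute(DISTINCT_COST_QUERY).fetchall()


if __name__ == "__main__":
    for row in rows_with_distinct_costs(make_demo_db()):
        print(row)
```

With the sample rows above, bf33940 drops out because both customers share one minimum cost, and zz999 drops out because it has a single customer. If you would rather drop a btc only when all of its customers share the same cost, changing the having clause to count(distinct min_cost) > 1 should do it.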
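To speed things up, I would create a separate table with one row per btc and a counter of how many customers have it, so you don't need to build a temporary table with count() > 1 on every query.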
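A rough sketch of that counter-table idea follows, again using SQLite through Python's sqlite3 module as a stand-in for MySQL. The table name btc_customer_count, the helper names, and the sample rows are all assumptions made for illustration. The counter table also has to be rebuilt (or otherwise kept in sync) whenever xxx changes, which this sketch does not handle.

```python
import sqlite3

# Hypothetical sample rows modeled on the question's output table.
SAMPLE_ROWS = [
    ("asd32", "Sony", 1.45863),
    ("asd32", "Nintendo", 1.84839),
    ("bf33940", "Sony", 2.49188),
    ("bf33940", "Nintendo", 2.49188),
    ("a43c3f", "Sony", 2.84142),
    ("a43c3f", "Nintendo", 2.45),
    ("a43c3f", "Sega", 2.689),
    ("zz999", "Sony", 3.0),  # single customer
]


def make_demo_db():
    """Create an in-memory SQLite database with a cut-down xxx table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table xxx (btc text not null, customer_name text, cost real)"
    )
    conn.executemany("insert into xxx values (?, ?, ?)", SAMPLE_ROWS)
    return conn


def build_counter_table(conn):
    """(Re)build the per-btc customer counter from the current contents of xxx."""
    conn.execute("drop table if exists btc_customer_count")
    conn.execute(
        """
        create table btc_customer_count (
            btc text primary key,
            customer_count integer not null
        )
        """
    )
    conn.execute(
        """
        insert into btc_customer_count (btc, customer_count)
        select btc, count(distinct customer_name)
        from xxx
        group by btc
        """
    )


def min_cost_for_shared_btc(conn):
    """Lowest cost per customer, only for btc values with several customers."""
    return conn.execute(
        """
        select t1.btc, t1.customer_name, min(t1.cost) as min_cost
        from xxx t1
        join btc_customer_count c on c.btc = t1.btc
        where c.customer_count > 1
        group by t1.btc, t1.customer_name
        order by t1.btc, t1.customer_name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    build_counter_table(conn)
    for row in min_cost_for_shared_btc(conn):
        print(row)
```

The main query then joins against a small table keyed on btc instead of recomputing count(distinct customer_name) each time. Whether that is actually faster on the real 80,000-row table would need to be measured.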