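I have a table that contains data from two other databases (textfile and htmltables). The one thing they have in common is the order_number column, which is why I combined them into the same table. The dr_* columns come from the text file and the oi_* columns come from the htmltable.
Select * from data;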
+--------------+-----------+----------+-----------+-------+
| order_number | dr_amount | dr_speed | oi_amount | oi_up |
+--------------+-----------+----------+-----------+-------+
| 9699 | 10000 | 26000 | NULL | NULL |
| 9699 | 20000 | 47619 | NULL | NULL |
| 10135 | 18000 | 12676 | NULL | NULL |
| 9979 | 25000 | 14286 | NULL | NULL |
| 9699 | NULL | NULL | 4800 | 4 |
| 10135 | NULL | NULL | 8700 | 2 |
| 9979 | NULL | NULL | 3000 | 8 |
+--------------+-----------+----------+-----------+-------+
First, I have to sort out the order_numbers (using dr_amount) from the table with:
select order_number, count(*) as c from data where oi_amount IS NOT NULL group by order_number having c<2;
+--------------+---+
| order_number | c |
+--------------+---+
| 9699 | 1 |
| 9979 | 1 |
+--------------+---+
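This removes order_number 9699, because it is split across 2 rows where dr_amount IS NOT NULL. (In my later processing I cannot match a specific row against another one if it is a duplicate, so I exclude every order_number that is duplicated among the rows where dr_amount IS NOT NULL.)
Next, I want to combine the order_numbers that have no duplicates (no repeated dr_amount rows for the same order_number) with the rows that carry the oi_* values, to produce output like this: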
+--------------+-----------+----------+-----------+-------+
| order_number | dr_amount | dr_speed | oi_amount | oi_up |
+--------------+-----------+----------+-----------+-------+
| 10135 | 18000 | 12676 | 8700 | 2 |
| 9979 | 25000 | 14286 | 3000 | 8 |
+--------------+-----------+----------+-----------+-------+
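As you can see, order_number 9699 has been filtered out, and since no duplicates remain, rows 3 and 6 are merged, as are rows 4 and 7.
I was thinking of using the first filter to get the non-duplicated order_numbers and passing its result to a second select query in a where ... = clause, but this gives me a problem: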
select order_number, dr_amount, dr_speed, oi_amount, oi_up from data where order_number=(select order_number, count(*) as c from data where oi_amount IS NOT NULL group by order_number having c<2);
Operand should contain 1 column(s)
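I understand why the error occurs, but not how to fix it. The nested select I need for filtering out duplicates returns 2 columns (order_number and count(*) as c). So how can I write the nested select so that it returns only 1 column when I pass it to the outer select?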
Best regards, Niklas Gustavsson
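Answer 0 (score: 0)
A subquery in that context cannot return two columns. You also need in rather than =, because the subquery may return more than one row. This is probably what you are trying to do: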
select order_number, dr_amount, dr_speed, oi_amount, oi_up
from data
where order_number in (select order_number
from data
where oi_amount IS NOT NULL
group by order_number
having count(*) < 2
) and
dr_amount is not null;
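To try the query above without a MySQL server, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the helper names (load_sample, run_in_subquery) and the added ORDER BY are assumptions made for illustration. It shows that a one-column subquery used with IN runs without the "Operand should contain 1 column(s)" error, although on this sample data the dr_ and oi_ rows are not yet merged.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (9699, 10000, 26000, None, None),
    (9699, 20000, 47619, None, None),
    (10135, 18000, 12676, None, None),
    (9979, 25000, 14286, None, None),
    (9699, None, None, 4800, 4),
    (10135, None, None, 8700, 2),
    (9979, None, None, 3000, 8),
]

# The subquery selects a single column; the count lives only in HAVING.
QUERY = """
SELECT order_number, dr_amount, dr_speed, oi_amount, oi_up
FROM data
WHERE order_number IN (SELECT order_number
                       FROM data
                       WHERE oi_amount IS NOT NULL
                       GROUP BY order_number
                       HAVING COUNT(*) < 2)
  AND dr_amount IS NOT NULL
ORDER BY order_number, dr_amount
"""


def load_sample():
    """Create an in-memory table named data holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (order_number INTEGER, dr_amount INTEGER, "
        "dr_speed INTEGER, oi_amount INTEGER, oi_up INTEGER)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def run_in_subquery():
    """Run the IN-subquery version and return all result rows."""
    conn = load_sample()
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_in_subquery():
        print(row)
```

On the sample rows, every order_number has exactly one oi_ row, so the subquery keeps all three order numbers and the outer query returns the four dr_ rows with oi_amount and oi_up still NULL.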
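Answer 1 (score: 0)
Thanks for the hint about moving the count(*) to the end. That helped me put the rest together! The query you suggested did not work out of the box, but this one gives me the result I wanted: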
select order_number, max(dr_amount) as dr_amount, max(oi_amount) as oi_amount from data where order_number IN (select order_number from data where oi_amount IS NOT NULL group by order_number having count(*)<2) group by order_number having dr_amount IS NOT NULL && oi_amount IS NOT NULL;
With this query, the rows are merged the way I want them to be :) I removed dr_speed and oi_up because they are not really needed.
/Niklas
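The merge-by-aggregation idea can be sketched against the sample table from the question. This is a hedged illustration, not the poster's exact query: it runs on an in-memory SQLite database (a stand-in for MySQL), the helper name merge_unique_orders is hypothetical, and it assumes the intent is to count rows where dr_amount IS NOT NULL. On the sample rows, counting oi_amount rows would keep order_number 9699, since it has only one oi_ row. The sketch also keeps dr_speed and oi_up so the output can be compared with the table the question asks for.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    (9699, 10000, 26000, None, None),
    (9699, 20000, 47619, None, None),
    (10135, 18000, 12676, None, None),
    (9979, 25000, 14286, None, None),
    (9699, None, None, 4800, 4),
    (10135, None, None, 8700, 2),
    (9979, None, None, 3000, 8),
]

# Assumption: duplicates are detected on the dr_ side (dr_amount IS NOT NULL).
# MAX() collapses the dr_ row and the oi_ row of each order into one row,
# because aggregate functions skip NULL values.
MERGE_QUERY = """
SELECT order_number,
       MAX(dr_amount) AS dr_amount,
       MAX(dr_speed)  AS dr_speed,
       MAX(oi_amount) AS oi_amount,
       MAX(oi_up)     AS oi_up
FROM data
WHERE order_number IN (SELECT order_number
                       FROM data
                       WHERE dr_amount IS NOT NULL
                       GROUP BY order_number
                       HAVING COUNT(*) < 2)
GROUP BY order_number
HAVING MAX(dr_amount) IS NOT NULL AND MAX(oi_amount) IS NOT NULL
ORDER BY order_number
"""


def merge_unique_orders():
    """Return one merged row per order_number that has a single dr_ row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data (order_number INTEGER, dr_amount INTEGER, "
            "dr_speed INTEGER, oi_amount INTEGER, oi_up INTEGER)"
        )
        conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(MERGE_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in merge_unique_orders():
        print(row)
```

With the sample data this yields one merged row each for 9979 and 10135, and 9699 is dropped because it has two dr_ rows.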