Question

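Let's say I need to query the associates of a corporation. I have a table, "transactions", which contains data on every transaction made.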

CREATE TABLE `transactions` (
  `transactionID` int(11) unsigned NOT NULL,
  `orderID` int(11) unsigned NOT NULL,
  `customerID` int(11) unsigned NOT NULL,
  `employeeID` int(11) unsigned NOT NULL, 
  `corporationID` int(11) unsigned NOT NULL,
  PRIMARY KEY (`transactionID`),
  KEY `orderID` (`orderID`),
  KEY `customerID` (`customerID`),
  KEY `employeeID` (`employeeID`),
  KEY `corporationID` (`corporationID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

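Querying this table for associates is fairly straightforward, but there is a twist: a transaction record is registered once per employee, so a single corporation may have multiple records per order.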

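For example, if employees A and B from corporation 1 are both involved in selling a vacuum cleaner to corporation 2, there will be two records in the "transactions" table: one for each employee, both for corporation 1. This must not affect the results. A trade by corporation 1 must be treated as one, regardless of how many of its employees were involved.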
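To make the duplicate-row situation concrete, here is a small sketch of what the vacuum cleaner sale might look like in the table. All IDs are made up, and two things are assumptions on my part: that the buying corporation gets its own row under the same orderID (the derived-table join further down seems to rely on this), and that customerID can hold any placeholder value, since the question does not say what that column contains.

```sql
-- Hypothetical sample rows for one order (orderID 100).
-- Employees 11 (A) and 12 (B) belong to corporation 1, the seller.
-- Employee 21 belongs to corporation 2, the buyer (assumed to be recorded too).
-- customerID 501 is a placeholder; its meaning is not stated in the question.
INSERT INTO transactions
  (transactionID, orderID, customerID, employeeID, corporationID)
VALUES
  (1, 100, 501, 11, 1),  -- employee A, corporation 1
  (2, 100, 501, 12, 1),  -- employee B, corporation 1 (same trade, second row)
  (3, 100, 501, 21, 2);  -- corporation 2's side of the same order
```

Rows 1 and 2 differ only in transactionID and employeeID, which is the duplication that must not be counted twice.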

Easy, I thought. I'll just join on a derived table, like so:

SELECT corporationID FROM transactions JOIN (SELECT DISTINCT orderID FROM transactions WHERE corporationID = 1) AS foo USING (orderID)

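The query returns a list of corporations that have been involved in trades with corporation 1. That is exactly what I need, but it is very slow because MySQL can't use the corporationID index to build the derived table. As I understand it, this is the case for all subqueries/derived tables in MySQL.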

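I have also tried querying the set of orderIDs separately and then using a very large IN() clause (on the order of 100,000+ IDs), but as it turns out MySQL also has trouble using indexes with very large IN() clauses, so the query time does not improve.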
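For reference, the two-step approach described above would look roughly like this. This is only a sketch: the three IDs in the IN() list are placeholders standing in for the real list, and the DISTINCT in the second step is my assumption about how the duplicates would be collapsed.

```sql
-- Step 1: collect every order corporation 1 took part in.
SELECT DISTINCT orderID
FROM transactions
WHERE corporationID = 1;

-- Step 2: feed those IDs back in from the application.
-- 100, 101, 102 are placeholders; the real list has 100,000+ entries.
SELECT DISTINCT corporationID
FROM transactions
WHERE orderID IN (100, 101, 102);
```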

Are there any other options, or have I exhausted them both?

Answer 1

If I understand your requirement, you could try this.

select distinct t1.corporationID
from transactions t1
where exists (
    select 1
    from transactions t2
    where t2.corporationID =  1
    and t2.orderID = t1.orderID)
and t1.corporationID != 1;

Or this:

select distinct t1.corporationID
from transactions t1
join transactions t2
on t2.orderID = t1.orderID
and t1.transactionID != t2.transactionID
where t2.corporationID = 1
and t1.corporationID != 1;
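Not part of the answer itself, but both queries look up rows by orderID and then filter on corporationID, so a composite index covering those two columns might help where the single-column indexes do not. The index name below is made up, and whether the optimizer actually uses it depends on your MySQL version and data, so treat this as something to try and check with EXPLAIN rather than a guaranteed fix.

```sql
-- Hypothetical composite index; the name idx_order_corp is an assumption.
ALTER TABLE transactions
  ADD INDEX idx_order_corp (orderID, corporationID);

-- Check which indexes the optimizer picks for the EXISTS version.
EXPLAIN
SELECT DISTINCT t1.corporationID
FROM transactions t1
WHERE EXISTS (
    SELECT 1
    FROM transactions t2
    WHERE t2.corporationID = 1
      AND t2.orderID = t1.orderID)
  AND t1.corporationID != 1;
```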

Answer 2

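Your data makes no sense to me. I think you are using corporationID where you mean the customer ID at some point, because your query joins the transactions table to itself for corporationID = 1, based on orderID, to get the corporationIDs... which would be 1, right?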

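Can you specify what customerID, employeeID, and corporationID mean? How do I know that employees A and B are from corporation 1? In this case, is corporation 1 the corporationID, and corporation 2 the customer, and therefore stored in customerID?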

If that is the case, you just need to do a GROUP BY:

SELECT customerID
FROM transactions
WHERE corporationID = 1
GROUP BY customerID

(Or select and group by orderID if you want one row per order instead of one row per customer.)
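Spelled out, the per-order variant mentioned in the parentheses would be something like the following, under the same assumption that corporation 1 is identified by corporationID:

```sql
-- One row per order that corporation 1 took part in,
-- no matter how many of its employees registered a record.
SELECT orderID
FROM transactions
WHERE corporationID = 1
GROUP BY orderID;
```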

By using GROUP BY, you ignore the fact that there are multiple records that are duplicates except for the employeeID.

Conversely, to return all corporations that have sold to corporation 2:

SELECT corporationID
FROM transactions
WHERE customerID = 2
GROUP BY corporationID

Optimizing MySQL queries with large IN() clauses or joins on derived tables
