Different DISTINCT COUNT results on the same record set

Time: 2017-08-22 09:52:23

Tags: mysql

I am seeing strange behavior from two nearly identical count queries that should return the same result.

This query returns the correct count: 2006

SELECT COUNT(DISTINCT c0_.campaign_activity_user_id) AS sclr0
FROM campaign_activities c0_
WHERE c0_.campaign_id = 5539 
    AND c0_.campaign_activity_user_id IS NOT NULL 
    AND c0_.campaign_activity_user_id <> ''
    AND c0_.campaign_activity_timestamp >= '2017-07-13 00:00:00' 
    AND c0_.campaign_activity_timestamp <= '2017-08-14 23:59:59';

Explain: EXPLAIN

This query returns an incorrect count: 1490

SELECT count(DISTINCT c0_.campaign_activity_user_id) AS sclr0
FROM campaign_activities c0_
WHERE c0_.campaign_id = 5539
    AND c0_.campaign_activity_user_id IS NOT NULL
    AND c0_.campaign_activity_user_id <> '';

Explain: EXPLAIN

I counted the rows manually, and the second query should return the same result as the first (2006). It does not, and I do not understand why.

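If I add another AND condition to the second query, one that should have no effect on the result, it returns the correct count (2006). For example, I can add AND c0_.campaign_activity_timestamp IS NOT NULL. Every record has a timestamp, so this condition cannot filter anything out, yet the query now returns the correct result (2006), which is very strange.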
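As an independent cross-check of the manual count, the distinct values can be collected in a derived table and then counted. This is only a sketch: the alias `t` and the column alias `distinct_users` are made up for illustration, and nothing guarantees that the optimizer picks a different plan for it.

```sql
-- Count distinct user ids through a derived table instead of COUNT(DISTINCT ...).
-- The filters are the same as in the second query.
SELECT COUNT(*) AS distinct_users
FROM (
    SELECT DISTINCT c0_.campaign_activity_user_id
    FROM campaign_activities c0_
    WHERE c0_.campaign_id = 5539
        AND c0_.campaign_activity_user_id IS NOT NULL
        AND c0_.campaign_activity_user_id <> ''
) AS t;
```

If this returns 2006 while the plain COUNT(DISTINCT ...) returns 1490, the difference comes from how the query is executed, not from the data.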

DDL

CREATE TABLE `campaign_activities` (
  `campaign_activity_id` int(11) NOT NULL AUTO_INCREMENT,
  `campaign_id` int(11) NOT NULL,
  `campaign_link_id` int(11) DEFAULT NULL,
  `campaign_activity_user_id` varchar(30) COLLATE utf8_czech_ci DEFAULT NULL,
  `campaign_activity_timestamp` datetime NOT NULL,
  `campaign_activity_ip` varchar(32) COLLATE utf8_czech_ci DEFAULT NULL,
  `campaign_activity_proxy` varchar(32) COLLATE utf8_czech_ci DEFAULT NULL,
  `campaign_activity_http_referer` text COLLATE utf8_czech_ci,
  `campaign_activity_method` enum('curl','socket') COLLATE utf8_czech_ci DEFAULT NULL,
  `campaign_activity_legitimate` int(11) DEFAULT NULL,
  `campaign_referer_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`campaign_activity_id`,`campaign_id`),
  KEY `fk_reference_35` (`campaign_id`),
  KEY `fk_reference_36` (`campaign_link_id`),
  KEY `fk_reference_37` (`campaign_referer_id`),
  KEY `campaign_activity_user_id` (`campaign_activity_user_id`),
  KEY `campaign_id_campaign_activity_user_id` (`campaign_id`,`campaign_activity_user_id`),
  KEY `campaign_id` (`campaign_id`,`campaign_link_id`,`campaign_activity_user_id`,`campaign_activity_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=28608046 DEFAULT CHARSET=utf8 COLLATE=utf8_czech_ci ROW_FORMAT=COMPACT COMMENT='ex. tabulka kampane_statist'
/*!50100 PARTITION BY KEY (campaign_id)
PARTITIONS 1024 */;

Does anyone have an idea where the problem is? Thank you very much.

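Edit: I forgot to mention that, according to EXPLAIN, the second query appears to group the results automatically, which makes no sense to me (there is no explicit GROUP BY).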
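One possible way to test whether this automatic grouping is responsible is to rerun the second query with the candidate indexes disabled. The sketch below assumes that the Extra column of EXPLAIN shows "Using index for group-by" and that the plan uses one of the two indexes named in the hint. Both are assumptions, since the EXPLAIN output is not reproduced here.

```sql
-- Diagnostic only: stop the optimizer from using the indexes that could serve
-- a grouped index scan on (campaign_id, campaign_activity_user_id).
-- The index names come from the DDL above.
SELECT COUNT(DISTINCT c0_.campaign_activity_user_id) AS sclr0
FROM campaign_activities c0_
    IGNORE INDEX (campaign_id_campaign_activity_user_id, campaign_activity_user_id)
WHERE c0_.campaign_id = 5539
    AND c0_.campaign_activity_user_id IS NOT NULL
    AND c0_.campaign_activity_user_id <> '';
```

If the count changes to 2006 with the hint in place, the wrong result is tied to the access path chosen for those indexes and not to the WHERE conditions themselves.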

0 answers:

No answers