Question

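I want to avoid redundant indexes, so what is the best composite index for these two queries? As I understand it, the two queries cannot share one composite index, because one needs country and the other needs product_id. But if I create the two indexes below, would they be redundant and hurt database performance?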

A composite index on merchant_id, created_at, and product_id
A composite index on merchant_id, created_at, and country

Query 1

SELECT * from shop_order 
WHERE shop_order.merchant_id = ? 
AND shop_order.created_at >= TIMESTAMP(?) 
AND shop_order.created_at <= TIMESTAMP(?) 
AND shop_order.product_id = ?) AS mytable 
WHERE product_id IS NOT NULL GROUP BY product_id, title;

Query 2

SELECT COALESCE(SUM(total_price_usd),0) AS revenue, 
COUNT(*) as total_order, COALESCE(province, 'Unknown') AS name 
FROM shop_order 
WHERE DATE(created_at) >= '2021-02-08 13:37:42'
AND DATE(created_at) <= '2021-02-14 22:44:13'
AND merchant_id IN (18,19,20,1) 
AND country = 'Malaysia' GROUP BY province;

Table structure

CREATE TABLE `shop_order` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `merchant_id` bigint(20) DEFAULT NULL,
  `order_id` bigint(20) NOT NULL,
  `customer_id` bigint(20) DEFAULT NULL,
  `customer_orders_count` varchar(45) DEFAULT NULL,
  `customer_total_spent` varchar(45) DEFAULT NULL,
  `customer_email` varchar(100) DEFAULT NULL,
  `customer_last_order_name` varchar(45) DEFAULT NULL,
  `currency` varchar(10) NOT NULL,
  `total_price` decimal(20,8) NOT NULL,
  `subtotal_price` decimal(20,8) NOT NULL,
  `transaction_fee` decimal(20,8) DEFAULT NULL,
  `total_discount` decimal(20,8) DEFAULT '0.00000000',
  `shipping_fee` decimal(20,8) DEFAULT '0.00000000',
  `total_price_usd` decimal(20,8) DEFAULT NULL,
  `transaction_fee_usd` decimal(20,8) DEFAULT NULL,
  `country` varchar(50) DEFAULT NULL,
  `province` varchar(45) DEFAULT NULL,
  `processed_at` datetime DEFAULT NULL,
  `refunds` json DEFAULT NULL,
  `ffm_status` varchar(50) DEFAULT NULL,
  `gateway` varchar(45) DEFAULT NULL,
  `confirmed` tinyint(1) DEFAULT NULL,
  `cancelled_at` datetime DEFAULT NULL,
  `cancel_reason` varchar(100) DEFAULT NULL,
  `created` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `order_number` bigint(1) DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `financial_status` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `shop_order_unique` (`merchant_id`,`order_id`),
  KEY `merchant_id` (`merchant_id`),
  KEY `combine_idx1` (`country`,`merchant_id`,`created_at`)
) ENGINE=InnoDB AUTO_INCREMENT=2237 DEFAULT CHARSET=utf8mb4;

Please help.

Answer 1

Query 1:

INDEX(merchant_id, product_id,  -- Put columns for "=" tests first (in any order)
      created_at)               -- then range
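
As a rough sketch, the index for Query 1 could be created as shown here. This assumes shop_order has a product_id column (the posted CREATE TABLE does not show one), and the index name idx_merchant_product_created is made up for illustration.

```sql
-- Hypothetical index name; assumes a product_id column exists on shop_order
-- (it does not appear in the posted table definition).
-- Equality columns first (merchant_id, product_id), the range column last (created_at).
ALTER TABLE shop_order
  ADD INDEX idx_merchant_product_created (merchant_id, product_id, created_at);
```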

Query 2. First, avoid hiding created_at inside a function call (DATE()); doing so prevents the index from being used for that column.

INDEX(country,       -- "="
      merchant_id,   -- IN
      created_at)    -- range (after removing DATE)
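
Note that the posted table already has combine_idx1 on exactly these columns, so Query 2 probably needs a rewrite rather than a new index. One possible rewrite is sketched here, assuming the datetime literals in the question are the intended bounds; the result may differ slightly from the original, because DATE(created_at) drops the time part before comparing.

```sql
-- Compare the bare column so the range on created_at can use the index.
SELECT COALESCE(SUM(total_price_usd), 0) AS revenue,
       COUNT(*) AS total_order,
       COALESCE(province, 'Unknown') AS name
FROM shop_order
WHERE country = 'Malaysia'                 -- "="
  AND merchant_id IN (18, 19, 20, 1)       -- IN
  AND created_at >= '2021-02-08 13:37:42'  -- range, no DATE() wrapper
  AND created_at <= '2021-02-14 22:44:13'
GROUP BY province;
```

If whole days are what is wanted, a half-open range such as created_at >= '2021-02-08' AND created_at < '2021-02-15' expresses that without a function call.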

You are correct that these queries need separate indexes. Other queries may need some of your existing indexes.

Also, you already have a redundant index. Drop KEY merchant_id (merchant_id); you have at least one other index starting with merchant_id.
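
A minimal sketch of that cleanup, using the index names from the posted table definition. Before running it, check that no query relies on an index hint naming this key.

```sql
-- shop_order_unique (merchant_id, order_id) already starts with merchant_id,
-- so the single-column key adds write cost without helping lookups.
ALTER TABLE shop_order DROP INDEX merchant_id;
```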

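Having an extra index is only a minor performance drag. The hit comes during INSERT, or when you UPDATE any column in the index. Usually the benefit of having the "right" index for a SELECT outweighs the hit on writes.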

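Having multiple unique indexes is somewhat of a burden. Do you really need id, given that you have a "natural" PK made up of those two columns (merchant_id, order_id)? Check whether other tables need to join on id.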

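Consider shrinking many of the datatypes. BIGINT takes 8 bytes and has a range that is rarely needed. decimal(20,8) takes 10 bytes and allows values up to a trillion dollars; that also seems excessive. Is customer_orders_count a number?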
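
For illustration only, shrinking two of the columns might look like the sketch here. The target types are assumptions, not measured requirements: check the largest stored values, and any tables that reference merchant_id, before changing anything.

```sql
-- Assumed target types; verify against real data first.
ALTER TABLE shop_order
  MODIFY merchant_id INT UNSIGNED DEFAULT NULL,            -- 4 bytes instead of 8
  MODIFY customer_orders_count INT UNSIGNED DEFAULT NULL;  -- only if every stored value is numeric
```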
