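I need to produce some figures, and I designed a query that returns the results my "customer" needs. The query runs against a table with about one million records. I normally use MariaDB, where it returns in ~7 s. That execution time is acceptable, but I would like to optimize it further to improve my skills. After some research I found several posts saying "MySQL is fine, but not for tables with more than 1M records; you have to switch to something else", and PostgreSQL was mentioned many times. So I installed PostgreSQL and copied over my tables, indexes, and data. I ran the same query there and got the result in ~12 s.
I know PostgreSQL less well, and I suspect I am not using features specific to it. So for now I am staying with MariaDB. Do you have any ideas for improving the execution time?
Here is my query: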
select categorie.cat
,dhu_type.type
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2013-01-01' and '2013-12-31'
THEN dhu.id
END )
) AS "2013"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2014-01-01' and '2014-12-31'
THEN dhu.id
END )
) AS "2014"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2015-01-01' and '2015-12-31'
THEN dhu.id
END )
) AS "2015"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2016-01-01' and '2016-12-31'
THEN dhu.id
END )
) AS "2016"
from dhu
inner join dhu_type on dhu.type_id = dhu_type.id
inner join patient on dhu.patient_id=patient.id
inner join fa on patient.id = fa.patient_id
inner join categorie on categorie.id = fa.cat_id
group by cat,dhu_type.type
I have added a diagram to my question.
Here are the CREATE TABLE statements:
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET NAMES utf8 */;
/*!50503 SET NAMES utf8mb4 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
CREATE TABLE IF NOT EXISTS `categorie` (
`id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`cat` varchar(50) NOT NULL DEFAULT 'neonat',
PRIMARY KEY (`id`,`cat`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `cp` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`cp` varchar(5) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `cp` (`cp`)
) ENGINE=InnoDB AUTO_INCREMENT=4096 DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `dhu` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`patient_id` int(10) unsigned NOT NULL,
`date` date NOT NULL,
`type_id` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `FK_dhu_patient` (`patient_id`),
KEY `FK_dhu_dhu_type` (`type_id`),
CONSTRAINT `FK_dhu_dhu_type` FOREIGN KEY (`type_id`) REFERENCES `dhu_type` (`id`),
CONSTRAINT `FK_dhu_patient` FOREIGN KEY (`patient_id`) REFERENCES `patient` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=953590 DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `dhu_import` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`noip` bigint(10) unsigned zerofill NOT NULL,
`date` date NOT NULL,
`cp` varchar(5) NOT NULL,
`type` varchar(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `noip` (`noip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `dhu_type` (
`id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
`type` varchar(4) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `type` (`type`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `dpt` (
`dpt` tinyint(3) unsigned DEFAULT NULL,
`abrev` char(3) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `fa` (
`patient_id` int(10) unsigned NOT NULL,
`cat_id` tinyint(3) unsigned NOT NULL,
PRIMARY KEY (`patient_id`,`cat_id`),
KEY `idx_cat_id_pat_id` (`cat_id`,`patient_id`),
CONSTRAINT `FK_fa_patient` FOREIGN KEY (`patient_id`) REFERENCES `patient` (`id`),
CONSTRAINT `FK_fa_categorie` FOREIGN KEY (`cat_id`) REFERENCES `categorie` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `fa_import` (
`noip` bigint(10) unsigned zerofill NOT NULL,
`cat` varchar(50) NOT NULL,
PRIMARY KEY (`noip`,`cat`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
CREATE TABLE IF NOT EXISTS `patient` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`noip` bigint(10) unsigned zerofill NOT NULL,
`cp_id` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `FK_patient_cp` (`cp_id`),
CONSTRAINT `FK_patient_cp` FOREIGN KEY (`cp_id`) REFERENCES `cp` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=262141 DEFAULT CHARSET=utf8;
/*!40101 SET SQL_MODE=IFNULL(@OLD_SQL_MODE, '') */;
/*!40014 SET FOREIGN_KEY_CHECKS=IF(@OLD_FOREIGN_KEY_CHECKS IS NULL, 1, @OLD_FOREIGN_KEY_CHECKS) */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
Here is a change that improved performance (selecting categorie.id instead of categorie.cat).
This is the best query I have found, thanks to @RickJames and @BillKarwin:
select categorie.cat
,dhu_type.`type`
,t.`2013`
,t.`2014`
,t.`2015`
,t.`2016`
from ( select fa.cat_id as catid
,dhu.type_id typid
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2013-01-01' and '2013-12-31'
THEN dhu.id
END )
) AS "2013"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2014-01-01' and '2014-12-31'
THEN dhu.id
END )
) AS "2014"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2015-01-01' and '2015-12-31'
THEN dhu.id
END )
) AS "2015"
,COUNT(DISTINCT(
CASE WHEN dhu.date between '2016-01-01' and '2016-12-31'
THEN dhu.id
END )
) AS "2016"
from dhu
inner join patient on dhu.patient_id=patient.id
inner join fa on patient.id = fa.patient_id
group by fa.cat_id, dhu.type_id ) t
inner join categorie on t.catid = categorie.id
inner join dhu_type on t.typid = dhu_type.id
order by categorie.cat,dhu_type.`type`
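Answer 0 (score: 1)
MySQL can work well with billion-row tables.
Any database engine is limited by the speed of its disks and by how much RAM is available for caching (or how little).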
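To make the caching point concrete, here is a minimal sketch for comparing the InnoDB cache size with the size of the tables involved. It assumes the tables use InnoDB (as in the question's CREATE TABLE statements) and that the session is connected to the schema that holds them; the `size_mb` alias is only illustrative.

```sql
-- Size of the InnoDB buffer pool (the main data and index cache), in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Approximate size of each table in the current schema, data plus indexes, in MB.
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
ORDER BY size_mb DESC;
```

If the tables used by the query are much larger than the buffer pool, repeated runs will probably keep reading from disk, which tends to dominate the execution time.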
The textbooks say to normalize everything, but I suggest that a 4-char `type` is not worth normalizing. Ditto for the 5-char `cp`.
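As a rough sketch of what that denormalization could look like, the following stores the 4-character type directly in a copy of `dhu`. The table name `dhu_inline_type` and the index name `idx_patient` are hypothetical, and the foreign keys are left out for brevity; this is one possible shape, not a tested migration.

```sql
-- Hypothetical variant of dhu with the 4-character type stored inline.
CREATE TABLE dhu_inline_type (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `patient_id` INT UNSIGNED NOT NULL,
  `date` DATE NOT NULL,
  `type` VARCHAR(4) NOT NULL,          -- replaces type_id and the lookup in dhu_type
  PRIMARY KEY (`id`),
  KEY `idx_patient` (`patient_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- Copy the existing rows, resolving type_id to its text value once.
INSERT INTO dhu_inline_type (`id`, `patient_id`, `date`, `type`)
SELECT dhu.`id`, dhu.`patient_id`, dhu.`date`, dhu_type.`type`
FROM dhu
INNER JOIN dhu_type ON dhu_type.`id` = dhu.`type_id`;
```

With this layout the report can group on `type` directly, so the join to `dhu_type` is no longer needed.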
Unless you really want output rows that are all zeros, add `WHERE dhu.date between '2013-01-01' and '2016-12-31'` (a range covering all four reported years) before the `GROUP BY`.
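Applied to the inner aggregation from the question, the suggestion might look like the sketch below. The date range here spans 2013 through 2016 so that none of the four year columns is filtered out; that range is an assumption about the intent, since a narrower filter would zero the other years.

```sql
SELECT fa.cat_id   AS catid,
       dhu.type_id AS typid,
       COUNT(DISTINCT CASE WHEN dhu.`date` BETWEEN '2013-01-01' AND '2013-12-31'
                           THEN dhu.id END) AS `2013`,
       COUNT(DISTINCT CASE WHEN dhu.`date` BETWEEN '2014-01-01' AND '2014-12-31'
                           THEN dhu.id END) AS `2014`,
       COUNT(DISTINCT CASE WHEN dhu.`date` BETWEEN '2015-01-01' AND '2015-12-31'
                           THEN dhu.id END) AS `2015`,
       COUNT(DISTINCT CASE WHEN dhu.`date` BETWEEN '2016-01-01' AND '2016-12-31'
                           THEN dhu.id END) AS `2016`
FROM dhu
INNER JOIN patient ON dhu.patient_id = patient.id
INNER JOIN fa ON patient.id = fa.patient_id
-- Discard rows outside the reported years before they are joined and grouped.
WHERE dhu.`date` BETWEEN '2013-01-01' AND '2016-12-31'
GROUP BY fa.cat_id, dhu.type_id;
```

Groups whose rows all fall outside the range disappear from the output instead of showing up with zeros in every column, and the server has fewer rows to aggregate.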
Follow my advice here on many-to-many schema design (for `fa`). That can speed up queries in MySQL. (I do not know whether the same principles apply to Postgres.)
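The linked advice is not reproduced here, so the following is only a sketch of a commonly recommended shape for a many-to-many mapping table in InnoDB: no surrogate id, a composite primary key, and a second index in the reverse column order. The names `fa_sketch` and `idx_cat_patient` are hypothetical, and the foreign keys are left out for brevity.

```sql
-- Hypothetical mapping table between patients and categories.
CREATE TABLE fa_sketch (
  `patient_id` INT UNSIGNED NOT NULL,
  `cat_id` TINYINT UNSIGNED NOT NULL,
  PRIMARY KEY (`patient_id`, `cat_id`),              -- patient -> categories
  KEY `idx_cat_patient` (`cat_id`, `patient_id`)     -- category -> patients
) ENGINE=InnoDB;
```

The `fa` definition shown in the question already appears to have this shape, so if the linked advice matches this pattern, little may need to change there.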