Question

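I have a simple table of data, and I'd like to select the row that sits at about the 40th percentile of the query result.

I can do this right now by first running a query to find the number of rows, and then running a second query that sorts the rows and selects the nth one: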

select count(*) as `total` from mydata;

This might return something like 93, and 93 * 0.4 = 37.2, which rounds down to 37:

select * from mydata order by `field` asc limit 37,1;

Can I combine these two queries into a single query?

Answer 1

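This will give you approximately the 40th percentile: it returns the row that has about 40% of the rows below it. It sorts the rows by how far the fraction of rows below each one is from 0.4, since no row may fall exactly on the 40th percentile.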

SELECT m1.field, m1.otherfield, count(m2.field) 
  FROM mydata m1 INNER JOIN mydata m2 ON m2.field<m1.field
GROUP BY 
   m1.field,m1.otherfield
ORDER BY 
   ABS(0.4-(count(m2.field)/(select count(*) from mydata)))
LIMIT 1
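
For comparison, the same "closest to 0.4" idea can be sketched with window functions instead of a self-join. This assumes MySQL 8.0 or later, which the question does not state; `frac_below` and `ranked` are names made up for the illustration. Unlike the join above, this sketch also considers the row with the smallest `field` (its fraction is 0) and does not collapse duplicate rows.

```sql
-- Sketch only: assumes MySQL 8.0+ (window functions).
SELECT field, otherfield, frac_below
FROM (
  SELECT field,
         otherfield,
         -- RANK() - 1 is the number of rows with a strictly smaller field
         (RANK() OVER (ORDER BY field) - 1) / COUNT(*) OVER () AS frac_below
  FROM mydata
) ranked
ORDER BY ABS(0.4 - frac_below)
LIMIT 1;
```

This needs a single sort of the table rather than comparing every row with every other row.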

Answer 2

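As an exercise in futility (your current solution will probably be faster and preferable), if the table is MyISAM (or if you can live with the approximate row count that InnoDB reports):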

SET @row =0;
SELECT x.*
FROM information_schema.tables
JOIN (
  SELECT @row := @row+1 as 'row',mydata.*
  FROM mydata
  ORDER BY field ASC
) x
ON x.row = round(information_schema.tables.table_rows * 0.4)
WHERE information_schema.tables.table_schema = database()
AND information_schema.tables.table_name = 'mydata';
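
If the server is MySQL 8.0 or later (an assumption, not something the question states), the same row-numbering idea can be sketched without a user variable or `information_schema`, using an exact count instead of the estimated `table_rows`. The aliases `rn` and `total` are invented for this example and show up as extra columns in the result.

```sql
-- Sketch only: assumes MySQL 8.0+ (window functions).
SELECT x.*
FROM (
  SELECT mydata.*,
         ROW_NUMBER() OVER (ORDER BY field ASC) AS rn,  -- 1-based position
         COUNT(*) OVER () AS total                      -- exact row count
  FROM mydata
) x
WHERE x.rn = ROUND(x.total * 0.4);
```

With 93 rows this picks row 37. For a one-row table `ROUND(total * 0.4)` is 0, so nothing is returned.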

Answer 3

There is also this solution, which uses a monster string built with GROUP_CONCAT. I had to raise the maximum output length to get it to work:

SET SESSION group_concat_max_len = 1000000;
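
The linked query is not reproduced here. As a rough sketch of the general GROUP_CONCAT technique (not necessarily the exact query behind the link), the sorted values are joined into one comma-separated string and the value at the wanted position is cut out with SUBSTRING_INDEX. It assumes the values in `field` contain no commas; `p40` is a made-up alias.

```sql
-- Sketch only: returns the value (as a string), not the whole row.
SELECT SUBSTRING_INDEX(
         SUBSTRING_INDEX(
           GROUP_CONCAT(field ORDER BY field SEPARATOR ','),
           ',',
           CEIL(0.4 * COUNT(*))   -- keep the first N sorted values
         ),
         ',',
         -1                       -- then take the last of those
       ) AS p40
FROM mydata;
```

If `group_concat_max_len` is too small the string is truncated and the result can be wrong, which is why the limit has to be raised first.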

MySQL wizards: feel free to comment on the relative performance of these methods.

Select the nth percentile from MySQL
