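I have a table (MySQL) with a column named binID. The values in this column range from 1 to 70.
What I want to do is select the distinct values of this column (which should be the numbers 1 through 70), then iterate over them, using each one (call it theBinID) as a parameter to another SELECT statement, such as: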
SELECT * FROM MyTable WHERE binID = theBinID ORDER BY createdDate DESC LIMIT 10
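Basically, I want to get the 10 most recent rows for each binID.
I don't believe there is a way to do this with a single basic SQL statement, although I would love for that to be the answer, so I wrote a stored procedure that creates a cursor over the DISTINCT binIDs, then iterates over it and populates a temporary table.
My problem is that this is meant as an optimization. If I fetch all 100K rows, I get an average time of 1.7 seconds. Executing my stored procedure to get 700 rows (10 records for each of 70 bins) takes 1.4 seconds. I realize that saving 0.3 seconds could be considered a decent improvement, but I was hoping for sub-second times on 100K rows.
Is there a better way?
The complete stored procedure is below: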
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE binID INT;
DECLARE cur1 CURSOR FOR SELECT DISTINCT heatmapBinID from MEStressTest ORDER BY heatmapBinID ASC;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
DROP TEMPORARY TABLE IF EXISTS TempResults;
CREATE TEMPORARY TABLE TempResults (
`recordID` text NOT NULL,
`queryTerm` text NOT NULL,
`recordCreated` double(11,0) NOT NULL,
`recordByID` text NOT NULL,
`recordByName` text NOT NULL,
`recordText` text NOT NULL,
`recordSource` text NOT NULL,
`rerecordCount` int(11) NOT NULL DEFAULT '0',
`timecodeOffset` int(11) NOT NULL DEFAULT '-1',
`recordByImageURL` text NOT NULL,
`canDelete` int(11) NOT NULL DEFAULT '1',
`heatmapBinID` int(11) DEFAULT NULL,
`timelineBinID` int(11) DEFAULT NULL,
PRIMARY KEY (`recordID`(20))
);
OPEN cur1;
read_loop: LOOP
FETCH cur1 INTO binID;
IF done THEN
LEAVE read_loop;
END IF;
INSERT INTO TempResults (recordID, queryTerm, recordCreated, recordByID, recordByName, recordText, recordSource, rerecordCount, timecodeOffset, recordByImageURL, canDelete, heatmapBinID, timelineBinID)
SELECT * FROM MEStressTest WHERE heatmapBinID = binID ORDER BY recordCreated DESC LIMIT numRecordsPerBin;
END LOOP;
CLOSE cur1;
SELECT * FROM TempResults ORDER BY heatmapBinID ASC, recordCreated DESC;
END
Answer 0 (score: 1)
Try simulating ROW_NUMBER with partitioning in MySQL: http://www.sqlfiddle.com/#!2/fd8b5/4
Given this data:
create table sentai(
band varchar(50),
member_name varchar(50),
member_year int not null
);
insert into sentai(band, member_name, member_year) values
('BEATLES','JOHN',1960),
('BEATLES','PAUL',1961),
('BEATLES','GEORGE',1962),
('BEATLES','RINGO',1963),
('VOLTES V','STEVE',1970),
('VOLTES V','MARK',1971),
('VOLTES V','BIG BERT',1972),
('VOLTES V','LITTLE JOHN',1973),
('VOLTES V','JAMIE',1964),
('ERASERHEADS','ELY',1990),
('ERASERHEADS','RAYMUND',1991),
('ERASERHEADS','BUDDY',1992),
('ERASERHEADS','MARCUS',1993);
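Objective: find the three most recent members of each band.
First, we put a row_number on every member, ordered by band and then by year (descending, most recent first):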
select *,
@rn := @rn + 1 as rn
from (sentai s, (select @rn := 0) as vars)
order by s.band, s.member_year desc;
Output:
| BAND | MEMBER_NAME | MEMBER_YEAR | @RN := 0 | RN |
|-------------|-------------|-------------|----------|----|
| BEATLES | RINGO | 1963 | 0 | 1 |
| BEATLES | GEORGE | 1962 | 0 | 2 |
| BEATLES | PAUL | 1961 | 0 | 3 |
| BEATLES | JOHN | 1960 | 0 | 4 |
| ERASERHEADS | MARCUS | 1993 | 0 | 5 |
| ERASERHEADS | BUDDY | 1992 | 0 | 6 |
| ERASERHEADS | RAYMUND | 1991 | 0 | 7 |
| ERASERHEADS | ELY | 1990 | 0 | 8 |
| VOLTES V | LITTLE JOHN | 1973 | 0 | 9 |
| VOLTES V | BIG BERT | 1972 | 0 | 10 |
| VOLTES V | MARK | 1971 | 0 | 11 |
| VOLTES V | STEVE | 1970 | 0 | 12 |
| VOLTES V | JAMIE | 1964 | 0 | 13 |
Then we reset the row number whenever a member belongs to a different band than the previous row:
select *,
@rn := IF(@pg = s.band, @rn + 1, 1) as rn,
@pg := s.band
from (sentai s, (select @pg := null, @rn := 0) as vars)
order by s.band, s.member_year desc;
Output:
| BAND | MEMBER_NAME | MEMBER_YEAR | @PG := NULL | @RN := 0 | RN | @PG := S.BAND |
|-------------|-------------|-------------|-------------|----------|----|---------------|
| BEATLES | RINGO | 1963 | (null) | 0 | 1 | BEATLES |
| BEATLES | GEORGE | 1962 | (null) | 0 | 2 | BEATLES |
| BEATLES | PAUL | 1961 | (null) | 0 | 3 | BEATLES |
| BEATLES | JOHN | 1960 | (null) | 0 | 4 | BEATLES |
| ERASERHEADS | MARCUS | 1993 | (null) | 0 | 1 | ERASERHEADS |
| ERASERHEADS | BUDDY | 1992 | (null) | 0 | 2 | ERASERHEADS |
| ERASERHEADS | RAYMUND | 1991 | (null) | 0 | 3 | ERASERHEADS |
| ERASERHEADS | ELY | 1990 | (null) | 0 | 4 | ERASERHEADS |
| VOLTES V | LITTLE JOHN | 1973 | (null) | 0 | 1 | VOLTES V |
| VOLTES V | BIG BERT | 1972 | (null) | 0 | 2 | VOLTES V |
| VOLTES V | MARK | 1971 | (null) | 0 | 3 | VOLTES V |
| VOLTES V | STEVE | 1970 | (null) | 0 | 4 | VOLTES V |
| VOLTES V | JAMIE | 1964 | (null) | 0 | 5 | VOLTES V |
Then we select only the three most recent members of each band:
select x.band, x.member_name, x.member_year
from
(
select *,
@rn := IF(@pg = s.band, @rn + 1, 1) as rn,
@pg := s.band
from (sentai s, (select @pg := null, @rn := 0) as vars)
order by s.band, s.member_year desc
) as x
where x.rn <= 3
order by x.band, x.member_year desc;
Output:
| BAND | MEMBER_NAME | MEMBER_YEAR |
|-------------|-------------|-------------|
| BEATLES | RINGO | 1963 |
| BEATLES | GEORGE | 1962 |
| BEATLES | PAUL | 1961 |
| ERASERHEADS | MARCUS | 1993 |
| ERASERHEADS | BUDDY | 1992 |
| ERASERHEADS | RAYMUND | 1991 |
| VOLTES V | LITTLE JOHN | 1973 |
| VOLTES V | BIG BERT | 1972 |
| VOLTES V | MARK | 1971 |
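Although window functions (e.g. ROW_NUMBER() OVER (PARTITION BY ...)) were not available in MySQL before version 8.0, they can be simulated with variables. Please let us know whether this is faster than the cursor approach.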
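Applied to the table from the question, the same variable technique might look like the sketch below. It assumes the table and column names from the asker's stored procedure (MEStressTest, heatmapBinID, recordCreated) and hard-codes 10 rows per bin; the aliases s, x, rn and pg are illustrative. Note that MySQL does not guarantee the evaluation order of expressions that assign user variables, and assigning variables inside expressions is deprecated in recent 8.0 releases, so treat this as a workaround for older servers and verify the results.

```sql
-- Sketch: 10 most recent rows per heatmapBinID using user variables.
-- Relies on the derived table being ordered by bin and date before the
-- variables are evaluated, which MySQL does not formally guarantee.
SELECT x.*
FROM (
    SELECT s.*,
           -- Increment within the same bin, restart at 1 on a new bin.
           @rn := IF(@pg = s.heatmapBinID, @rn + 1, 1) AS rn,
           -- Remember the bin of the current row for the next comparison.
           @pg := s.heatmapBinID AS pg
    FROM (MEStressTest s, (SELECT @pg := NULL, @rn := 0) AS vars)
    ORDER BY s.heatmapBinID, s.recordCreated DESC
) AS x
WHERE x.rn <= 10
ORDER BY x.heatmapBinID, x.recordCreated DESC;
```

The result includes the helper columns rn and pg; list the wanted columns explicitly in the outer SELECT to drop them.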
Here is how it looks on an RDBMS that supports window functions: http://www.sqlfiddle.com/#!1/fd8b5/6
with member_recentness as
(
select row_number() over each_band as recent, *
from sentai
window each_band as (partition by band order by member_year desc)
)
select *
from member_recentness
where recent <= 3;
Output:
| RECENT | BAND | MEMBER_NAME | MEMBER_YEAR |
|--------|-------------|-------------|-------------|
| 1 | BEATLES | RINGO | 1963 |
| 2 | BEATLES | GEORGE | 1962 |
| 3 | BEATLES | PAUL | 1961 |
| 1 | ERASERHEADS | MARCUS | 1993 |
| 2 | ERASERHEADS | BUDDY | 1992 |
| 3 | ERASERHEADS | RAYMUND | 1991 |
| 1 | VOLTES V | LITTLE JOHN | 1973 |
| 2 | VOLTES V | BIG BERT | 1972 |
| 3 | VOLTES V | MARK | 1971 |
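For readers on MySQL 8.0 or later, where window functions and common table expressions are available, a roughly equivalent query against the same sentai sample table could be written as follows. This is a sketch that assumes an 8.0+ server. MySQL does not accept a bare * after another select expression, so the table alias s (an illustrative name) qualifies it.

```sql
-- Sketch for MySQL 8.0+: number members within each band, newest first,
-- then keep the first three per band.
WITH member_recentness AS (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY s.band
                              ORDER BY s.member_year DESC) AS recent
    FROM sentai AS s
)
SELECT band, member_name, member_year
FROM member_recentness
WHERE recent <= 3
ORDER BY band, member_year DESC;
```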
Answer 1 (score: 0)
SELECT * FROM MyTable WHERE binID IN (SELECT DISTINCT(bin_id) FROM mysql_table) ORDER BY createdDate DESC LIMIT 10;
This is untested, and I have not checked the syntax.
Add an index to improve performance.
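One possible concrete form of the index suggestion is a composite index on the filter column followed by the sort column, so that MySQL can read each bin's rows already ordered by date. MyTable, binID and createdDate are the placeholder names from the question; the index name idx_bin_created is made up for illustration.

```sql
-- Sketch: composite index supporting "WHERE binID = ? ORDER BY createdDate DESC".
-- idx_bin_created is a hypothetical name.
CREATE INDEX idx_bin_created ON MyTable (binID, createdDate);
```

Whether the optimizer actually uses the index depends on the data and the query, so it is worth checking with EXPLAIN.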
Answer 2 (score: 0)
If you try to join 2 tables without any join key, the result will be the Cartesian product of the 2 tables, i.e.:
SELECT *
FROM MyTable t
INNER JOIN (SELECT DISTINCT binId FROM MyTable) AS u
WHERE
t.binID = u.binId
ORDER BY t.createdDate DESC LIMIT 10
You can refer to this
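A join can be made to return 10 rows per bin, rather than 10 rows overall, by using a lateral derived table. This is a different technique from the plain join above and is only a sketch: it assumes MySQL 8.0.14 or later, where LATERAL is supported, and it reuses the placeholder names MyTable, binID and createdDate from the question. The aliases u, m and t are illustrative.

```sql
-- Sketch for MySQL 8.0.14+: for each distinct bin, fetch its 10 newest rows.
SELECT t.*
FROM (SELECT DISTINCT binID FROM MyTable) AS u
JOIN LATERAL (
    -- The inner query may reference u.binID because it is LATERAL.
    SELECT m.*
    FROM MyTable AS m
    WHERE m.binID = u.binID
    ORDER BY m.createdDate DESC
    LIMIT 10
) AS t ON TRUE
ORDER BY t.binID, t.createdDate DESC;
```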