I have a table that looks like this:
CREATE TEMPORARY TABLE MainList (
`pTime` int(10) unsigned NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MEMORY;
+------------+-------------+
| pTime | STD |
+------------+-------------+
| 1106080500 | -0.5058072 |
| 1106081100 | -0.82790455 |
| 1106081400 | -0.59226294 |
| 1106081700 | -0.99998194 |
| 1106540100 | -0.86649279 |
| 1107194700 | 1.51340543 |
| 1107305700 | 0.96225296 |
| 1107306300 | 0.53937716 |
+------------+-------------+ .. etc
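pTime is my primary key.
I want a query that, for each row in my table, finds the first later pTime whose STD has the opposite sign and is further from 0 than that row's STD. (For simplicity, imagine I am looking for 0-STD.)
Here is an example of the output I want: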
+------------+-------------+------------+-------------+
| pTime | STD | pTime_Oppo | STD_Oppo |
+------------+-------------+------------+-------------+
| 1106080500 | -0.5058072 | 1106090400 | 0.57510881 |
| 1106081100 | -0.82790455 | 1106091300 | 0.85599817 |
| 1106081400 | -0.59226294 | 1106091300 | 0.85599817 |
| 1106081700 | -0.99998194 | 1106091600 | 1.0660959 |
+------------+-------------+------------+-------------+
I can't seem to get it right! I tried the following:
SELECT DISTINCT
MainList.pTime,
MainList.STD,
b34d1.pTime,
b34d1.STD
FROM
MainList
JOIN b34d1 ON(
b34d1.pTime > MainList.pTime
AND(
(
MainList.STD > 0
AND b34d1.STD <= 0 - MainList.STD
)
OR(
MainList.STD < 0
AND b34d1.STD >= 0 - MainList.STD
)
)
);
This query just freezes my server.
P.S. Table b34d1 is similar to MainList, except that it contains many more rows:
mysql> select STD, Slope from b31d1 limit 10;
+-------------+--------------+
| STD | Slope |
+-------------+--------------+
| -0.44922675 | -5.2016129 |
| -0.11892021 | -8.15249267 |
| 0.62574686 | -10.19794721 |
| 1.10469057 | -12.43768328 |
| 1.52917352 | -13.08651026 |
| 1.61803899 | -13.2441349 |
| 1.82686555 | -12.04912023 |
| 2.07480736 | -11.22067449 |
| 2.45529961 | -7.84090909 |
| 1.86468335 | -6.26466276 |
+-------------+--------------+
mysql> select count(*) from b31d1;
+----------+
| count(*) |
+----------+
| 439340 |
+----------+
1 row in set (0.00 sec)
In fact, MainList is just a filtered version of b34d1 that uses the MEMORY engine.
mysql> show create table b34d1;
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| Table | Create Table
|
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| b34d1 | CREATE TABLE `b34d1` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=339331 MAX_ROWS=539331 PACK_KEYS=1 ROW_FORMAT=FIXED |
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
Edit: I just ran a small experiment, and I am very confused by the results:
SELECT DISTINCT
b34d1.pTime,
b34d1.STD,
Anti.pTime,
Anti.STD
FROM
b34d1
LEFT JOIN b34d1 As Anti ON(
Anti.pTime > b34d1.pTime
AND(
(
b34d1.STD > 0
AND b34d1.STD <= 0 - Anti.STD
)
OR(
b34d1.STD < 0
AND b34d1.STD >= 0 - Anti.STD
)
)
) limit 10;
+------------+-------------+------------+------------+
| pTime | STD | pTime | STD |
+------------+-------------+------------+------------+
| 1104537600 | -0.70381962 | 1104539100 | 0.73473692 |
| 1104537600 | -0.70381962 | 1104714000 | 1.46733274 |
| 1104537600 | -0.70381962 | 1104714300 | 2.02097356 |
| 1104537600 | -0.70381962 | 1104714600 | 2.60642099 |
| 1104537600 | -0.70381962 | 1104714900 | 2.01006557 |
| 1104537600 | -0.70381962 | 1104715200 | 1.97724189 |
| 1104537600 | -0.70381962 | 1104715500 | 1.85683704 |
| 1104537600 | -0.70381962 | 1104715800 | 1.2754127 |
| 1104537600 | -0.70381962 | 1104716100 | 0.87900156 |
| 1104537600 | -0.70381962 | 1104716400 | 0.72957739 |
+------------+-------------+------------+------------+
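Why are all the values in the first pTime column the same?
Answer 0 (score: 1)
Selecting other fields from the row that holds some aggregate statistic (such as a minimum or maximum) is a bit messy in SQL. Such queries are not that simple; you usually need an extra join or a subquery. For example: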
SELECT m.pTime, m.STD, m2.pTime AS pTime_Oppo, m2.STD AS STD_Oppo
FROM MainList AS m
JOIN
(SELECT m1.pTime, MIN(m2.pTime) AS pTime_Oppo
FROM MainList AS m1
JOIN MainList AS m2
ON m1.pTime < m2.pTime AND SIGN(m1.STD) != SIGN(m2.STD)
WHERE ABS(m1.STD) <= ABS(m2.STD)
GROUP BY m1.pTime
) AS oppo ON m.pTime = oppo.pTime
JOIN MainList AS m2 ON oppo.pTime_Oppo = m2.pTime
;
With the sample data:
INSERT INTO MainList (`pTime`, `STD`)
VALUES
(1106080500, -0.5058072),
(1106081100, -0.82790455),
(1106081400, -0.59226294),
(1106081700, -0.99998194),
(1106090400, 0.57510881),
(1106091300, 0.85599817),
(1106091600, 1.0660959),
(1106540100, -0.86649279),
(1107194700, 1.51340543),
(1107305700, 0.96225296),
(1107306300, 0.53937716)
;
the result is:
+------------+-------------+------------+-------------+
| pTime | STD | pTime_Oppo | STD_Oppo |
+------------+-------------+------------+-------------+
| 1106080500 | -0.5058072 | 1106090400 | 0.57510881 |
| 1106081100 | -0.82790455 | 1106091300 | 0.85599817 |
| 1106081400 | -0.59226294 | 1106091300 | 0.85599817 |
| 1106081700 | -0.99998194 | 1106091600 | 1.0660959 |
| 1106090400 | 0.57510881 | 1106540100 | -0.86649279 |
| 1106091300 | 0.85599817 | 1106540100 | -0.86649279 |
| 1106540100 | -0.86649279 | 1107194700 | 1.51340543 |
+------------+-------------+------------+-------------+
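For readers without a MySQL server at hand, the query can be tried in SQLite through Python's standard `sqlite3` module. This is only a sketch under assumptions: the `first_opposite` helper and the `ROWS` list are names made up for the example, SQLite stands in for MySQL, and `SIGN(a) != SIGN(b)` is swapped for `(a < 0) <> (b < 0)` so the script does not depend on SQLite's newer `sign()` function (the two differ only when a STD is exactly 0).

```python
import sqlite3

# Sample rows from the answer above: (pTime, STD).
ROWS = [
    (1106080500, -0.5058072),
    (1106081100, -0.82790455),
    (1106081400, -0.59226294),
    (1106081700, -0.99998194),
    (1106090400, 0.57510881),
    (1106091300, 0.85599817),
    (1106091600, 1.0660959),
    (1106540100, -0.86649279),
    (1107194700, 1.51340543),
    (1107305700, 0.96225296),
    (1107306300, 0.53937716),
]

# Same shape as the answer's query; only the sign test is rewritten (assumption).
QUERY = """
SELECT m.pTime, m.STD, m2.pTime AS pTime_Oppo, m2.STD AS STD_Oppo
FROM MainList AS m
JOIN (
    -- earliest later row with the opposite sign and at least the same magnitude
    SELECT m1.pTime, MIN(m2.pTime) AS pTime_Oppo
    FROM MainList AS m1
    JOIN MainList AS m2
      ON m1.pTime < m2.pTime
     AND (m1.STD < 0) <> (m2.STD < 0)
    WHERE ABS(m1.STD) <= ABS(m2.STD)
    GROUP BY m1.pTime
) AS oppo ON m.pTime = oppo.pTime
JOIN MainList AS m2 ON oppo.pTime_Oppo = m2.pTime
ORDER BY m.pTime
"""


def first_opposite(rows):
    """Load rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MainList (pTime INTEGER PRIMARY KEY, STD REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO MainList (pTime, STD) VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_opposite(ROWS):
        print(row)
```

On the sample data this returns the same seven rows as the table above. Rows that never meet a qualifying opposite value are dropped, because both joins are inner joins.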
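Answer 1 (score: 0)
Any solution based on functions such as ABS or SIGN, or on anything similar needed to check the sign, is doomed to be inefficient on large data sets, because those conditions cannot use an index.
You are creating a temporary table in an SP, so you can change its schema without losing anything. Adding a column that stores the sign of STD, and storing STD itself as an unsigned value, will give you a huge performance boost: you can simply look for the first greater pTime that has a greater STD and a different sign, and every condition can use an index in a query like this one (STD_positive holds the sign of STD):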
SELECT * FROM MainList m
LEFT JOIN MainList mu
ON mu.pTime = ( SELECT md.pTime FROM MainList md
WHERE m.pTime < md.pTime
AND m.STD < md.STD
AND m.STD_positive <> md.STD_positive
ORDER BY md.pTime
LIMIT 1 );
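The LEFT JOIN is needed here to return rows that have no greater STD. If you do not need them, use a plain JOIN. This query should run fine even on many records, provided it has proper indexes, chosen by carefully checking the EXPLAIN output and starting with an index on STD.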
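The schema change described above is not spelled out, so here is one possible sketch of it, again run in SQLite via Python's `sqlite3` rather than MySQL. Everything beyond the query itself is an assumption made for illustration: the helper name `first_opposite_indexed`, the index `idx_sign_time` and its column order, and the `CASE` expressions that rebuild the signed value for display.

```python
import sqlite3

# Sample rows from the thread: (pTime, signed STD).
ROWS = [
    (1106080500, -0.5058072),
    (1106081100, -0.82790455),
    (1106081400, -0.59226294),
    (1106081700, -0.99998194),
    (1106090400, 0.57510881),
    (1106091300, 0.85599817),
    (1106091600, 1.0660959),
    (1106540100, -0.86649279),
    (1107194700, 1.51340543),
    (1107305700, 0.96225296),
    (1107306300, 0.53937716),
]

# Hypothetical schema: STD keeps only the magnitude, STD_positive keeps the sign.
SCHEMA = """
CREATE TABLE MainList (
    pTime        INTEGER PRIMARY KEY,
    STD          REAL    NOT NULL,  -- magnitude only, always >= 0
    STD_positive INTEGER NOT NULL   -- 1 if the original STD was positive, else 0
);
-- Hypothetical index; which index actually helps must be checked with EXPLAIN.
CREATE INDEX idx_sign_time ON MainList (STD_positive, pTime, STD);
"""

QUERY = """
SELECT m.pTime,
       CASE WHEN m.STD_positive = 1 THEN m.STD ELSE -m.STD END AS STD,
       mu.pTime AS pTime_Oppo,
       CASE WHEN mu.STD_positive = 1 THEN mu.STD ELSE -mu.STD END AS STD_Oppo
FROM MainList AS m
LEFT JOIN MainList AS mu
  ON mu.pTime = (
      -- first later row with a larger magnitude and the other sign
      SELECT md.pTime
      FROM MainList AS md
      WHERE m.pTime < md.pTime
        AND m.STD < md.STD
        AND m.STD_positive <> md.STD_positive
      ORDER BY md.pTime
      LIMIT 1
  )
ORDER BY m.pTime
"""


def first_opposite_indexed(rows):
    """Split each STD into magnitude and sign flag, then run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO MainList (pTime, STD, STD_positive) VALUES (?, ?, ?)",
        [(p_time, abs(std), 1 if std > 0 else 0) for p_time, std in rows],
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_opposite_indexed(ROWS):
        print(row)
```

Because of the LEFT JOIN, all eleven sample rows come back; the four rows with no later, larger, opposite-sign value carry NULL (`None` in Python) in the last two columns. Note that this sketch follows the answer's strict `<` on STD, so an opposite value of exactly equal magnitude would not match.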
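Answer 2 (score: 0)
-- mo and mo2 expose MainList with STD negated, so a row with the opposite
-- sign and a larger magnitude compares as lying beyond m.STD.
-- Note: no condition on pTime appears here, so the match is the nearest in
-- magnitude rather than the first in time.
SELECT
m.pTime,
m.STD,
mo.pTime AS pTime_Oppo,
-mo.STD AS STD_Oppo
FROM MainList m
INNER JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo ON (m.STD > 0 AND mo.STD > m.STD)
OR (m.STD < 0 AND mo.STD < m.STD)
-- Anti-join: keep mo only when no other candidate lies strictly between
-- m.STD and mo.STD.
LEFT JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo2 ON (mo.STD > 0 AND mo2.STD > m.STD AND mo.STD > mo2.STD)
OR (mo.STD < 0 AND mo2.STD < m.STD AND mo.STD < mo2.STD)
WHERE mo2.pTime IS NULL;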