MySQL query uses 10G of space in /tmp and fails with 'Errcode: 28 - No space left on device', but works fine locally

Time: 2017-06-26 17:59:51

Tags: mysql sql linux opensuse

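I am running a fairly complex SQL statement to build a summary table from the raw data in a large table (38 million rows). (I am trying to get the current price, the low for this season, the high for this season, and the percentage of the time the price was 1 cent this week/month/season into the cache table so that it can be queried easily later.)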

INSERT INTO cache (`time`, name, price, low, high, week, month, season)
    SELECT
        MAX(`time`) AS `time`,
        name,
        MIN(CASE WHEN `time` = 1498511444 THEN price ELSE 999999 END) AS price,
        MIN(price) AS low,
        MAX(price) AS high,
        SUM(CASE WHEN `time` > 1497906644 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1497906644 THEN 1 ELSE 0 END) AS week,
        SUM(CASE WHEN `time` > 1480367444 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1480367444 THEN 1 ELSE 0 END) AS month,
        SUM(CASE WHEN `time` > 1493362800 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1493362800 THEN 1 ELSE 0 END) AS season
    FROM
        (SELECT
            `time`,
            name,
            MIN(price) AS price
        FROM price
        WHERE `time` > 1493362800
        GROUP BY `time`, name) AS tmp
    GROUP BY name

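After adding an index on the price.time column, I got this down to 0.6 s locally (it took 30 s before). On prod, which has the same index, it runs for a long time (30 s+) and then fails with Errcode: 28 - No space left on device. If I watch df while it runs, I see free space drop slowly from 9.9G to 9.6G at roughly 3 MB/s. Then, after a few minutes, free space suddenly starts dropping at 500 MB/s until none is left and the query fails. Locally, the free space does not seem to change at all, but I suppose it could happen so quickly that my df in a while loop does not catch it.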

If I first try to create a table holding the result of the subquery, I get the same disk-eating behaviour:

INSERT INTO initial_cache (`time`, name, price)
SELECT
    `time`,
    name,
    MIN(price) AS price
FROM price
WHERE `time` > 1493337600
GROUP BY `time`, name

Do you have any idea why my query needs this much space to run? And why does it behave so differently on prod?

Thanks!

2 Answers:

Answer 0: (score: 1)

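Subqueries tend to use a lot of temporary space once memory runs out. However, one part is somewhat redundant: the checks on `time` after the initial subquery. Rewriting the query gives the following (in which the SUM(1) expressions look strange):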

INSERT INTO cache (`time`, name, price, low, high, week, month, season)
SELECT
    MAX(`time`) AS `time`,
    name,
    MIN(price) AS price,
    MIN(price) AS low,
    MAX(price) AS high,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS week,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS month,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS season
FROM
    (SELECT
        `time`,
        name,
        MIN(price) AS price
    FROM price
    WHERE `time` > 1498442022
    GROUP BY `time`, name) AS tmp
GROUP BY name;

Which may be equivalent to:

INSERT INTO cache (`time`, name, price, low, high, week, month, season)
SELECT
    MAX(`time`) AS `time`,
    name,
    MIN(price) AS price,
    MIN(price) AS low,
    MAX(price) AS high,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS week,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS month,
    SUM(CASE WHEN price = 1 THEN 1 ELSE 0 END) / SUM(1) AS season
FROM price
WHERE `time` > 1498442022    
GROUP BY name;

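However, since the rewritten outer query looks strange, I doubt this is the result you are looking for. Please provide sample data and the expected result to get a better answer.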
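One way to test the temporary-space claim at the top of this answer is to compare, on both machines, where MySQL writes its temporary files and how large an in-memory temporary table may grow before it is moved to disk. This is a minimal sketch using standard MySQL statements. That these settings or the query plan actually differ between local and prod is an assumption; the question does not confirm it.

```sql
-- Directory MySQL uses for on-disk temporary tables and sort files.
SHOW VARIABLES LIKE 'tmpdir';

-- An in-memory temporary table is capped by the smaller of these two values;
-- beyond that it is converted to an on-disk table.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Server-wide counter of temporary tables that were created on disk.
-- Reading it before and after the query shows whether the query added any.
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';

-- Plan for the inner query from the question. "Using temporary" or
-- "Using filesort" in the Extra column indicates a temporary table or a sort.
EXPLAIN
SELECT `time`, name, MIN(price) AS price
FROM price
WHERE `time` > 1493362800
GROUP BY `time`, name;
```

Running the EXPLAIN on both servers and comparing the output may show whether prod chooses a different plan even though the index is the same.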

Answer 1: (score: 0)

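I did not solve this problem, but I did work around it. I changed the program that inserts the data so that it also inserts into a table with the same shape as the subquery result. I then run my outer query separately against that table. So I now have a kind of two-stage cache. For some reason, this all works without making a dent in the disk space.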
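The two-stage approach described above could look roughly like the following. This is only a sketch under assumptions: the table name initial_cache is borrowed from the question, while the column types, the primary key, and the example values are hypothetical, since neither the schema nor the loader program is shown.

```sql
-- Stage 1 table: one row per (`time`, name) holding the lowest price seen.
-- Column types are assumptions; the real schema is not shown in the thread.
CREATE TABLE IF NOT EXISTS initial_cache (
    `time` INT UNSIGNED NOT NULL,
    name   VARCHAR(100) NOT NULL,
    price  INT NOT NULL,
    PRIMARY KEY (`time`, name)
);

-- Run by the loader alongside each insert into price.
-- The values are placeholders; on a duplicate key the lower price is kept,
-- which mirrors MIN(price) ... GROUP BY `time`, name in the subquery.
INSERT INTO initial_cache (`time`, name, price)
VALUES (1498511444, 'example item', 3)
ON DUPLICATE KEY UPDATE price = LEAST(price, VALUES(price));

-- Stage 2: the outer query from the question, reading the pre-aggregated
-- table instead of a derived table built at query time.
INSERT INTO cache (`time`, name, price, low, high, week, month, season)
SELECT
    MAX(`time`) AS `time`,
    name,
    MIN(CASE WHEN `time` = 1498511444 THEN price ELSE 999999 END) AS price,
    MIN(price) AS low,
    MAX(price) AS high,
    SUM(CASE WHEN `time` > 1497906644 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1497906644 THEN 1 ELSE 0 END) AS week,
    SUM(CASE WHEN `time` > 1480367444 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1480367444 THEN 1 ELSE 0 END) AS month,
    SUM(CASE WHEN `time` > 1493362800 AND price = 1 THEN 1 ELSE 0 END) / SUM(CASE WHEN `time` > 1493362800 THEN 1 ELSE 0 END) AS season
FROM initial_cache
WHERE `time` > 1493362800
GROUP BY name;
```

Because the grouping by (`time`, name) is maintained incrementally at insert time, stage 2 only has to group the smaller table by name, which is presumably why it needs far less temporary space.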