Manipulating big array in a more memory-efficient way

Time: 2017-07-09 13:14:26

Tags: php arrays memory memory-limit

I'm currently writing an API for an app that interfaces with a large database and need to retrieve large amounts of data from it, then spit it out as JSON.

I'm using CodeIgniter (CI) as a database interface, but I don't think that's relevant here. I'm running into memory limits, and unfortunately I have no way of raising the limit because the shared hosting service won't do it.

I get about 56k rows from the database, which get put into an array by CI (zero-indexed, pretty standard). Each row has 7 fields.

All is well until I start looping through the array to modify data. The script hits a memory limit error after a few loop iterations, even though I'm only modifying the original array and not, as far as I can tell, allocating new variables.

Allowed memory size of 134217728 bytes exhausted

Below is the code I'm using:

$query = $this->db->get('table');
if ($query->num_rows() > 0) {
    $result = $query->result_array();
    foreach ($result as $k => $v) {
        foreach ($v as $key => $value) {
            if ($key === 'column_name') {
                $result[$k][$key] = json_decode($value);
                continue;
            }
            if ($value == null) {
                $result[$k][$key] = '';
            } else if (ctype_digit($value)) {
                $result[$k][$key] = (int) $result[$k][$key];
            }
        }
    }
    return $result;
}

Just decoding some JSON and casting to integers or empty strings, nothing fancy. But I get memory limit errors on any line that mutates the $result array. Even if I remove the (memory-intensive) json_decode, I still get an error on the line that simply casts to an int.

What's more, even if I remove the whole foreach, I get a memory limit error later on when I use json_encode to generate the API response.

I'm totally lost. I really need all of this data to be output at once, and I have no idea how to make this more memory-efficient (maybe with buffers or something? I've never dived into this).

EDIT: for anyone interested, I managed to cut memory usage somewhat by making an unbuffered query to the database, so only one copy of the data is stored in the array at a time. I also removed the foreach and handle each field explicitly. The main problem, however, is probably how PHP stores arrays. Here is the new code:

$query = $this->db->get('table');
$result = [];
while ($row = $query->unbuffered_row('array')) {
    if ($row['column1'] == '[]') {
        $row['column1'] = [];
    } else {
        $row['column1'] = json_decode($row['column1']);
    }
    $row['column2'] = (int) $row['column2'];
    $row['column3'] = (int) $row['column3'];
    $row['column4'] = is_null($row['column4']) ? '' : (int) $row['column4'];
    $row['column5'] = is_null($row['column5']) ? '' : (int) $row['column5'];

    $result[] = $row;
}

return $result;
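If the accumulated $result array and the final json_encode() still blow the limit, one further option (a sketch, not from the original post) is to skip building the array entirely and stream the JSON response one row at a time, so only a single row is ever held in memory. The transformRow() helper below is a hypothetical stand-in for the per-column casts shown above:

```php
<?php
// Sketch: stream a JSON array to the client row by row instead of
// building the full $result array and json_encode()-ing it at the end.

// Hypothetical stand-in for the per-column casting done in the question.
function transformRow(array $row): array
{
    foreach ($row as $key => $value) {
        if ($value === null) {
            $row[$key] = '';
        } elseif (ctype_digit((string) $value)) {
            $row[$key] = (int) $value;
        }
    }
    return $row;
}

// Emit a JSON array without ever holding more than one row in memory.
// $rows can be any iterable, e.g. a generator wrapping unbuffered_row().
function streamJson(iterable $rows): void
{
    echo '[';
    $first = true;
    foreach ($rows as $row) {
        if (!$first) {
            echo ',';
        }
        echo json_encode(transformRow($row));
        $first = false;
    }
    echo ']';
}
```

The trade-off is that you give up returning a PHP array from the method; the controller has to echo directly (and set the Content-Type header itself), but peak memory stays roughly constant regardless of row count.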

1 Answer:

Answer 0 (score: 1)

There are many ways to approach this; the real question is what your priorities are:

  • Does it have to be fast, or can it be slow?
  • Is that low-memory server absolutely the only resource available?

The ideal solution is obviously to upgrade your server; given that your task consumes a lot of memory, that should be a concern for whoever runs this project.

The modern approach, of course, would be to do it with microservices, each one handling a chunk of the data. They can be written by you, or you can use a cloud service such as AWS.

That said, assuming you really are limited to your current setup and have no option other than processing big data on a memory-constrained server, I suggest using local file I/O. It's not the fastest solution, but if you read the data in chunks and keep writing them out to a temporary file, you'll sidestep the memory problem, and you can then flush that file to the client.
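The temp-file idea above could be sketched like this (my sketch, not the answerer's code; fetchChunks() is a hypothetical placeholder for however the rows are retrieved in batches):

```php
<?php
// Sketch: write processed rows to a temporary file chunk by chunk,
// then stream that file to the client, so the full dataset is never
// held in PHP memory at once.
function streamViaTempFile(iterable $chunks): void
{
    $tmp = tmpfile();          // auto-deleted when closed
    fwrite($tmp, '[');
    $first = true;
    foreach ($chunks as $chunk) {
        foreach ($chunk as $row) {
            if (!$first) {
                fwrite($tmp, ',');
            }
            fwrite($tmp, json_encode($row)); // only this chunk is in memory
            $first = false;
        }
        unset($chunk);         // free each chunk before fetching the next
    }
    fwrite($tmp, ']');

    rewind($tmp);
    fpassthru($tmp);           // flush the file to the client, not into a string
    fclose($tmp);
}

// Usage (fetchChunks() is hypothetical — e.g. a generator yielding
// batches of rows from LIMIT/OFFSET queries or unbuffered reads):
// header('Content-Type: application/json');
// streamViaTempFile(fetchChunks());
```

fpassthru() copies the file to the output stream without loading it into a PHP string, which is the point of the exercise; the cost is the extra disk I/O the answer warns about.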