Converting a DataFrame to an array uses more than 8x the memory. Why?

Asked: 2019-04-17 15:00:03

Tags: python pandas numpy dataframe

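I want to apply StandardScaler to my DataFrame. df currently takes 303 MB. After applying StandardScaler, the DataFrame has become a NumPy array with a size of 2529 MB. How is that possible, and how can I fix it?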

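Here I converted it manually, without using StandardScaler, but the result is the same. With a dataset this large, the computation also takes longer. How can I solve this?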

$categories[$categories_id]["products"] = $sql->select("#products", $sql->resolveAlias("products") . ".products_id IN (SELECT products_id FROM " . $sql->resolveAlias("products_to_categories") . " WHERE " . $sql->resolveAlias("products_to_categories") . ".categories_id = '" . $categories_id . "')", array(
        $sql->resolveAlias("products") . ".products_id",
        //$sql->resolveAlias("products") . ".products_ean",
        $sql->resolveAlias("products") . ".products_quantity",
        $sql->resolveAlias("products") . ".products_model",
        "CONCAT('/media/images/thumb/', " . $sql->resolveAlias("products") . ".products_image" . ") AS products_image",
        "ROUND(" . $sql->resolveAlias("products") . ".products_price * 1.19, 2) AS products_price",
        // Artikelnummer beschneiden: Alles bis zum ersten Vorkommen eines Unterstriches "_" entfernen
        "SUBSTRING(" . $sql->resolveAlias("products") . ".products_model, INSTR(" . $sql->resolveAlias("products") . ".products_model, '_')+1) as products_model_cut",
        $sql->resolveAlias("products_description") . ".products_name",
        $sql->resolveAlias("products_description") . ".products_description ",
        //"SUBSTRING(" . $sql->resolveAlias("products_description") . ".products_description
        //"SUBSTRING(" . $sql->resolveAlias("products_description") . ".products_description, 0,250) as products_description",
        //$sql->resolveAlias("products_description") . ".products_short_description",
        //$sql->resolveAlias("products_description") . ".products_keywords",
        //$sql->resolveAlias("products_description") . ".products_url",
    ), NULL, array($sql->resolveAlias("products_description") => $sql->resolveAlias("products_description") . ".products_id = " . $sql->resolveAlias("products") . ".products_id"))->asArray("products_id");
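
The question does not show the column dtypes, so the following is only a sketch under an assumption: that most columns of df are stored in a 1-byte dtype such as int8 or bool. In that case, converting to a float64 array costs 8 bytes per value, which would be consistent with the roughly 8x growth reported (303 MB to 2529 MB). The data, shapes, and variable names below are hypothetical and not taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1,000 rows x 10 columns of 0/1 flags stored as int8
# (1 byte per value). The real df in the question may look different.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 2, size=(1_000, 10), dtype=np.int8))

df_bytes = df.to_numpy().nbytes            # int8: 1 byte per value

# Converting to float64 costs 8 bytes per value.
as_f64 = df.to_numpy(dtype=np.float64)
ratio_f64 = as_f64.nbytes / df_bytes

# Converting to float32 instead halves that cost (4 bytes per value).
as_f32 = df.to_numpy(dtype=np.float32)
ratio_f32 = as_f32.nbytes / df_bytes

# Manual standardization on the float32 array keeps the float32 dtype.
scaled = (as_f32 - as_f32.mean(axis=0)) / as_f32.std(axis=0)

print(df.dtypes.value_counts())
print(ratio_f64, ratio_f32, scaled.dtype)
```

To check whether this assumption applies to the real data, inspecting `df.dtypes` and `df.memory_usage(deep=True)` before the conversion should show which columns are stored compactly. If float32 precision is acceptable, casting before scaling is one way to limit the growth; it does not remove it.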

0 answers:

No answers yet