How much memory does each part of a Python program use?

Date: 2018-02-22 17:13:23

Tags: python memory-management machine-learning

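Currently, I am just running PCA and KNN on 400 RGB images of watches to find the most similar watches among them. I would like to know how much memory each part of my program uses. To that end, I followed this link, and my source code is as follows: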

import os
from glob import glob

import cv2
import numpy as np
import psutil
from sklearn import neighbors
from sklearn import preprocessing
from sklearn.decomposition import PCA

def memory_usage():
    # Print the resident set size (RSS) of the current process in GB (10**9 bytes)
    process = psutil.Process(os.getpid())
    print(round(process.memory_info().rss / (10 ** 9), 3), 'GB')

data = []

# Read images from file
for filename in glob('Watches/*.jpg'):

    # cv2.imread returns a NumPy array, or None if the file cannot be read
    img = cv2.imread(filename)
    if img is None:
        continue
    height, width = img.shape[:2]

    # Check that all my images are of the same resolution
    if height == 529 and width == 940:

        # Flatten each image so that it is stored in one row
        data.append(img.reshape(-1))

memory_usage()

# Normalise data
data = np.array(data)
Norm = preprocessing.Normalizer()
Norm.fit(data)
data = Norm.transform(data)

memory_usage()

# PCA model
pca = PCA(0.95)
pca.fit(data)
data = pca.transform(data)

memory_usage()

# K-Nearest neighbours
knn = neighbors.NearestNeighbors(n_neighbors=4, algorithm='ball_tree', metric='minkowski').fit(data)
distances, indices = knn.kneighbors(data)
print(indices)

memory_usage()

The output is as follows:

0.334 GB  # after loading images
1.712 GB  # after data normalisation
1.5 GB    # after pca
1.503 GB  # after knn

What do these outputs mean?

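Do they represent the memory in use at that moment, and is that a direct indication of the memory required by the program's objects and functions up to that point (or are things more complicated)?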

For example, why is memory usage higher after data normalisation than after PCA?
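One way to explore these questions is to compare the process-level figure with the size of the arrays themselves. The sketch below is only an illustration under stated assumptions: it uses a small, hypothetical stand-in for the image data (4 zero-filled images rather than the real files), and it assumes the normalised data ends up as `float64`, which takes 8 bytes per value instead of the 1 byte per value of the `uint8` pixels returned by `cv2.imread`. The helper name `array_gb` is made up for this example.

```python
import numpy as np


def array_gb(arr):
    """Size of the array's data buffer in GB (10**9 bytes)."""
    return arr.nbytes / 10 ** 9


# Hypothetical stand-in for the real data: 4 flattened 529x940 three-channel images.
n_images = 4
values_per_image = 529 * 940 * 3

raw = np.zeros((n_images, values_per_image), dtype=np.uint8)

# Assumption: normalisation yields float64, so each value takes 8 bytes instead of 1.
as_float = raw.astype(np.float64)

print('uint8 array  :', round(array_gb(raw), 3), 'GB')
print('float64 array:', round(array_gb(as_float), 3), 'GB')
print('size ratio   :', as_float.nbytes // raw.nbytes)
```

Note that `nbytes` counts only the array's own buffer, whereas the RSS printed by `memory_usage()` covers the whole process: the interpreter, the imported libraries, temporary copies, and possibly memory that has been freed but not yet returned to the operating system. The two figures therefore need not match, and a drop in RSS after a step such as PCA may simply reflect a large array being replaced by a smaller one.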

0 answers:

No answers yet