I'm running into the following error:

Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

while trying to run the code below on only 10,000 images. Given data-set sizes these days, I didn't think the images would amount to that much, and after adding some memory checks it doesn't look like I'm running out of memory, which is what I would expect given the exit code of 137. A second pair of eyes would be much appreciated!
Code:
import os
import sys
from pathlib import Path

import numpy as np
import joblib
from tqdm import tqdm
from keras.preprocessing import image
from keras.applications import vgg16

# Path to folders with training data
img_path = Path("training_data")

images = []
labels = []

# Load all the images
for img in tqdm(os.listdir("training_data")):
    # Load the image from disk
    img = image.load_img(img_path / img)
    # Convert the image to a numpy array
    image_array = image.img_to_array(img)
    # Add the image to the list of images
    print("Number of images " + str(len(images)))
    print("Memory size of images list " + str(sys.getsizeof(images)))
    images.append(image_array)
    # The expected value for each of these images should be 0
    labels.append(0)
Output:

Memory size of images list 77848
4%|▍| 8919/233673 [06:42<9:06:24, 6.86it/s]
Number of images 8919
Memory size of images list 77848
4%|▍| 8920/233673 [06:42<11:26:09, 5.46it/s]
Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
Essentially, I'm trying to extend this example of using the VGG16 model to extract features from your own images, so that I can later classify them by finishing the model with a Dense layer with a sigmoid activation. The example works with 100 images, but it fails now that I have a much larger dataset.
import sys
from pathlib import Path

import numpy as np
import joblib
from keras.preprocessing import image
from keras.applications import vgg16

# Path to folders with training data
img_path = Path("training_data")
# Assumed subfolder layout (these two paths are not defined in the original post)
not_dog_path = img_path / "not_dogs"
dog_path = img_path / "dogs"

images = []
labels = []

# Load all the not-dog images
for img in not_dog_path.glob("*.png"):
    # Load the image from disk
    img = image.load_img(img)
    # Convert the image to a numpy array
    image_array = image.img_to_array(img)
    # Add the image to the list of images
    print("Number of images " + str(len(images)))
    print("Memory size of images list " + str(sys.getsizeof(images)))
    images.append(image_array)
    # For each 'not dog' image, the expected value should be 0
    labels.append(0)

# Load all the dog images
for img in dog_path.glob("*.png"):
    # Load the image from disk
    img = image.load_img(img)
    # Convert the image to a numpy array
    image_array = image.img_to_array(img)
    # Add the image to the list of images
    images.append(image_array)
    # For each 'dog' image, the expected value should be 1
    labels.append(1)

# Create a single numpy array with all the images we loaded
x_train = np.array(images)

# Also convert the labels to a numpy array
y_train = np.array(labels)

# Normalize image data to 0-to-1 range
x_train = vgg16.preprocess_input(x_train)

# Load a pre-trained neural network to use as a feature extractor
pretrained_nn = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))

# Extract features for each image (all in one pass)
features_x = pretrained_nn.predict(x_train)

# Save the array of extracted features to a file
joblib.dump(features_x, "x_train.dat")

# Save the matching array of expected values to a file
joblib.dump(y_train, "y_train.dat")
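A quick back-of-the-envelope calculation (using an assumed average resolution, since the post doesn't state one) shows why the full dataset can't fit: `img_to_array` produces a float32 array of shape (H, W, 3) per image, and the tqdm bar above shows 233,673 files in total.

```python
num_images = 233673           # total from the tqdm progress bar above
height, width = 500, 375      # ASSUMED average resolution; substitute your own

bytes_per_image = height * width * 3 * 4          # float32 = 4 bytes per value
total_gib = num_images * bytes_per_image / 1024**3
print(round(total_gib, 1))    # hundreds of GiB for the raw arrays alone
```

And `np.array(images)` would then allocate a second full copy on top of the list, doubling the requirement at its peak.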
RAM info:
free -m
total used free shared buff/cache available
Mem: 386689 162686 209771 39 14231 222703
Swap: 30719 5156 25563
I ran the dmesg command, thanks @Matias Valdenegro for the suggestion:
[4550163.834761] Out of memory: Kill process 21996 (python) score 972 or sacrifice child
[4550163.836103] Killed process 21996 (python) total-vm:415564288kB, anon-rss:388981876kB, file-rss:1124kB, shmem-rss:4kB
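Converting the dmesg figures (reported in kB) confirms the diagnosis: the process's resident memory was close to the machine's total RAM when the kernel killed it. A small sanity check:

```python
anon_rss_kb = 388981876       # anon-rss from the dmesg line above
total_mem_mb = 386689         # "total" from the free -m output above

rss_gib = anon_rss_kb / 1024**2      # kB -> GiB
total_gib = total_mem_mb / 1024      # MB -> GiB
print(round(rss_gib, 1))     # 371.0
print(round(total_gib, 1))   # 377.6
```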
Answer 0 (score: 0)
sys.getsizeof(x)

returns the size of the list structure itself, not the size of its items. Your actual list contents are far larger than that number suggests.

import sys

l = [0]
sys.getsizeof(l)
# 72
l[0] = list(range(1000000))
sys.getsizeof(l)
# 72
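To expand on this: to measure the real payload of numpy arrays held in a list, sum their `.nbytes` attributes instead; `sys.getsizeof` only counts the list's internal pointer table. A small demonstration:

```python
import sys
import numpy as np

# Stand-ins for image.img_to_array output: 64x64 RGB float32 arrays
images = [np.zeros((64, 64, 3), dtype=np.float32) for _ in range(100)]

list_overhead = sys.getsizeof(images)       # list structure only, a few hundred bytes
payload = sum(a.nbytes for a in images)     # actual array data

print(payload)   # 100 * 64 * 64 * 3 * 4 = 4915200 bytes
```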
Answer 1 (score: 0)
Exit code 137 means the process was killed by signal 9 (137 = 128 + 9). In most cases this is caused by excessive memory use triggering the kernel's OOM killer, or by problems with multiprocessing.
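As a practical follow-up to both answers, two things cut memory here: loading images already resized to the network's input_shape=(64, 64, 3) (keras's `image.load_img` accepts a `target_size` argument), and running the feature extractor over chunks so the whole dataset is never stacked into one giant array. A minimal sketch of the chunking idea, using a hypothetical stand-in for `pretrained_nn.predict` (the stub is illustrative, not the keras API):

```python
import numpy as np

def predict_stub(batch):
    """Stand-in for pretrained_nn.predict: one feature value per image."""
    return batch.reshape(len(batch), -1).mean(axis=1, keepdims=True)

def extract_in_chunks(arrays, predict, chunk_size=1000):
    """Feed chunk_size images at a time so only one batch is stacked in RAM."""
    features = []
    for start in range(0, len(arrays), chunk_size):
        batch = np.stack(arrays[start:start + chunk_size])  # small batch only
        features.append(predict(batch))
    return np.concatenate(features)

# 2,500 fake 64x64 RGB "images", processed 1,000 at a time
images = [np.ones((64, 64, 3), dtype=np.float32) for _ in range(2500)]
features = extract_in_chunks(images, predict_stub)
print(features.shape)   # (2500, 1)
```

The same loop works with the real model by passing `pretrained_nn.predict` in place of the stub and concatenating the saved chunks at the end.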