NumPy array MemoryError

Date: 2017-12-05 09:03:38

Tags: python arrays numpy ubuntu-14.04

I get a MemoryError when running my model and returning the results as NumPy arrays.

Versions

  • Ubuntu 14.04.3 LTS
  • EC2 g2.2xlarge image: ami-125b2c72, 15 GB RAM, 8 vCPU
  • Python: 2.7.6
  • numpy.version.version: 1.13.3

* The same code runs fine on macOS with 16 GB of RAM.

Error

Using TensorFlow backend.
Dataset: 6000 train images.
Descriptions: train=6000
Vocabulary Size: 7579
Photos: train=6000
Description Length: 34
Traceback (most recent call last):
  File "main.py", line 291, in <module>
    X1train, X2train, ytrain = create_sequences(tokenizer, max_length, train_descriptions, train_features)
  File "main.py", line 225, in create_sequences
    return array(X1), array(X2), array(y)
MemoryError

Code

def create_sequences(tokenizer, max_length, descriptions, photos):
    """Creates sequences of images, input sequences and output words for an image.

    X1,     X2 (text sequence),                         y (word)
    photo   startseq,                                   little
    photo   startseq, little,                           girl
    photo   startseq, little, girl,                     running
    photo   startseq, little, girl, running,            in
    photo   startseq, little, girl, running, in,        field
    photo   startseq, little, girl, running, in, field, endseq

    :param tokenizer:
    :param max_length:
    :param descriptions:
    :param photos:
    :return:
    """
    X1, X2, y = [], [], []
    # Walk through each image identifier.
    for desc_key, desc_list in descriptions.iteritems():
        # Walk through each description for the image.
        for desc in desc_list:
            # Encode the sequence.
            seq = tokenizer.texts_to_sequences([desc])[0]
            # Split one sequence into multiple X,Y pairs.
            for i in range(1, len(seq)):
                # Split into input and output pair.
                in_seq, out_seq = seq[:i], seq[i]
                # Pad input sequence.
                in_seq = pad_sequences([in_seq], maxlen=max_length)[0]
                # Encode output sequence
                out_seq = to_categorical([out_seq], num_classes=vocab_size)[0]
                # Store.
                X1.append(photos[desc_key][0])
                X2.append(in_seq)
                y.append(out_seq)
    return array(X1), array(X2), array(y)
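
For a sense of scale, here is a rough sketch of how large the three returned arrays could get. Only `max_length=34` and `vocab_size=7579` come from the log above. `n_samples`, `feature_dim`, and the dtypes are assumptions for illustration (the actual row count depends on how many words the descriptions contain, and the dtype returned by `to_categorical` depends on the Keras version). `estimate_bytes` is a hypothetical helper, not part of the original script.

```python
import numpy as np


def estimate_bytes(n_samples, max_length, vocab_size, feature_dim,
                   x1_dtype=np.float32, x2_dtype=np.int32, y_dtype=np.float64):
    """Return the approximate size in bytes of X1, X2 and y as dense arrays."""
    x1 = n_samples * feature_dim * np.dtype(x1_dtype).itemsize
    x2 = n_samples * max_length * np.dtype(x2_dtype).itemsize
    # y is one-hot: one row of vocab_size entries per (prefix, next word) pair.
    y = n_samples * vocab_size * np.dtype(y_dtype).itemsize
    return x1, x2, y


# max_length and vocab_size are taken from the log output;
# n_samples and feature_dim are made-up values, NOT measured.
x1, x2, y = estimate_bytes(n_samples=300000, max_length=34,
                           vocab_size=7579, feature_dim=4096)
gib = 2.0 ** 30
print("X1 %.2f GiB, X2 %.2f GiB, y %.2f GiB" % (x1 / gib, x2 / gib, y / gib))
```

Under these assumptions the one-hot `y` dominates, since every row carries `vocab_size` entries. Note also that `array(y)` copies the rows into a new contiguous block while the Python list still holds the originals, so peak usage at the `return` line is roughly twice the final array size.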

Memory info

MemTotal:       15400948 kB
MemFree:        14558676 kB
Buffers:             388 kB
Cached:           604332 kB
SwapCached:         3764 kB
Active:           301464 kB
Inactive:         315596 kB
Active(anon):       1344 kB
Inactive(anon):    11296 kB
Active(file):     300120 kB
Inactive(file):   304300 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       1048572 kB
SwapFree:        1029036 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          8672 kB
Mapped:             2928 kB
Shmem:               228 kB
Slab:             110036 kB
SReclaimable:      90396 kB
SUnreclaim:        19640 kB
KernelStack:        1416 kB
PageTables:         3128 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     8749044 kB
Committed_AS:     164204 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       36252 kB
VmallocChunk:   34359696924 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       71680 kB
DirectMap2M:    15656960 kB

I created a 14 GB swap file; the result was the same after 30 seconds.

root@ip-172-31-29-206:~# free -h
             total       used       free     shared    buffers     cached
Mem:           14G        14G       276M         0B       292K       3.7M
-/+ buffers/cache:        14G       280M
Swap:          14G       4.5G        10G
root@ip-172-31-29-206:~# free -h
             total       used       free     shared    buffers     cached
Mem:           14G        10G       4.5G         0B       292K       6.8M
-/+ buffers/cache:        10G       4.5G
Swap:          14G       4.3G        10G
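
Since adding swap did not help, one possible direction is to avoid materializing all samples at once and produce them in batches instead. The sketch below is an illustration, not the original code. `sequence_batches` is a hypothetical name. It assumes the descriptions are already integer-encoded and that each photo feature is a 1-D vector. It also replaces `pad_sequences` and `to_categorical` with plain NumPy equivalents (left-padding with zeros, keeping the last `max_length` tokens) so that it runs without Keras.

```python
import numpy as np


def sequence_batches(sequences, photos, max_length, vocab_size, batch_size=64):
    """Yield ([X1, X2], y) batches instead of building the full arrays.

    sequences: dict of image id -> list of integer-encoded descriptions (assumed)
    photos:    dict of image id -> 1-D feature vector (assumed)
    """
    X1, X2, y = [], [], []
    for key, seq_list in sequences.items():
        for seq in seq_list:
            for i in range(1, len(seq)):
                # Input prefix, left-padded with zeros to max_length.
                prefix = seq[:i][-max_length:]
                in_seq = np.zeros(max_length, dtype=np.int32)
                in_seq[max_length - len(prefix):] = prefix
                # One-hot encode the next word as float32 to keep rows small.
                out_seq = np.zeros(vocab_size, dtype=np.float32)
                out_seq[seq[i]] = 1.0
                X1.append(photos[key])
                X2.append(in_seq)
                y.append(out_seq)
                if len(y) == batch_size:
                    yield [np.array(X1), np.array(X2)], np.array(y)
                    X1, X2, y = [], [], []
    # Emit the final, possibly smaller, batch.
    if y:
        yield [np.array(X1), np.array(X2)], np.array(y)


# Tiny made-up data, only to show the shapes that come out.
demo_sequences = {"img_a": [[1, 2, 3]], "img_b": [[4, 5]]}
demo_photos = {"img_a": np.ones(3), "img_b": np.zeros(3)}
for (x1_batch, x2_batch), y_batch in sequence_batches(
        demo_sequences, demo_photos, max_length=4, vocab_size=6, batch_size=2):
    print(x1_batch.shape, x2_batch.shape, y_batch.shape)
```

This makes a single pass over the data. Keras's `fit_generator` expects a generator that loops indefinitely, so for training the body would probably need to be wrapped in `while True:`.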

0 answers:

No answers yet.