I have the following code to create thumbnails and save images. However, after roughly 1000 items it raises a "too many open files" error. Where is this coming from, and how would I fix the code?
import random
import shlex
import subprocess
import time
import uuid

import requests
from PIL import Image

def download_file(url, extension='jpg'):
    """ Download a large file. Return path to saved file.
    """
    req = requests.get(url)
    if not req.ok:
        return None
    guid = str(uuid.uuid4())
    tmp_filename = '/tmp/%s.%s' % (guid, extension)
    with open(tmp_filename, 'w') as f:
        for chunk in req.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
                f.flush()
    return tmp_filename
def update_artwork_item(item):
    # Download the file
    tmp_filename = util.download_file(item.artwork_url)

    # Create thumbs
    THUMB_SIZES = [(1000, 120), (1000, 30)]
    guid = str(uuid.uuid4())
    S3_BASE_URL = 'https://s3-us-west-1.amazonaws.com/xxx/'
    try:
        for size in THUMB_SIZES:
            outfile = '%s_%s.jpg' % (guid, size[1])
            img = Image.open(tmp_filename).convert('RGB')
            img.thumbnail(size, Image.ANTIALIAS)
            img.save(outfile, "JPEG")
            s3_cmd = '%s %s premiere-avails --norr --public' % (S3_CMD, outfile)  ## doesn't work half the time
            x = subprocess.check_call(shlex.split(s3_cmd))
            if x: raise
            subprocess.call(['rm', outfile], stdout=FNULL, stderr=subprocess.STDOUT)
    except Exception, e:
        print '&&&&&&&&&&', Exception, e
    else:
        # Save the artwork icons
        item.artwork_120 = S3_BASE_URL + guid + '_120.jpg'
        item.artwork_30 = S3_BASE_URL + guid + '_30.jpg'
        # hack to fix parallel saving
        while True:
            try:
                item.save()
            except Exception, e:
                print '******************', Exception, e
                time.sleep(random.random()*1e-1)
                continue
            else:
                subprocess.call(['rm', tmp_filename], stdout=FNULL, stderr=subprocess.STDOUT)
                break
Answer 0 (score: 0)
It is almost certainly your use of subprocess.call. subprocess.call is asynchronous and returns a pipe object, which you are responsible for closing (see the documentation). So what is happening is that every time you call subprocess.call, a new pipe object is returned, and you eventually run out of file handles.
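A quick way to confirm that descriptors really are accumulating is to count the process's open descriptors between iterations. This is a diagnostic sketch that assumes a Linux host, where /proc/<pid>/fd holds one entry per open descriptor:

import os

def count_open_fds():
    # Each entry in /proc/<pid>/fd is one file descriptor the
    # current process holds open (Linux only).
    return len(os.listdir('/proc/%d/fd' % os.getpid()))

# Call this before and after update_artwork_item; a steadily
# growing count points at the leak.
print(count_open_fds())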
By far the simplest fix is to delete the files from Python by calling os.remove instead of shelling out to the Unix rm command. Your use of check_call is fine, because check_call is synchronous and does not return a file object that you have to close.
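A minimal sketch of that change, assuming the outfile and tmp_filename variables from the question's code; remove_quietly is a hypothetical helper, not part of any library:

import os

def remove_quietly(path):
    # Delete a file in-process instead of spawning an external
    # 'rm'; ignore the error if the file is already gone.
    try:
        os.remove(path)
    except OSError:
        pass

# In update_artwork_item, the two subprocess.call(['rm', ...]) lines
# become:
remove_quietly(outfile)
remove_quietly(tmp_filename)

Besides avoiding a spawned process per file, this also surfaces deletion failures as a catchable OSError rather than a silently ignored non-zero exit status.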