Question

The C++ standard library separates data structures from algorithms, as with std::sort:

template< class RandomAccessIterator >
void sort( RandomAccessIterator first, RandomAccessIterator last );

I would like to keep this separation of algorithm and data structure even when the algorithm needs intermediate scratch space.

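With this goal in mind, I want to implement an image algorithm that requires intermediate scratch space between the input and output images. The necessary scratch space could be allocated inside the function call, but given the size of the allocations and how frequently the function is called on images of identical size, that would severely degrade performance. This makes it much harder to separate the data structure from the algorithm.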

One possible way to achieve this is the following:

// Algorithm function
template<typename InputImageView, typename OutputImageView, typename ScratchView>
void algorithm(
  InputImageView inputImageView, 
  OutputImageView outputImageView, 
  ScratchView scratchView
);

// Algorithm class with scratch space
template<typename DataStructure>
class Algorithm {
public:
  template<typename InputImageView, typename OutputImageView>
  void operator()(
    InputImageView inputImageView,
    OutputImageView outputImageView
  ){
    // The member buffer persists across calls; it is resized to the current input.
    m_scratch.resize(inputImageView.size());
    algorithm(inputImageView, outputImageView, makeView(m_scratch));
  }

private:
  DataStructure m_scratch;
};

Is the above a valid algorithm + scratch space design, or is there a better way?

Side note: I am using the boost::gil library.

Answer 1

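I think in this case I would have the algorithm let you pass (a reference or pointer to) a structure for the scratch space, and give that parameter a default value. That way, when/if the extra time to allocate the structure is not a problem, the user can call the function without passing one, but can pass one if (for example) building a processing pipeline that would benefit from reusing the same space repeatedly.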
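A minimal sketch of that idea, using a 1-D vector as a stand-in for an image. The names blur and blurRepeatedly and the 3-tap box filter are assumptions made up for illustration; they are not part of the question or of boost::gil.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for an image algorithm: a 3-tap box blur that
// stages its input in scratch space, so `out` may be the same object as `in`.
void blur(const std::vector<int>& in, std::vector<int>& out,
          std::vector<int>* scratch = nullptr)
{
    assert(scratch != &in && "scratch must not alias the input");
    std::vector<int> local;                    // used only when no scratch is passed
    std::vector<int>& tmp = scratch ? *scratch : local;
    tmp.assign(in.begin(), in.end());          // reuses tmp's capacity when large enough
    out.resize(tmp.size());
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        const int left  = tmp[i == 0 ? 0 : i - 1];
        const int right = tmp[i + 1 == tmp.size() ? i : i + 1];
        out[i] = (left + tmp[i] + right) / 3;
    }
}

// Pipeline use: one scratch buffer shared by every call.
std::vector<int> blurRepeatedly(std::vector<int> image, int passes)
{
    std::vector<int> scratch;
    for (int p = 0; p < passes; ++p)
        blur(image, image, &scratch);
    return image;
}
```

A one-off caller writes blur(in, out) and pays for the allocation; a pipeline passes the same scratch vector every time and, once its capacity is large enough, allocates nothing further.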

Answer 2

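If you use a function object, it can carry whatever state you need.

Two useful algorithms are transform and accumulate.

transform can use a function object to perform a transformation on each object in a sequence: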

class TransformerWithScratchSpace {
public:
    Target operator()(const Source& source);
};

vector<Source> sources;
vector<Target> targets;
targets.reserve(sources.size());

// back_inserter deduces the container type; it takes no element-type argument.
transform(sources.begin(), sources.end(),
          back_inserter(targets),
          TransformerWithScratchSpace());
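To make the snippet above concrete, here is a small self-contained sketch of a transformer that keeps its scratch space as a member. MedianOfRow and medians are hypothetical names chosen for the example, and taking a median is just a convenient operation that needs a temporary copy.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical transformer: maps each row of samples to its median.
// The scratch vector is a member, so its capacity survives between calls.
class MedianOfRow {
public:
    int operator()(const std::vector<int>& row)
    {
        assert(!row.empty());
        m_scratch.assign(row.begin(), row.end());  // no reallocation once capacity suffices
        auto mid = m_scratch.begin() + m_scratch.size() / 2;
        std::nth_element(m_scratch.begin(), mid, m_scratch.end());
        return *mid;
    }

private:
    std::vector<int> m_scratch;
};

std::vector<int> medians(const std::vector<std::vector<int>>& rows)
{
    std::vector<int> result;
    result.reserve(rows.size());
    // std::transform takes the function object by value; the copy it works
    // on keeps its scratch buffer for the whole pass.
    std::transform(rows.begin(), rows.end(), std::back_inserter(result),
                   MedianOfRow());
    return result;
}
```

Because the function object is copied into the algorithm, the scratch space is reused within one transform call but not across separate calls, unless the state lives behind a pointer or reference held by the function object.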

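accumulate can take an object that accumulates all the elements into itself. The result is the accumulated object. The accumulator itself doesn't have to generate anything.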

class Accumulator {
public:
    // std::accumulate evaluates init = init + element for each element.
    Accumulator& operator+(const Source& source);
};

vector<Source> sources;

Accumulator accumulated = accumulate(sources.begin(), sources.end(),
                                     Accumulator());

Answer 3

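The design presented in your initial question, using resize(), is not efficient, because a resize may require not just an allocation but also copying the existing contents from the old allocation to the new one. It also requires the new space to be allocated and filled before the old space is released, which raises peak memory usage.

It is better to give client code some way to compute how large a structure must be supplied as scratch space, and then to assert on entry to the library routine that the scratch space passed in meets the routine's needs. The computation could be another method of the algorithm class, or an allocator/factory for scratch space objects could take suitably representative arguments (the right size/shape, or the sizes themselves) and return a suitable, reusable scratch space object.

The worker algorithm should not "manipulate" the scratch space in any way to make it fit when it is asked to use it, because such manipulation tends to be expensive.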
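A rough sketch of that contract, assuming a row-major image stored in a std::vector. The Transpose struct and its requiredScratch and run members are hypothetical names for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical algorithm: transposes a row-major width x height image in
// place (the result is height x width). It needs a full copy as scratch.
struct Transpose {
    // Lets client code size the scratch buffer before calling run().
    static std::size_t requiredScratch(std::size_t width, std::size_t height)
    {
        return width * height;
    }

    static void run(std::vector<int>& image, std::size_t width,
                    std::size_t height, std::vector<int>& scratch)
    {
        // Checked on entry; the routine never resizes the caller's scratch.
        assert(image.size() == width * height);
        assert(scratch.size() >= requiredScratch(width, height));
        std::copy(image.begin(), image.end(), scratch.begin());
        for (std::size_t y = 0; y < height; ++y)
            for (std::size_t x = 0; x < width; ++x)
                image[x * height + y] = scratch[y * width + x];
    }
};
```

The caller allocates std::vector<int> scratch(Transpose::requiredScratch(w, h)) once and passes it to every run() call on images of that size.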

Answer 4

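As you mentioned, this question can be seen as going well beyond images in scratch space. I have actually run into it in many different forms (sections of memory, typed arrays, threads, network connections, ...).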

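So what I ended up doing was writing myself a generic "BufferPool". It is a class that manages any form of buffer object, be it a byte array, some other chunk of memory, or (in your case) an allocated image. I borrowed the idea from the ThreadPool.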

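It is a fairly simple class that maintains a pool of Buffer objects. You acquire a buffer when you need one and release it back to the pool when you are done with it. The acquire function checks whether a buffer is available in the pool and, if not, creates a new one. If there is one in the pool, it will reset the Buffer, i.e. clear it so that it behaves identically to a freshly created one.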
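Since the question is about C++, here is a rough C++ sketch of the acquire/release/reset cycle just described. It is not my Java class; BufferPool, ClearVector and idleCount are names made up for the example, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical generic pool. Buffer must be default-constructible; Reset is
// a callable that returns a reused buffer to a like-new state.
template <typename Buffer, typename Reset>
class BufferPool {
public:
    explicit BufferPool(Reset reset) : m_reset(std::move(reset)) {}

    std::unique_ptr<Buffer> acquire()
    {
        if (m_free.empty())
            return std::unique_ptr<Buffer>(new Buffer());  // pool empty: create
        std::unique_ptr<Buffer> buffer = std::move(m_free.back());
        m_free.pop_back();
        m_reset(*buffer);                                   // reused: clear first
        return buffer;
    }

    void release(std::unique_ptr<Buffer> buffer)
    {
        assert(buffer);
        m_free.push_back(std::move(buffer));
    }

    std::size_t idleCount() const { return m_free.size(); }

private:
    Reset m_reset;
    std::vector<std::unique_ptr<Buffer>> m_free;
};

// Example reset policy: empties a vector but keeps its allocated capacity.
struct ClearVector {
    void operator()(std::vector<int>& v) const { v.clear(); }
};
```

With ClearVector as the reset policy, a released vector comes back empty but with its old capacity, so refilling it to the same size does not allocate.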

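I then have a few static instances of this BufferPool, one for each type of Buffer I use: one for byte arrays, one for char arrays, ... (I code in Java, in case you were wondering... :) I then use these static instances in all of the library functions I write. This lets, for example, my encryption functions share byte arrays with my binary flattening functions or with any other code in my application. This way I get maximum reuse out of these objects, and in many cases it has given me a major performance increase.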

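In C++, you may be able to implement this use-everywhere scheme very elegantly by writing a custom allocator, based on this pooling technique, for the data structures you need (thanks to Andrew for pointing this out; see the comments).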
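I have not written such an allocator myself, but as a hedged illustration: C++17 ships pool-backed memory resources in <memory_resource>, which can stand in for a hand-written custom allocator. The sketch below swaps in std::pmr::unsynchronized_pool_resource rather than a custom allocator class; sumOfSquares is a hypothetical function used only to show the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Runs `passes` rounds of work, each needing a temporary of n ints.
// Every temporary draws from the same pool resource, so a block freed at the
// end of one round can be handed out again in the next.
long long sumOfSquares(std::size_t n, int passes)
{
    assert(passes >= 0);
    std::pmr::unsynchronized_pool_resource pool;   // single-threaded pool
    long long total = 0;
    for (int p = 0; p < passes; ++p) {
        std::pmr::vector<int> scratch(n, &pool);   // allocates from the pool
        for (std::size_t i = 0; i < n; ++i)
            scratch[i] = static_cast<int>(i * i);
        for (int v : scratch)
            total += v;
    }                                              // scratch hands its block back to the pool
    return total;
}
```

The algorithm code only sees a std::pmr::vector; which pool backs it is decided by whoever constructs the container, which keeps the data structure and the algorithm separate.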

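One thing I do for byte array buffers is have the acquire function accept a minimumLength parameter that specifies the minimum size of the buffer I need. It then returns only a byte array of at least this length from the pool, or creates a new byte array if the pool is empty or contains only smaller ones. You could use the same approach for your image buffers: let the acquire function accept minWidth and minHeight parameters and then return an image of at least these dimensions from the pool, or create one with exactly these dimensions. You could then have the reset function clear only the (0, 0) to (minWidth, minHeight) portion of the image, if you need it cleared at all.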
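A sketch of the minimum-length lookup in C++, under the assumption that buffers are kept in an ordered multimap keyed by size. ByteBufferPool and its members are hypothetical names, and the sketch is not thread-safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical byte-buffer pool keyed by buffer size.
class ByteBufferPool {
public:
    using Buffer = std::vector<unsigned char>;

    // Returns the smallest pooled buffer holding at least minimumLength
    // bytes, or a new zeroed buffer of exactly minimumLength bytes.
    Buffer acquire(std::size_t minimumLength)
    {
        auto it = m_free.lower_bound(minimumLength);  // first size >= minimumLength
        if (it == m_free.end())
            return Buffer(minimumLength);
        Buffer buffer = std::move(it->second);
        m_free.erase(it);
        assert(buffer.size() >= minimumLength);
        // Clear only the part the caller asked for.
        std::fill(buffer.begin(),
                  buffer.begin() + static_cast<std::ptrdiff_t>(minimumLength),
                  Buffer::value_type(0));
        return buffer;
    }

    void release(Buffer buffer)
    {
        const std::size_t size = buffer.size();
        m_free.emplace(size, std::move(buffer));
    }

private:
    std::multimap<std::size_t, Buffer> m_free;
};
```

For images, the same idea needs a two-dimensional fit test (width and height both large enough), so a linear scan over the idle images is probably simpler than an ordered map.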

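One feature I decided not to worry about in my code, but that you may want to consider depending on how long your application will run and how many different image sizes it will process, is whether to limit the pool size in some way or to release cached images in order to reduce the application's memory usage.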

Just as an example, here is the code I use for my ByteArrayPool:

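import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Stack;

// CompileFlags.DEBUG is a flag defined elsewhere in my code base.
// Note: the static HashMap is not synchronized.
public class ByteArrayPool {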

    private static final Map<Integer, Stack<byte[]>> POOL = new HashMap<Integer, Stack<byte[]>>();

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after clearing
     * it with 0's, if one is available. Otherwise returns a fresh <code>byte[]</code>
     * of the given length.
     * 
     * @param length the length of the <code>byte[]</code>
     * @return a fresh or zero'd buffer object
     */
    public static byte[] acquire(int length) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of length "+length);
            return new byte[length];
        }
        if (stack.empty()) return new byte[length];
        byte[] result = stack.pop();
        Arrays.fill(result, (byte) 0);
        return result;
    }

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after optionally clearing
     * it with 0's, if one is available. Otherwise returns a fresh <code>byte[]</code>
     * of the given length.<br/>
     * <br/>
     * If the initialized state of the needed <code>byte[]</code> is irrelevant, calling this
     * method with <code>zero</code> set to <code>false</code> leads to the best performance.
     * 
     * @param length the length of the <code>byte[]</code>
     * @param zero T - initialize a reused array to 0
     * @return a fresh or optionally zero'd buffer object
     */
    public static byte[] acquire(int length, boolean zero) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of length "+length);
            return new byte[length];
        }
        if (stack.empty()) return new byte[length];
        byte[] result = stack.pop();
        if (zero) Arrays.fill(result, (byte) 0);
        return result;
    }

    /**
     * Returns a <code>byte[]</code> of the given length from the pool after setting all
     * of its entries to the given <code>initializationValue</code>, if one is available.
     * Otherwise returns a fresh <code>byte[]</code> of the given length, which is also
     * initialized to the given <code>initializationValue</code>.<br/>
     * <br/>
     * For performance reasons, do not use this method with <code>initializationValue</code>
     * set to <code>0</code>. Use <code>acquire(<i>length</i>)</code> instead.
     * 
     * @param length the length of the <code>byte[]</code>
     * @param initializationValue the value every entry of the buffer is set to
     * @return a fresh or reused buffer object filled with <code>initializationValue</code>
     */
    public static byte[] acquire(int length, byte initializationValue) {
        Stack<byte[]> stack = POOL.get(length);
        if (stack==null) {
            if (CompileFlags.DEBUG) System.out.println("Creating new byte[] pool of length "+length);
            byte[] result = new byte[length];
            Arrays.fill(result, initializationValue);
            return result;
        }
        if (stack.empty()) {
            byte[] result = new byte[length];
            Arrays.fill(result, initializationValue);
            return result;
        }
        byte[] result = stack.pop();
        Arrays.fill(result, initializationValue);
        return result;
    }

    /**
     * Puts the given <code>byte[]</code> back into the <code>ByteArrayPool</code>
     * for future reuse.
     * 
     * @param buffer the <code>byte[]</code> to return to the pool
     */
    public static byte[] release(byte[] buffer) {
        Stack<byte[]> stack = POOL.get(buffer.length);
        if (stack==null) {
            stack = new Stack<byte[]>();
            POOL.put(buffer.length, stack);
        }
        stack.push(buffer);
        return buffer;
    }
}

Then, in all the rest of my code where I need a byte[], I use something like:

byte[] buffer = ByteArrayPool.acquire(65536, false);
try {
    // Do something requiring a byte[] of length 65536 or longer
} finally {
    ByteArrayPool.release(buffer);
}

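Note how I added 3 different acquire functions that let me specify how "clean" I need the buffer to be. If I am going to overwrite all of it anyway, for example, there is no need to waste time zeroing it first.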

Designing algorithms that need scratch space
