Question


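First of all: this is a high-performance application, so execution time is the most important aspect. I have a back-end system that computes an expensive function: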

template<typename C, typename R>
R backEndFunction (C &code){
  ...
}

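where code is a 1xN vector belonging to a metric space. Note that computing the distance between two codes is expensive because N is large (this problem is known as the Curse of Dimensionality).

I am designing a "similarity cache" (following an LRU policy) that sits between the user, who submits a query code, and the back-end system. Each cached element is therefore a pair of type (C,R) (actually a triple, as we will see later) holding a cached code and its associated result.

When the user supplies a query code, it works as follows: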

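1. Compute the distance between the query and every cached code, and find the minimum distance. Note that the distance is 0 if the codes are identical.
2. If that distance is below a given threshold, there is a cache hit: move the hit element to the front of the cache and return the result associated with it.
3. Otherwise, call the back-end function, insert the new pair at the front of the cache, pop the last one, and return the computed result.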

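Cache design:

First of all, this cache is supposed to contain about 10k elements. This matters because, if we were talking about 100k elements, we would have to solve step 1 with a Nearest Neighbor algorithm (e.g. LSH). With this few elements, a parallel brute-force approach is still feasible.

The parallel part is implemented with OpenMP. Since an OpenMP parallel loop cannot run over non-vector-like structures, our cache cannot be a simple std::queue or similar.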
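
A minimal sketch of the parallel brute-force search, under stated assumptions: codes are std::vector<float>, the squared Euclidean distance stands in for the real distance function, and the names Best, argmin, squaredDistance and nearest are hypothetical. It uses an OpenMP user-defined reduction so that each thread tracks its own minimum and no critical section is needed inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper type: best distance seen so far and where it was found.
struct Best {
    float dist = std::numeric_limits<float>::max();
    std::size_t index = 0;
};

// User-defined reduction: keep the candidate with the smaller distance.
#pragma omp declare reduction(argmin : Best : \
        omp_out = (omp_in.dist < omp_out.dist ? omp_in : omp_out)) \
        initializer(omp_priv = Best())

// Squared Euclidean distance, a stand-in for the real distance function
// (it ranks neighbours the same way as the Euclidean distance).
float squaredDistance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Brute-force nearest cached code. Each thread updates its own private copy
// of `best`; OpenMP merges the copies with `argmin` when the loop ends.
Best nearest(const std::vector<std::vector<float>>& codes,
             const std::vector<float>& query) {
    Best best;
    #pragma omp parallel for reduction(argmin : best)
    for (std::size_t i = 0; i < codes.size(); ++i) {
        const float d = squaredDistance(query, codes[i]);
        if (d < best.dist) {
            best.dist = d;
            best.index = i;
        }
    }
    return best;
}
```

Built without OpenMP enabled (for GCC or Clang, without -fopenmp), the pragmas are ignored and the loop runs serially with the same result. When two cached codes are equally close, which index wins may depend on how the iterations are split across threads.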

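So, here is my solution, which involves two data structures:

std::vector<CacheElem> values, where CacheElem is a triple (code, result, listElem) and listElem is an iterator to an element of the second structure (below).
std::list<size_t> lru, which implements the LRU policy (you don't say?), where lru[i] is the index of the corresponding element in values.

So, if lru[i]=j, then values[j].listElem is an iterator to lru[i] (lru[i] is only notation for the i-th list element, since std::list has no operator[]). When the cache receives a query code: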

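1. Compute in parallel the distance between the query and every element of values.
2. If there is a cache hit, use the iterator listElem as a reference to the corresponding element in lru and move that element to the front of the list.
3. If there is a cache miss, compute the query result (using the back end), push to the front of lru the index held by the last element (the one that must be removed), replace values[lru[size]] with the query code, the result and lru.begin(), and finally pop the last element.

Obviously, if the cache is not full, the whole "pop last element" part is not needed.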
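
The bookkeeping between the two structures can be sketched in isolation. This is only an illustration under stated assumptions: codes and results are plain int, and the names Elem, LruSlots, touch and insert are hypothetical. It relies on the fact that std::list::splice relinks a node in constant time without invalidating iterators, so the iterator stored in each vector slot stays valid across hits and evictions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for a cache slot: int code, int result.
struct Elem {
    int code;
    int result;
    std::list<std::size_t>::iterator listElem;  // node of `lru` holding this slot's index
};

struct LruSlots {
    std::size_t capacity;
    std::vector<Elem> values;    // contiguous, suitable for a parallel scan
    std::list<std::size_t> lru;  // front = most recently used slot index

    explicit LruSlots(std::size_t cap) : capacity(cap) {
        assert(cap > 0);
        values.reserve(cap);
    }

    // Cache hit: move slot `idx` to the front. splice relinks the node in O(1)
    // and keeps every stored iterator valid.
    void touch(std::size_t idx) {
        assert(idx < values.size());
        lru.splice(lru.begin(), lru, values[idx].listElem);
    }

    // Cache miss: reuse the least recently used slot when full, otherwise
    // append a new slot. Returns the slot index that now holds the entry.
    std::size_t insert(int code, int result) {
        if (values.size() == capacity) {
            const std::size_t victim = lru.back();
            // Move the last node to the front; the node is reused, so nothing
            // is allocated and values[victim].listElem still points to it.
            lru.splice(lru.begin(), lru, std::prev(lru.end()));
            values[victim].code = code;
            values[victim].result = result;
            return victim;
        }
        lru.push_front(values.size());
        values.push_back(Elem{code, result, lru.begin()});
        return values.size() - 1;
    }
};
```

Splicing the last node to the front, rather than calling pop_back followed by push_front, is a design choice of this sketch: it avoids one deallocation and one allocation per miss and leaves the stored iterator untouched.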

Code (first version, not tested yet):

#include <cstddef>
#include <functional>
#include <iostream>
#include <iterator>
#include <limits>
#include <list>
#include <memory>
#include <vector>
#include <omp.h>

/**
 * C = code type
 * R = result type
 * D = distance type (e.g. float for euclidean)
 */
template <typename C, typename R, typename D>
class Cache {

    // cc::Distance is defined elsewhere (not shown); it provides compute(a, b)
    typedef std::shared_ptr<cc::Distance<C,D>> DistancePtr;

public:
    Cache(const DistancePtr distance, const std::function<R(C)> &backEnd, const size_t size = 10000, const float treshold = 0);

    R Query(const C &query);

    void PrintCache();

private:
    struct Compare{
        Compare(D val = std::numeric_limits<D>::max(), size_t index = 0) : val(val), index(index) {}
        D val;
        size_t index;
    };

    #pragma omp declare reduction(minimum : struct Compare : omp_out = omp_in.val < omp_out.val ? omp_in : omp_out) initializer (omp_priv=Compare())

    struct CacheElem{
        CacheElem(const C &code, const R &result, std::list<size_t>::iterator listElem) : code(code), result(result), listElem(listElem) {}
        C code;
        R result;
        std::list<size_t>::iterator listElem; //pointing to corresponding element in lru
    };

    DistancePtr distance;
    std::function<R(C)> backEnd;
    std::vector<CacheElem> values;
    std::list<size_t> lru;
    float treshold;
    size_t size;
};

template <typename C, typename R, typename D>
Cache<C,R,D>::Cache(const DistancePtr distance, const std::function<R(C)> &backEnd, const size_t size, const float treshold)
: distance(distance), backEnd(backEnd), treshold(treshold), size(size) {
    values.reserve(size);
    std::cout<<"CACHE SETUP: size="<<size<<" treshold="<<treshold<<std::endl;
}

template <typename C, typename R, typename D>
void Cache<C,R,D>::PrintCache() {
    std::cout<<"LRU: ";
    for(std::list<size_t>::iterator it=lru.begin(); it != lru.end(); ++it)
        std::cout<<*it<<" ";
    std::cout<<std::endl;
    std::cout<<"VALUES: ";
    for(size_t i=0; i<values.size(); i++)
        std::cout<<"("<<values[i].code<<","<<values[i].result<<","<<*(values[i].listElem)<<")";
    std::cout<<std::endl;
}

template <typename C, typename R, typename D>
R Cache<C,R,D>::Query(const C &query){
    PrintCache();
    Compare min;
    R result;
    std::cout<<"query="<<query<<std::endl;

    //Find the cached element with min distance
    #pragma omp parallel for reduction(minimum:min)
    for(size_t i=0; i<values.size(); i++){
        D d = distance->compute(query, values[i].code);
        //the critical section is here only to keep the debug output readable
        #pragma omp critical
        {
            std::cout<<omp_get_thread_num()<<" min="<<min.val<<" distance("<<query<<" "<<values[i].code<<")= "<<d;
            if(d < min.val){
                std::cout<<" NEW MIN!";
                min.val = d;
                min.index = i;
            }
            std::cout<<std::endl;
        }
    }
    std::cout<<"min.val="<<min.val<<std::endl;

    //Cache hit
    if(!lru.empty() && min.val < treshold){
        std::cout<<"cache hit with index="<<min.index<<" result="<<values[min.index].result<<" distance="<<min.val<<std::endl;
        CacheElem hitElem = values[min.index];
        //take the hit element to top of the queue
        if( hitElem.listElem != lru.begin() )
            lru.splice( lru.begin(), lru, hitElem.listElem, std::next( hitElem.listElem ) );
        result = hitElem.result;
    }
    //cache miss
    else {
        result = backEnd(query);
        std::cout<<"cache miss backend="<<result;
        //Cache reached max capacity
        if(lru.size() == size){
            //last item (the one that must be removed) value is its corresponding index in values
            size_t lastIndex = lru.back();
            //remove last element
            lru.pop_back();
            //insert new element in the list
            lru.push_front(lastIndex);
            //insert new element in the value vector, replacing the old one
            values[lastIndex] = CacheElem(query, result, lru.begin());
            std::cout<<" index to replace="<<lastIndex;
        }
        else{
            lru.push_front(values.size()); //since we are going to insert a new element, we don't need to do size()-1
            values.push_back(CacheElem(query, result, lru.begin()));
        }
        std::cout<<std::endl;
    }
    PrintCache();
    std::cout<<"-------------------------------------"<<std::endl;
    return result;
}

My questions:

Knowing that I have to check every cached element for each query, do you think there is a more efficient solution than mine? In particular, I am thinking about the use of std::vector<CacheElem> values; and std::list<size_t> lru;.
If this is the best solution: we know that std::list is not a good choice for high-performance applications, but I cannot find anything better for the given problem. Do you know of any queue-like structure that lets me move a random element to the top of the queue, and on which I can push_front and pop_back (as in this case)?

High-performance similarity cache for metric spaces

0 answers: