Question

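I have a piece of code that I am migrating from Fortran to C++, and I would like to avoid some of the nested for-loop structures I had to create in the original F77 code.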

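The problem is this: I have a vector of objects called nodes. Each node holds, among other important information, a vector with the indices of the other node objects it is connected to (a connection graph). Like this: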

struct Node {
    vector<int> conNode;
};
vector<Node> listOfNodes;
vector<int> nodeListA;    // a subset of nodes of interest stored as their vector indices

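I need to find the nodes that the nodes in nodeListA are connected to, but only if those nodes are also in nodeListA. Right now, my code looks something like this: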

// Loop over the subset of node indices
for (int i=0; i<nodeListA.size(); i++) {
    // Loop over the nodes connected to the node i
    for (int j=0; j<listOfNodes[nodeListA[i]].conNode.size(); j++) {
        // Loop over the subset of node indices again
        for (int k=0; k<nodeListA.size(); k++) {
            // and determine if any of node i's connections are in the subset list
            if (nodeListA[k] == listOfNodes[nodeListA[i]].conNode[j]) {
               // do stuff here
            }
        }
    }
}

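There has to be a much simpler way to do this; it feels like I am making it far too complicated. How can I simplify this code using the standard algorithm library?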

Answer 1

If your variable is meant to express a set of values, use std::set instead of std::vector. Then you have

typedef std::set<int> SetOfIndices;
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    for (int j = 0; j < node.conNode.size(); ++j)
    {
        if (setOfIndices.find(node.conNode[j]) != setOfIndices.end())
        {
            // do stuff here
        }
    }
}
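
As a minimal sketch of the loop above, assuming the Node layout from the question, the following wraps it in a function that builds the set from the existing nodeListA vector and collects every (node, neighbour) pair whose two ends are both in the subset. The name connectionsWithinSubset and the pair-based return type are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the connected nodes
};

// Hypothetical helper: returns every pair (i, j) such that i and j are both
// in nodeListA and j appears in listOfNodes[i].conNode.
std::vector<std::pair<int, int> > connectionsWithinSubset(
    const std::vector<Node>& listOfNodes,
    const std::vector<int>& nodeListA)
{
    // Build the set once; every later lookup costs O(log n).
    const std::set<int> subset(nodeListA.begin(), nodeListA.end());

    std::vector<std::pair<int, int> > result;
    for (std::set<int>::const_iterator it = subset.begin(); it != subset.end(); ++it) {
        const Node& node = listOfNodes[*it];
        for (std::size_t j = 0; j < node.conNode.size(); ++j) {
            if (subset.find(node.conNode[j]) != subset.end()) {
                result.push_back(std::make_pair(*it, node.conNode[j]));
            }
        }
    }
    return result;
}
```

Because std::set iterates in sorted order, the pairs come out ordered by node index rather than in the original order of nodeListA.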

Edit: As Jerry Coffin suggested, std::set_intersection can be used inside the outer loop:

typedef std::set<int> SetOfIndices;
struct Node {
    SetOfIndices conNode; // must be sorted for std::set_intersection
};
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    std::vector<int> interestingNodes;
    std::set_intersection(setOfIndices.begin(), setOfIndices.end(),
                          node.conNode.begin(), node.conNode.end(),
                          std::back_inserter(interestingNodes));
    for (int j = 0; j < interestingNodes.size(); ++j)
    {
        // do stuff here
    }
}
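
To show the intersection step on its own, here is a small sketch that assumes both the subset and a node's connections are stored as std::set<int>. The helper name neighboursInSubset is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: returns the members of conNode that are also in subset.
// Both inputs are std::set, so they are already sorted in the order that
// std::set_intersection requires.
std::vector<int> neighboursInSubset(const std::set<int>& subset,
                                    const std::set<int>& conNode)
{
    std::vector<int> interestingNodes;
    std::set_intersection(subset.begin(), subset.end(),
                          conNode.begin(), conNode.end(),
                          std::back_inserter(interestingNodes));
    return interestingNodes;  // sorted in ascending order
}
```

The call needs <algorithm> for std::set_intersection and <iterator> for std::back_inserter.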

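Another edit
About efficiency: it depends on which operation dominates. The number of times the part marked "do stuff here" executes does not change. What differs is the time spent traversing your collections:

Your original code - nodeListA.size()^2 * [average conNode size]

My first solution - nodeListA.size() * log(nodeListA.size()) * [average conNode size]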

After Jerry Coffin's suggestion - nodeListA.size() * (nodeListA.size() + [average conNode size]), since each std::set_intersection call walks both sorted ranges

So it seems that using set_intersection does not help in this case.
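
To make the first two estimates concrete, the sketch below counts the work each variant does on a given graph. It is only an illustration: the function names are hypothetical, and nodeListA is assumed to contain no duplicate indices.

```cpp
#include <cstddef>
#include <set>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the connected nodes
};

// Counts the equality tests performed by the question's triple loop.
std::size_t countNestedComparisons(const std::vector<Node>& listOfNodes,
                                   const std::vector<int>& nodeListA)
{
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < nodeListA.size(); ++i) {
        const std::vector<int>& con = listOfNodes[nodeListA[i]].conNode;
        for (std::size_t j = 0; j < con.size(); ++j) {
            for (std::size_t k = 0; k < nodeListA.size(); ++k) {
                ++comparisons;  // one nodeListA[k] == con[j] test
            }
        }
    }
    return comparisons;
}

// Counts the std::set lookups the set-based version would perform:
// one per connection of each subset node, each costing O(log n).
std::size_t countSetLookups(const std::vector<Node>& listOfNodes,
                            const std::vector<int>& nodeListA)
{
    const std::set<int> subset(nodeListA.begin(), nodeListA.end());
    std::size_t lookups = 0;
    for (std::set<int>::const_iterator it = subset.begin(); it != subset.end(); ++it) {
        lookups += listOfNodes[*it].conNode.size();
    }
    return lookups;
}
```

The first count grows with the square of the subset size, while the second grows linearly, with each lookup adding a logarithmic factor.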

Answer 2

I would suggest using a dictionary for nodeListA: either an O(log n) one such as std::set or, better, a hash-based one such as std::unordered_set from C++11. Here is a C++11 code example.

#include <unordered_set>
#include <vector>

struct Node {
  std::vector<int> conNode;
};

int main()
{
  std::vector<Node>       listOfNodes;
  std::unordered_set<int> nodeListA;

  for (int node_id : nodeListA)
    for (int connected_id : listOfNodes[node_id].conNode)
      if (nodeListA.find(connected_id) != end(nodeListA))
        /* Do stuff here.. */
          ;

  return 0;
}

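The advantage of std::unordered_set is that lookups (that is, searching for a given node id) are very fast. The implementation shipped with your standard library may not be especially fast, though. Google's sparse hash and dense hash implementations are alternatives that offer the same interface and are known to work very well for most purposes: http://code.google.com/p/sparsehash/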
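
If nodeListA already exists as a vector, one option is to keep it and build the hash set only for membership tests. The following is a sketch under that assumption; the function name countConnectionsWithinSubset is hypothetical, and nodeListA is assumed to hold no duplicate indices.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the connected nodes
};

// Hypothetical helper: counts the connections whose two ends are both in nodeListA.
std::size_t countConnectionsWithinSubset(const std::vector<Node>& listOfNodes,
                                         const std::vector<int>& nodeListA)
{
    // Built once; each lookup afterwards is O(1) on average.
    const std::unordered_set<int> inSubset(nodeListA.begin(), nodeListA.end());

    std::size_t count = 0;
    for (int node_id : nodeListA)                              // original order kept
        for (int connected_id : listOfNodes[node_id].conNode)
            if (inSubset.find(connected_id) != inSubset.end())
                ++count;                                       // "do stuff here"
    return count;
}
```

Iterating over the vector rather than the unordered_set keeps the nodes in their original order, which the hash set does not guarantee.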

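Depending on what you want to do with the resulting nodes, the inner loop of the code above can be replaced with an STL algorithm. For example, if you want to put all the nodes identified by the algorithm into a vector, you could write it as follows (use this in place of the two nested loops):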

std::vector<int> results;
for (int node_id : nodeListA)
  std::copy_if(begin(listOfNodes[node_id].conNode),
               end(listOfNodes[node_id].conNode),
               back_inserter(results),
               [&nodeListA](int id){return nodeListA.find(id) != end(nodeListA);});

Again, this is C++11 syntax; it passes a lambda as the function argument.
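
For completeness, here is a self-contained sketch of the copy_if variant with the headers it needs (<algorithm> for std::copy_if and <iterator> for std::back_inserter). The function name collectConnectedInSubset is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <unordered_set>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the connected nodes
};

// Hypothetical helper: gathers every connected id that is itself in nodeListA.
// An id appears once for each subset node that is connected to it.
std::vector<int> collectConnectedInSubset(const std::vector<Node>& listOfNodes,
                                          const std::unordered_set<int>& nodeListA)
{
    std::vector<int> results;
    for (int node_id : nodeListA) {
        const std::vector<int>& connections = listOfNodes[node_id].conNode;
        std::copy_if(connections.begin(), connections.end(),
                     std::back_inserter(results),
                     [&nodeListA](int id) { return nodeListA.count(id) != 0; });
    }
    return results;
}
```

The order of the results follows the iteration order of the unordered_set, which is unspecified, so sort the vector if a stable order matters.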

Finding occurrences of vector entries in another vector without nested for loops
