Question

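I have the following code, which goes through a 188k x 188k matrix of data and tries to create a network graph from it. The problem is that my algorithm is very slow (as expected, since it is quadratic). Is there a better way to do this that I am not seeing? I am already considering OpenMP to parallelize it, but it would be great if I did not have to.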

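Here are some facts about my matrix: it is symmetric, a little over 188,000 by 188,000, and each value in the matrix is an edge weight, so, for example, element aij is the weight of the edge between i and j. Here is my code: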

Graph creation:

typedef boost::adjacency_list
<
boost::vecS,
boost::vecS,
boost::undirectedS,
boost::property<boost::vertex_name_t, std::string>,
boost::property<boost::edge_weight_t, float>,
boost::property<boost::graph_name_t, std::string>
> UGraph;

typedef UGraph::vertex_descriptor vertex_t;
typedef UGraph::edge_descriptor edge_t;

Now the function that creates the network:

    vertex_t u;
    vertex_t v;
    edge_t e;

    bool found=0;
    int idx =0;

    float cos_similarity;
    for(int p =1;p<=adj_matrix.cols();p++){

            //using a previously created vector to track already created nodes
            if(std::find(created_nodes.begin(), created_nodes.end(), nodes[idx]) == created_nodes.end()){
                    u = add_vertex(nodes[idx], ug);
                    created_nodes.push_back(nodes[idx]);
            }else{
                    u = vertex(p,ug);
            }
            int jdx = 0;

            for(int q =1;q<=adj_matrix.cols();q++){

                    if(p!=q){//NO LOOPS IN THIS GRAPH
                            //using a previously created vector to track already created nodes
                            if(std::find(created_nodes.begin(), created_nodes.end(), nodes[jdx]) == created_nodes.end()){
                                    v = add_vertex(nodes[jdx], ug);
                                    created_nodes.push_back(nodes[jdx]);
                            }else{
                                    v = vertex(q,ug);
                            }


                            tie(e, found) = edge(u, v, ug);

                            if(!found){//check that edge does not already exist
                                    cos_similarity = adj_matrix(p,q);
                                    fil<<cos_similarity<<endl;
                                    fil.flush();
                                    if(cos_similarity >= 0.2609){ //only add edge if value of cell is greater than this threshold
                                            boost::add_edge(u,v,cos_similarity, ug);
                                            edge_out<<p<<" "<<q<<" "<<cos_similarity<<endl; //creating an edge-weight list for later use
                                    }
                            }
                    }
                    jdx++;
            }
            idx++;
    }

Answer 1

A quick tip:

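I think your algorithm is cubic rather than quadratic, because of the vector and the std::find(vector.begin(), vector.end()) call used to avoid duplicates inside the inner loop: each call is itself a linear scan over the nodes created so far.
To avoid duplicates and keep the algorithm quadratic, you only need to traverse the upper triangle of the matrix, since it is symmetric, which means the graph is undirected.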
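
The upper-triangle traversal suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the matrix is represented as a dense std::vector of rows rather than the question's adj_matrix type, vertices are identified by their row index (so no std::find lookup is needed), and the WeightedEdge struct and the upper_triangle_edges function are hypothetical names standing in for the call to boost::add_edge(u, v, cos_similarity, ug).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical edge record used for illustration only; in the question's
// code each accepted pair would instead be passed to boost::add_edge.
struct WeightedEdge {
    std::size_t u;   // row index of the first endpoint
    std::size_t v;   // column index of the second endpoint (always > u)
    float weight;    // matrix value for this pair
};

// Scan only the cells strictly above the diagonal of a symmetric matrix.
// Assumption: `matrix` is square and stored as a vector of rows.
std::vector<WeightedEdge> upper_triangle_edges(
        const std::vector<std::vector<float>>& matrix, float threshold) {
    std::vector<WeightedEdge> edges;
    const std::size_t n = matrix.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Starting at i + 1 skips the diagonal (no self-loops) and the
        // mirrored lower triangle, so each pair is visited exactly once.
        for (std::size_t j = i + 1; j < n; ++j) {
            const float w = matrix[i][j];
            if (w >= threshold) {
                edges.push_back({i, j, w});
            }
        }
    }
    return edges;
}
```

Because every unordered pair is visited once, the edge(u, v, ug) existence check from the question should no longer be necessary in this arrangement. The scan is still quadratic: for n = 188,000 it reads n(n-1)/2 cells, roughly 1.8e10.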

Optimizing network graph creation
