NetworkX reports fewer nodes than my graph actually has

Asked: 2019-07-03 14:54:18

Tags: python graph networkx

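I have a strange problem with NetworkX.
Given the DS-1 dataset, my task is to create one graph for each year reported in the dataset. So far, no problem at all. For 2013, this is what I get: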

[Image: plot of the 2013 graph]

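It is, we could say... a bit crowded.
Now, here is my strange problem. My assignment states that the top k nodes of each graph should be selected according to some logic. Since some of my graphs have fewer than 5 nodes (by requirement, k will be one of the values in [0,5,10,50,200]), I figured that graphs with len(G) < k should be skipped in the iteration: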

for x in graphsPerYear:
    G = graphsPerYear[x]
    if len(G) < k:
        print(G.nodes)
        print(G.number_of_nodes())
        print("Skipping year " + str(x) + " since it has " + str(len(G)) + " nodes which is less than the prompted k")
        continue

This outputs the following:

['linear matrix inequality', 'social inequality']
2
Skipping year 2013 since it has 2 nodes which is less than the prompted k

But the image tells us exactly the opposite. What am I missing?

Edit

Adding the code that creates the graphs:

def createGraphPerYear(dataset, year):
    insertedWords = set()
    listaAnni = set(dataset['anno'].values)
    grafi = dict()
    for anno in listaAnni:
        datasetTemporale = dataset[dataset['anno'] == anno]
        G=nx.DiGraph()
        for index, row in datasetTemporale.iterrows():
            # Reminder: each row consists of anno, keyword1, keyword2, and a dictionary mapping each keyword user to a number of times
            # PHASE 1: add the two possible nodes
            if row.keyword1 not in G:
                G.add_node(row.keyword1)
            if row.keyword2 not in G:
                G.add_node(row.keyword2)
            if not __areNodesConnected(G,row.keyword1, row.keyword2):
                G.add_edge(row.keyword1,row.keyword2)
        grafi[anno] = G
    return grafi

def __areNodesConnected(G, nodeToCheckOne,nodeToCheckTwo):
    return nodeToCheckOne in G.neighbors(nodeToCheckTwo)

1 answer:

Answer 0 (score: 0)

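When you add a node to NetworkX, the node is hashed to determine its uniqueness. Any two nodes that hash to the same value and compare equal are treated as the same node.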

By definition, a Graph is a collection of nodes (vertices) 
along with identified pairs of nodes (called edges, links, etc). 
In NetworkX, nodes can be any hashable object e.g., 
a text string, an image, an XML object, another Graph, 
a customized node object, etc.
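
To make the uniqueness rule concrete, here is a minimal sketch. The keyword strings are taken from the output in the question; the variant with a trailing space is a made-up example of a string that looks the same but is a different node.

```python
import networkx as nx

G = nx.DiGraph()
G.add_node("social inequality")
G.add_node("social inequality")    # identical string: no new node is created
G.add_node("social inequality ")   # trailing space (made-up example): a distinct node
# add_edge creates any endpoint that is not in the graph yet
G.add_edge("social inequality", "linear matrix inequality")

print(G.number_of_nodes())
print(sorted(G.nodes))
```

Because `add_node` and `add_edge` silently ignore nodes and edges that already exist, the `not in G` checks in the question's code should not change the result.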

Double-check that these items are not actually the same string, and that distinct nodes do not hash to the same value.
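
One way to run that check is to count the distinct keywords per year directly in the dataset and compare the result with the node count of the corresponding graph. The sketch below is only an illustration: the sample rows are invented, `expected_node_count` and `build_graph` are hypothetical helpers, and the column names `anno`, `keyword1`, and `keyword2` are assumed from the question's code.

```python
import networkx as nx
import pandas as pd

# Invented sample data; only the column names come from the question.
dataset = pd.DataFrame({
    "anno": [2013, 2013, 2013, 2014],
    "keyword1": ["a", "b", "a", "x"],
    "keyword2": ["b", "c", "c", "y"],
})

def expected_node_count(dataset, anno):
    """Count the distinct keywords that appear in either column for one year."""
    rows = dataset[dataset["anno"] == anno]
    return len(set(rows["keyword1"]) | set(rows["keyword2"]))

def build_graph(dataset, anno):
    """Build a directed graph with one edge per (keyword1, keyword2) row."""
    rows = dataset[dataset["anno"] == anno]
    G = nx.DiGraph()
    G.add_edges_from(zip(rows["keyword1"], rows["keyword2"]))
    return G

for anno in sorted(set(dataset["anno"])):
    G = build_graph(dataset, anno)
    # The two counts should agree; a mismatch points at the graph construction.
    print(anno, G.number_of_nodes(), expected_node_count(dataset, anno))
```

If the two numbers agree for 2013 but the plot still shows many more nodes, the graph being plotted is probably not the same object as the one being counted.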