How do I write code in Python to evaluate link prediction precision?

Time: 2019-01-11 13:11:36

Tags: python precision networkx prediction

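I am working on a link prediction problem using the Adamic-Adar index (adamic_adar_index). The dataset is a grid network (an edge list with 1000 links). I randomly selected 80% (800) of the edges from the observed dataset. I need to pick the top 200 predicted links from preds, shown below, and compute the precision. I don't know what to do next. What should I do? Any help is appreciated!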

import numpy as np
import networkx as nx


G = nx.read_edgelist('Grid.txt', create_using=nx.Graph(), nodetype=int)
preds = nx.adamic_adar_index(G)
for u, v, p in preds:
    # p is the Adamic-Adar score of the candidate link (u, v)
    print('(%d, %d) -> %.8f' % (u, v, p))
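The 80/20 split described in the question could be sketched roughly as follows. This is only an illustration under assumptions: `split_edges`, its `train_frac` and `seed` parameters, and the use of `nx.karate_club_graph()` as a stand-in for `Grid.txt` are not from the original post.

```python
import random
import networkx as nx

def split_edges(G, train_frac=0.8, seed=0):
    """Hypothetical helper: return (training graph, set of held-out edges)."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    edges = list(G.edges())
    rng.shuffle(edges)
    n_train = int(len(edges) * train_frac)
    G_train = nx.Graph()
    G_train.add_nodes_from(G.nodes())  # keep every node, even if it loses all its edges
    G_train.add_edges_from(edges[:n_train])
    # frozenset makes (u, v) and (v, u) compare equal in an undirected graph
    probe = {frozenset(e) for e in edges[n_train:]}
    return G_train, probe

# Stand-in graph (assumption); replace with the graph read from Grid.txt
G = nx.karate_club_graph()
G_train, probe = split_edges(G)
print(G_train.number_of_edges(), len(probe))
```

Scores would then be computed on `G_train` only, so that the held-out edges in `probe` can serve as ground truth.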

2 Answers:

Answer 0 (score: 1)

I am assuming that u and v are vertices of the graph and p is the precision.

import numpy as np
import networkx as nx
import random

G = nx.read_edgelist('Grid.txt', create_using=nx.Graph(), nodetype=int)
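# adamic_adar_index returns an iterator, so materialize it before calling len()
preds = list(nx.adamic_adar_index(G))
# randomly keep 80% of the scored node pairs
preds = random.sample(preds, int(len(preds)*0.8))
# take the 200 pairs with the highest scores
preds = sorted(preds, key=lambda x: x[2], reverse=True)[:200]
# average of the third tuple element over the selected pairs
ratio = sum(t[2] for t in preds)/len(preds)
print(ratio)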
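The assumption about p can be checked on a toy graph. In NetworkX, the third element yielded by `adamic_adar_index` is the Adamic-Adar similarity score, i.e. the sum of 1/log(degree) over the common neighbors of u and v, so it can exceed 1 and is not a precision. The 4-cycle below is a made-up example, not the asker's data.

```python
import math
import networkx as nx

# Toy 4-cycle 0-1-3-2-0 (assumption, for illustration only)
G = nx.Graph([(0, 1), (0, 2), (1, 3), (2, 3)])

# Score only the non-adjacent pair (0, 3); their common neighbors are 1 and 2
(u, v, score), = nx.adamic_adar_index(G, [(0, 3)])

# Recompute the score by hand from the definition
manual = sum(1 / math.log(G.degree(w)) for w in nx.common_neighbors(G, 0, 3))

print(u, v, score, manual)
```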

Answer 1 (score: 0)

import numpy as np
import networkx as nx
import random

G = nx.read_edgelist('Grid.txt', create_using=nx.Graph(), nodetype=int)
# list() is needed here: adamic_adar_index returns an iterator, which has no len()
preds = list(nx.adamic_adar_index(G))
preds = random.sample(preds, int(len(preds)*0.8))
preds = sorted(preds, key=lambda x: x[2], reverse=True)[:200]
ratio = sum([t[2] for t in preds])/len(preds)
print(ratio)
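Note that both snippets above sample 80% of the predictions and average their scores, which gives a mean Adamic-Adar score and not a precision. A more conventional evaluation, sketched here under assumptions, scores the non-edges of a training graph and counts how many of the top k predictions appear among the held-out edges. The function name `precision_at_k`, the toy training graph, and the toy held-out set are all hypothetical.

```python
import heapq
import networkx as nx

def precision_at_k(G_train, probe, k):
    """Fraction of the k highest-scoring predicted links found in the held-out set."""
    preds = nx.adamic_adar_index(G_train)           # scores every non-edge of G_train
    top = heapq.nlargest(k, preds, key=lambda t: t[2])
    hits = sum(1 for u, v, _ in top if frozenset((u, v)) in probe)
    return hits / k                                 # assumes at least k candidate pairs

# Toy training graph and held-out edges (assumptions, for illustration only)
G_train = nx.Graph([(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)])
probe = {frozenset((0, 3)), frozenset((2, 4))}

print(precision_at_k(G_train, probe, 2))
```

For the question's setup, `G_train` would hold the 800 sampled edges, `probe` the remaining 200, and k would be 200.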