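I am trying to compute and plot the out-degree and in-degree distributions of the Wikipedia vote network (contained in the SNAP collection of network datasets). It is a directed graph, represented as an edge list.
To read and store the graph data: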
%Read the data file.
G = importdata('Wiki-Vote.txt', ' ', 4);
%G is a structure that contains:
% - data: a <num_of_edges,2> matrix filled with node (wiki user) ids
% - textdata: a cell matrix that contains the header strings (first 4
% lines).
% - colheaders: a cell matrix that contains the last descriptive string
% (fourth line).
%All the useful information is in the data matrix.
%Split the directed edge list into 'from' and 'to' node lists.
Nfrom = G.data(:,1); %Will be used to compute out-degree
Nto = G.data(:,2); %Will be used to compute in-degree
Motivated by this question, I compute the out-degree as follows:
%Remove duplicate entries from Nfrom and Nto lists.
Nfrom = unique(Nfrom); %Will be used to compute the outdegree distribution.
Nto = unique(Nto); %Will be used to compute the indegree distribution.
%Out-degree: count the number of occurrences in G.data(:,1) of each
%element (node-user id) of Nfrom.
outdegNsG = histc(G.data(:,1), Nfrom);
odG = hist(outdegNsG, 1:size(Nfrom));
figure;
plot(odG)
title('linear-linear scale plot: outdegree distribution');
figure;
loglog(odG)
title('log-log scale plot: outdegree distribution');
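The same applies to computing the in-degree. However, the linear plot I get is far from satisfactory, which makes me wonder whether my method is incorrect.
Linear scale:
Log-log scale:
Zooming into the distribution plot on the linear scale clearly shows that it is close to a power law:
My question is whether my method of computing the degree distribution is correct, as I have nothing to check it against. Specifically, I would like to know whether using fewer bins in histc would give a clearer plot without losing any valuable information.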
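Answer 0: (score: 0)
OK... my previous method would be correct if I wanted to plot the out- (or in-) degree of each node, rather than the degree distribution...
For the degree distribution: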
Nfrom = G.data(:,1); %Will be used to compute out-degree
Nfrom = unique(Nfrom); %Will be used to compute the outdegree distribution.
outdegNsG = histc(G.data(:,1), Nfrom);
outdd = histc(outdegNsG, unique(outdegNsG));
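To check the two counting steps on something small, here is a minimal sketch on a made-up edge list. The variable `from` and its values are hypothetical and not taken from Wiki-Vote; the names `nodes`, `outdeg` and `degVals` are only for illustration.

```matlab
% Hypothetical source-node column of a tiny directed edge list.
from = [1; 1; 2; 3; 3; 4; 5; 5; 5];

% Step 1: out-degree of each distinct source node.
nodes = unique(from);              % distinct node ids: [1;2;3;4;5]
outdeg = histc(from, nodes);       % number of edges leaving each node

% Step 2: number of nodes that share each out-degree value.
degVals = unique(outdeg);          % distinct out-degree values
outdd = histc(outdeg, degVals);    % node count per out-degree value

disp([degVals, outdd]);            % columns: out-degree, node count
```

In this toy case nodes 2 and 4 have out-degree 1, nodes 1 and 3 have out-degree 2, and node 5 has out-degree 3, so the distribution is two nodes, two nodes and one node respectively.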
So, I should plot:
loglog(1:length(outdd),outdd);
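One caveat with plotting against `1:length(outdd)`: the k-th entry of `outdd` belongs to the k-th distinct degree value, which equals k only if every degree from 1 up to the maximum actually occurs. A minimal sketch on made-up degrees with gaps, putting the actual degree values on the x-axis (the names `degVals` and `pk` are mine, and normalising to fractions is an optional extra, not something the data requires):

```matlab
% Hypothetical out-degrees per node; no node has degree 3 or 4.
outdegNsG = [1; 1; 1; 2; 2; 5];

degVals = unique(outdegNsG);          % distinct degrees: [1;2;5]
outdd = histc(outdegNsG, degVals);    % node count per distinct degree
pk = outdd / sum(outdd);              % fraction of nodes per degree

% Plot against the degree values themselves, not their index.
loglog(degVals, pk, 'o');
xlabel('out-degree k');
ylabel('fraction of nodes with out-degree k');
```

With the index on the x-axis, the point for degree 5 would be drawn at x = 3, which distorts the slope on a log-log plot. Nodes that never appear in the first column have out-degree 0 and are left out of this count, which is harmless on a log-log axis since zero cannot be shown there anyway.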
The same goes for the in-degree...