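How do I assign weights or frequency scores to a graph's edges using Graph::Easy?
I have a list of bigrams and their frequencies. Using Graph::Easy, I can easily create a (mathematical) graph containing only the bigrams, and the output is exactly what I want. However, when I try to call "set_attribute", I get the error "'frequency' is not a valid attribute name". What am I doing wrong? With Graph::Easy, how do I make frequency a valid attribute?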
#!/usr/bin/perl
# graph.pl - given a list of bigrams and their frequencies, output graphml
# require
use Graph::Easy;
use strict;
# initialize
my $graph = Graph::Easy->new;
# process the data
while ( <DATA> ) {
    # parse; chomp removes only a trailing newline, whereas chop would also
    # eat the last digit of a final line that lacks one
    chomp;
    my ( $nodes, $frequency ) = split( "\t", $_ );
    my ( $source, $target ) = split( ' ', $nodes );
    # update the graph
    my $edge = $graph->add_edge( $source, $target );
    # the error happens here
    $edge->set_attribute( 'frequency', $frequency );
}
# output & done
print $graph->as_graphml();
exit;
# a set of bigrams and their frequencies
__DATA__
cds classroom 4
maximum registration 4
may want 3
anomalies within 2
resulting analysis 2
participants may 2
corpus without 2
journal articles 2
quickly learn 2
active reading 2
text mining 2
literally count 2
find patterns 2
14 million 2
digital humanities 2
humanities research 2
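Answer 0 (score: 3)
I played with this module a bit, and it seems that it does not accept arbitrary "attributes", only a predefined set. Evidently 'frequency' is not one of them.
I picked a sample from the documentation and replaced your
$edge->set_attribute( 'frequency', $frequency );
with
$edge->set_attribute( 'label', $frequency );
because label is mentioned often in the examples.
print $graph->as_ascii();
then prints: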
+--------------+  2   +--------------+
|      14      | ---> |   million    |
+--------------+      +--------------+

+--------------+  2   +--------------+  2   +----------+
|   digital    | ---> |  humanities  | ---> | research |
+--------------+      +--------------+      +----------+

+--------------+  2   +--------------+  3   +----------+
| participants | ---> |     may      | ---> |   want   |
+--------------+      +--------------+      +----------+
...
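Is that what you are after?
Eventually I found the complete documentation for Graph::Easy. The Attributes section lists the allowed attributes. I am fairly sure there is a way to have custom attributes, because the module has a method get_custom_attributes.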
Answer 1 (score: 0)
For me, the "best" answer was not to use Graph::Easy at all, but to use the Python module NetworkX instead:
#!/usr/bin/python
# graph.py - given a CSV file of a specific shape, output graphml
# configure
DATA = './data.csv'
GRAPH = './data.graphml'
LABEL = 'frequency'
# require
import networkx as nx
import csv
# initialize
g = nx.DiGraph()
# read the data
with open( DATA, 'r' ) as f :
    # initialize
    r = csv.reader( f, delimiter='\t')
    # process each record
    for source, target, frequency in r :
        # sanity check
        if len( source ) == 0 or len( target ) == 0 : continue
        # update the graph
        g.add_edge( source, target )
        g[ source ][ target ][ LABEL ] = frequency
# output & done
nx.write_graphml( g, GRAPH )
exit()
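To see what a per-edge frequency looks like in GraphML itself, here is a minimal sketch that uses only the Python standard library. It is not how NetworkX or Graph::Easy produce their output. The helper name bigrams_to_graphml, the key id "d0", and the choice of "int" as the attribute type are all assumptions made for illustration. The idea is that GraphML declares the attribute once in a key element and then attaches a data child to each edge.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def bigrams_to_graphml(records, attr_name="frequency"):
    """Build a GraphML string from (source, target, frequency) records.

    Hypothetical helper for illustration; not part of NetworkX or Graph::Easy.
    """
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # Declare the edge attribute once; each edge refers to it by key id.
    ET.SubElement(root, "key", {
        "id": "d0",
        "for": "edge",
        "attr.name": attr_name,
        "attr.type": "int",
    })
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})

    nodes = []  # node names in first-seen order
    edges = []  # (source, target, frequency) tuples that passed the check
    for source, target, frequency in records:
        # skip incomplete records, as the script above does
        if not source or not target:
            continue
        for name in (source, target):
            if name not in nodes:
                nodes.append(name)
        edges.append((source, target, int(frequency)))

    for name in nodes:
        ET.SubElement(graph, "node", {"id": name})
    for source, target, frequency in edges:
        edge = ET.SubElement(graph, "edge", {"source": source, "target": target})
        # the frequency travels as a <data> child keyed to the declaration
        ET.SubElement(edge, "data", {"key": "d0"}).text = str(frequency)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [("text", "mining", 2), ("may", "want", 3), ("participants", "may", 2)]
    print(bigrams_to_graphml(sample))
```

Converting the frequency with int() keeps the declared type and the stored values consistent. The NetworkX script above stores whatever string the CSV reader returns, so a tool that imports its output may treat the frequency as text rather than a number.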