How to assign frequency scores as weights to the edges of a graph using Graph::Easy

Time: 2017-12-05 19:24:47

Tags: python perl graph

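How do I assign weights or frequency scores to the edges of a graph using Graph::Easy?

I have a list of bigrams and their frequencies. Using Graph::Easy, I can easily create a (mathematical) graph containing only the bigrams, and the output is exactly what I want. But when I try to call set_attribute, I get the error "'frequency' is not a valid attribute name". What am I doing wrong? With Graph::Easy, how do I make frequency a valid attribute?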

#!/usr/bin/perl

# graph.pl - given a list of bigrams and their frequencies, output graphml

# require
use Graph::Easy;
use strict;

# initialize
my $graph = Graph::Easy->new;

# process the data
while ( <DATA> ) {

    # parse
    chomp;
    my ( $nodes, $frequency ) = split( "\t", $_ );
    my ( $source, $target )   = split( ' ', $nodes );

    # update the graph
    my $edge = $graph->add_edge( $source, $target );

    # the error happens here
    $edge->set_attribute( 'frequency', $frequency );

}

# output & done
print $graph->as_graphml();
exit;


# a set of bigrams and their frequencies
__DATA__
cds classroom   4
maximum registration    4
may want    3
anomalies within    2
resulting analysis  2
participants may    2
corpus without  2
journal articles    2
quickly learn   2
active reading  2
text mining     2
literally count     2
find patterns   2
14 million  2
digital humanities  2
humanities research     2

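2 Answers:

Answer 0 (score: 3)

I played with this module a bit, and it seems it does not accept arbitrary "attributes", only a predefined set. Apparently 'frequency' is not one of them.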

I took a sample from the documentation and replaced your

$edge->set_attribute( 'frequency', $frequency );

with

$edge->set_attribute( 'label', $frequency );

because label comes up often in the examples. After that,

print $graph->as_ascii();

prints:

+--------------+  2   +--------------+
|      14      | ---> |   million    |
+--------------+      +--------------+
+--------------+  2   +--------------+  2   +----------+
|   digital    | ---> |  humanities  | ---> | research |
+--------------+      +--------------+      +----------+
+--------------+  2   +--------------+  3   +----------+
| participants | ---> |     may      | ---> |   want   |
+--------------+      +--------------+      +----------+
...

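Is that what you are after?

Eventually I found the complete documentation for Graph::Easy; its Attributes section lists the allowed attributes. I am fairly sure there is a way to have custom attributes, because the module has a method named get_custom_attributes.

Answer 1 (score: 0)

For me, the "best" answer was not to use Graph::Easy at all, but to use the Python module NetworkX instead: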

#!/usr/bin/python

# graph.py - given a CSV file of a specific shape, output graphml


# configure
DATA  = './data.csv'
GRAPH = './data.graphml'
LABEL = 'frequency'

# require
import networkx as nx
import csv

# initialize
g = nx.DiGraph()

# read the data
with open( DATA, 'r' ) as f :

    # initialize
    r = csv.reader( f, delimiter='\t')

    # process each record
    for source, target, frequency in r :

        # sanity check
        if len( source ) == 0 or len( target ) == 0 : continue

        # update the graph
        # store the frequency as an integer so it is written as a numeric edge attribute
        g.add_edge( source, target, **{ LABEL : int( frequency ) } )

# output & done
nx.write_graphml( g, GRAPH )
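
For anyone who wants to see what a frequency-carrying edge looks like in GraphML without installing either library, here is a minimal sketch using only the Python standard library. It assumes the input shape from the question (two words, then a frequency, separated by whitespace). The names parse_bigrams and to_graphml and the key id d0 are made up for illustration; this is not a reproduction of what NetworkX or Graph::Easy would emit, only the same idea expressed by hand.

```python
import xml.etree.ElementTree as ET

# A few rows in the question's shape: "source target<TAB>frequency".
SAMPLE = "cds classroom\t4\nmay want\t3\nparticipants may\t2\n"

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def parse_bigrams(text):
    """Yield (source, target, frequency) tuples from bigram lines."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        source, target, frequency = fields
        yield source, target, int(frequency)


def to_graphml(edges, attr_name="frequency"):
    """Return a GraphML string whose edges carry one integer attribute."""
    edges = list(edges)
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})

    # Declare the edge attribute once; "d0" is an arbitrary (hypothetical) key id.
    ET.SubElement(root, "key", {
        "id": "d0",
        "for": "edge",
        "attr.name": attr_name,
        "attr.type": "int",
    })
    graph = ET.SubElement(root, "graph", {"edgedefault": "directed"})

    # Emit each node once, in order of first appearance.
    seen = set()
    for source, target, _ in edges:
        for name in (source, target):
            if name not in seen:
                seen.add(name)
                ET.SubElement(graph, "node", {"id": name})

    # Emit the edges, each with its frequency stored under key "d0".
    for source, target, frequency in edges:
        edge = ET.SubElement(graph, "edge", {"source": source, "target": target})
        data = ET.SubElement(edge, "data", {"key": "d0"})
        data.text = str(frequency)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_graphml(parse_bigrams(SAMPLE)))
```

The point of the sketch is that GraphML has no built-in notion of "weight": an edge attribute is declared once with a key element and then attached to each edge through a data element, so the attribute can be given any name, including frequency.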