I am using the Python standard library module re to search for nodes in a Graphviz graph definition:
import re
graphviz_node_regex = ??? # this regex is unknown
for m in re.finditer(graphviz_node_regex, string_containing_one_graphviz_diagram):
    # e.g. do something with the whole string of the match here
    match_string = m.group(0)
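In the Graphviz diagram, I want to find nodes that may be written in any of the following syntax alternatives (such as node1, node2, node3, with their combinations of href / URL / target attributes):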
digraph foo {
...
node1 -> node2;
node2 -> node3;
...
node1 [label="node1 name"];
node2 [label="node2 name", href="../one_dir_with_underscores/file_with_underscores.html"];
node3 [label="node3 name", href="../one_dir_with_underscores/file_with_underscores.html", target="_top"];
node4 [label="node4 name", URL="../onedirwithoutunderscores/file_with_references.html#reference-with-separators", target="_top"];
somenamewithoutseparators [label="name of node without separators"];
some_name_with_separators [label="name of node with separators"];
...
}
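What is a regular expression that matches all of these Graphviz node alternatives? And what is a separate regular expression for each individual syntax alternative?
E.g. my attempts at regular expressions for single syntax alternatives: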
\w*\s\[label="\w*\s\w*"\];
\w*\s\[label="\w*\s\w*", href="\.\./\w*/\w*\.html"\];
\w*\s\[label="\w*\s\w*", href="\.\./\w*/\w*\.html",\starget="_top"\];
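
One possible direction, offered as a rough sketch and not a full DOT parser: rather than writing one pattern per attribute combination, match the general shape `name [ ... ];` and then pull the key="value" pairs out of the bracketed part with a second pattern. The names `NODE_RE`, `ATTR_RE` and `find_nodes` are made up for this illustration. The sketch assumes that every node statement starts on its own line, that attribute values are double-quoted without embedded quotes, and that no value contains a `]` character.

```python
import re

# Assumption: a node statement looks like `name [key="value", ...];`
# and starts at the beginning of a line (leading blanks allowed).
NODE_RE = re.compile(
    r'^[ \t]*(?P<name>\w+)[ \t]*'      # node identifier
    r'\[(?P<attrs>[^\]]*)\][ \t]*;',   # bracketed attribute list, then ';'
    re.MULTILINE,
)

# Assumption: every attribute is written as key="value" with no quotes inside the value.
ATTR_RE = re.compile(r'(?P<key>\w+)\s*=\s*"(?P<value>[^"]*)"')


def find_nodes(dot_source):
    """Return a dict mapping each node name to a dict of its attributes."""
    nodes = {}
    for m in NODE_RE.finditer(dot_source):
        attrs = {a.group("key"): a.group("value")
                 for a in ATTR_RE.finditer(m.group("attrs"))}
        nodes[m.group("name")] = attrs
    return nodes
```

Edge statements such as `node1 -> node2;` are skipped because the identifier is not followed by `[`. Default-attribute statements like `node [...]` would still match with the name `node`, so they would need to be filtered out separately if the diagram uses them. To test for a single alternative, checking the returned keys (for example, whether `href` or `URL` is present) may be simpler than keeping one regex per combination.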