I have the following graph, whose vertices and edges were added like this:
def graph=ConfiguredGraphFactory.open('Baptiste');def g = graph.traversal();
graph.addVertex(label, 'Group', 'text', 'BNP Paribas');
graph.addVertex(label, 'Group', 'text', 'BNP PARIBAS');
graph.addVertex(label, 'Company', 'text', 'JP Morgan Chase');
graph.addVertex(label, 'Location', 'text', 'France');
graph.addVertex(label, 'Location', 'text', 'United States');
graph.addVertex(label, 'Location', 'text', 'Europe');
def v1 = g.V().has('text', 'JP Morgan Chase').next();def v2 = g.V().has('text', 'BNP Paribas').next();v1.addEdge('partOf',v2);
def v1 = g.V().has('text', 'JP Morgan Chase').next();def v2 = g.V().has('text', 'United States').next();v1.addEdge('doesBusinessIn',v2);
def v1 = g.V().has('text', 'BNP Paribas').next();def v2 = g.V().has('text', 'United States').next();v1.addEdge('doesBusinessIn',v2);
def v1 = g.V().has('text', 'BNP Paribas').next();def v2 = g.V().has('text', 'France').next();v1.addEdge('partOf',v2);
def v1 = g.V().has('text', 'BNP PARIBAS').next();def v2 = g.V().has('text', 'Europe').next();v1.addEdge('partOf',v2);
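I need a query that, given specific vertex labels, edge labels and a maximum number of hops, returns all possible paths. Let's say that in this example I need the paths with a maximum of 2 hops and every label. I tried the following query: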
def graph=ConfiguredGraphFactory.open('TestGraph');
def g = graph.traversal();
g.V().has(label, within('Location', 'Company', 'Group'))
.repeat(bothE().has(label, within('doesBusinessIn', 'partOf')).bothV().has(label, within('Location', 'Company', 'Group')).simplePath())
.emit().times(2).path();
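This query returns 20 paths (it is supposed to return 10), so it returns each path in both possible directions. Is there a way to specify that I only need one direction? I tried adding dedup() to the query, but it returns 7 paths instead of 10, so that doesn't work.
Also, whenever I try to find paths with 4 hops, it does not return "cyclic" paths such as "France -> BNP Paribas -> United States -> JP Morgan Chase -> BNP Paribas". Do you know what to add to the query to allow such paths to be returned?
EDIT: Thanks for your solution, @DanielKuppitz. It seems to be exactly what I'm looking for.
I use JanusGraph, which is built on top of Apache TinkerPop. I tried the first query: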
g.V().hasLabel('Location', 'Company', 'Group').
repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
emit().times(2).
path().
dedup().
by(unfold().order().by(id).fold())
It threw the following error:
Error: org.janusgraph.graphdb.relations.RelationIdentifier cannot be cast to java.lang.Comparable
So I moved the dedup() step into the repeat loop, like this:
g.V().hasLabel('Location', 'Company', 'Group').
repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath().dedup().by(unfold().order().by(id).fold())).
emit().times(2).
path()
It returns only 6 paths:
[
[
"JP Morgan Chase",
"doesBusinessIn",
"United States"
],
[
"JP Morgan Chase",
"partOf",
"BNP Paribas"
],
[
"JP Morgan Chase",
"partOf",
"BNP Paribas",
"partOf",
"France"
],
[
"Europe",
"partOf",
"BNP PARIBAS"
],
[
"BNP PARIBAS",
"partOf",
"Europe"
],
[
"United States",
"doesBusinessIn",
"JP Morgan Chase"
]
]
I'm not sure what's going on here... any ideas?
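A possible explanation for the 6 results, offered as a sketch rather than a confirmed diagnosis: inside repeat(), the dedup() step sees the current vertex instead of the whole path, so unfold() just yields that single vertex and every vertex can be reached only once across all hops. The plain-Python model below (not Gremlin; the edge ids e1..e5 and all function names are made up for illustration) reproduces the count under that assumption. The exact paths depend on evaluation order, so only the count is comparable with the JanusGraph output.

```python
# Sample graph: (edge id, label, out-vertex, in-vertex). Edge ids are hypothetical.
EDGES = [
    ("e1", "partOf", "JP Morgan Chase", "BNP Paribas"),
    ("e2", "doesBusinessIn", "JP Morgan Chase", "United States"),
    ("e3", "doesBusinessIn", "BNP Paribas", "United States"),
    ("e4", "partOf", "BNP Paribas", "France"),
    ("e5", "partOf", "BNP PARIBAS", "Europe"),
]
VERTICES = ["BNP Paribas", "BNP PARIBAS", "JP Morgan Chase",
            "France", "United States", "Europe"]


def both_e(vertex):
    """Yield (edge id, other vertex) for every edge touching `vertex`."""
    for eid, _label, out_v, in_v in EDGES:
        if out_v == vertex:
            yield eid, in_v
        elif in_v == vertex:
            yield eid, out_v


def repeat_with_inner_dedup(max_hops):
    """Model of repeat(bothE().otherV().simplePath().dedup()...).emit().times(n)."""
    seen = set()  # state of the single dedup() step, shared by all hops
    results = []
    frontier = [(v,) for v in VERTICES]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for eid, other in both_e(path[-1]):
                if eid in path or other in path:  # simplePath()
                    continue
                if other in seen:  # dedup() keyed on the current vertex only
                    continue
                seen.add(other)
                next_frontier.append(path + (eid, other))
        results.extend(next_frontier)  # emit() after every hop
        frontier = next_frontier
    return results


paths = repeat_with_inner_dedup(2)
end_vertices = {p[-1] for p in paths}
print(len(paths))  # 6: one path per reachable vertex
```

If this reading is right, the number of results is capped by the number of vertices, which is why the dedup has to happen on the finished paths instead.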
Answer 0 (score: 3)
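Is there a way to specify that I only need one direction?
You need the bidirectional traversal in any case, so you will have to filter out the duplicated paths at the end ("duplicated" in this case means that 2 paths contain the same elements). To do that, you can dedup() the paths by a deterministic order of their elements; the easiest way is to order the elements by their id.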
g.V().hasLabel('Location', 'Company', 'Group').
repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
emit().times(2).
path().
dedup().
by(unfold().order().by(id).fold())
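To see why ordering the elements collapses the two directions into one, here is a small plain-Python model of the query above. It is only a sketch, not Gremlin: the edge ids e1..e5 and the function names are invented for illustration, and vertex names stand in for vertex ids.

```python
# Sample graph: (edge id, label, out-vertex, in-vertex). Edge ids are hypothetical.
EDGES = [
    ("e1", "partOf", "JP Morgan Chase", "BNP Paribas"),
    ("e2", "doesBusinessIn", "JP Morgan Chase", "United States"),
    ("e3", "doesBusinessIn", "BNP Paribas", "United States"),
    ("e4", "partOf", "BNP Paribas", "France"),
    ("e5", "partOf", "BNP PARIBAS", "Europe"),
]
VERTICES = ["BNP Paribas", "BNP PARIBAS", "JP Morgan Chase",
            "France", "United States", "Europe"]


def both_e(vertex):
    """Yield (edge id, other vertex) for every edge touching `vertex`."""
    for eid, _label, out_v, in_v in EDGES:
        if out_v == vertex:
            yield eid, in_v
        elif in_v == vertex:
            yield eid, out_v


def simple_paths(max_hops):
    """Model of repeat(bothE().otherV().simplePath()).emit().times(n).path()."""
    results = []
    frontier = [(v,) for v in VERTICES]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for eid, other in both_e(path[-1]):
                if eid in path or other in path:  # simplePath(): no repeats
                    continue
                next_frontier.append(path + (eid, other))
        results.extend(next_frontier)  # emit() after every hop
        frontier = next_frontier
    return results


def dedup_by_sorted_elements(paths):
    """Model of dedup().by(unfold().order().by(id).fold())."""
    seen, unique = set(), []
    for path in paths:
        key = tuple(sorted(path))  # same key for a path and its reverse
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


all_paths = simple_paths(2)
unique_paths = dedup_by_sorted_elements(all_paths)
print(len(all_paths), len(unique_paths))  # 20 10
```

A path and its reverse contain exactly the same vertices and edges, so their sorted element lists are identical and only the first one seen survives.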
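Do you know what to add to the query to allow such (cyclic) paths to be returned?
Your query explicitly prevents cyclic paths through the simplePath() step, so it's not clear in which cases you want to allow them. I assume that you are fine with a cyclic path if the cycle is created only by the first and the last element in the path. In that case, the query would look more like this: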
g.V().hasLabel('Location', 'Company', 'Group').as('a').
repeat(bothE('doesBusinessIn', 'partOf').otherV()).
emit().
until(loops().is(4).or().cyclicPath()).
filter(simplePath().or().where(eq('a'))).
path().
dedup().
by(unfold().order().by(id).fold())
Below is the output of the 2 queries (ignore the extra map() step, it's only there to make the output more readable).
gremlin> g.V().hasLabel('Location', 'Company', 'Group').
......1> repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
......2> emit().times(2).
......3> path().
......4> dedup().
......5> by(unfold().order().by(id).fold()).
......6> map(unfold().coalesce(values('text'), label()).fold())
==>[BNP Paribas,doesBusinessIn,United States]
==>[BNP Paribas,partOf,France]
==>[BNP Paribas,partOf,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[BNP PARIBAS,partOf,Europe]
==>[JP Morgan Chase,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,France]
==>[France,partOf,BNP Paribas,doesBusinessIn,United States]
gremlin> g.V().hasLabel('Location', 'Company', 'Group').as('a').
......1> repeat(bothE('doesBusinessIn', 'partOf').otherV()).
......2> emit().
......3> until(loops().is(4).or().cyclicPath()).
......4> filter(simplePath().or().where(eq('a'))).
......5> path().
......6> dedup().
......7> by(unfold().order().by(id).fold()).
......8> map(unfold().coalesce(values('text'), label()).fold())
==>[BNP Paribas,doesBusinessIn,United States]
==>[BNP Paribas,partOf,France]
==>[BNP Paribas,partOf,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,BNP Paribas]
==>[BNP Paribas,partOf,France,partOf,BNP Paribas]
==>[BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[BNP Paribas,partOf,JP Morgan Chase,partOf,BNP Paribas]
==>[BNP Paribas,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase,partOf,BNP Paribas]
==>[BNP PARIBAS,partOf,Europe]
==>[BNP PARIBAS,partOf,Europe,partOf,BNP PARIBAS]
==>[JP Morgan Chase,doesBusinessIn,United States]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,JP Morgan Chase]
==>[JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,France]
==>[JP Morgan Chase,partOf,BNP Paribas,partOf,JP Morgan Chase]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,BNP Paribas,partOf,France]
==>[JP Morgan Chase,doesBusinessIn,United States,doesBusinessIn,BNP Paribas,partOf,JP Morgan Chase]
==>[France,partOf,BNP Paribas,doesBusinessIn,United States]
==>[France,partOf,BNP Paribas,partOf,France]
==>[France,partOf,BNP Paribas,partOf,JP Morgan Chase,doesBusinessIn,United States]
==>[United States,doesBusinessIn,JP Morgan Chase,doesBusinessIn,United States]
==>[United States,doesBusinessIn,BNP Paribas,doesBusinessIn,United States]
==>[United States,doesBusinessIn,JP Morgan Chase,partOf,BNP Paribas,doesBusinessIn,United States]
==>[Europe,partOf,BNP PARIBAS,partOf,Europe]
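The second query can be modelled the same way in plain Python. This is a sketch under stated assumptions, not Gremlin: the edge ids e1..e5 and the function names are invented, and vertex names stand in for ids. It keeps extending a path until it has 4 hops or becomes cyclic, and keeps a cyclic path only when it ends on its start vertex.

```python
# Sample graph: (edge id, label, out-vertex, in-vertex). Edge ids are hypothetical.
EDGES = [
    ("e1", "partOf", "JP Morgan Chase", "BNP Paribas"),
    ("e2", "doesBusinessIn", "JP Morgan Chase", "United States"),
    ("e3", "doesBusinessIn", "BNP Paribas", "United States"),
    ("e4", "partOf", "BNP Paribas", "France"),
    ("e5", "partOf", "BNP PARIBAS", "Europe"),
]
VERTICES = ["BNP Paribas", "BNP PARIBAS", "JP Morgan Chase",
            "France", "United States", "Europe"]


def both_e(vertex):
    """Yield (edge id, other vertex) for every edge touching `vertex`."""
    for eid, _label, out_v, in_v in EDGES:
        if out_v == vertex:
            yield eid, in_v
        elif in_v == vertex:
            yield eid, out_v


def paths_allowing_closing_cycle(max_hops):
    results = []
    frontier = [(v,) for v in VERTICES]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for path in frontier:
            for eid, other in both_e(path[-1]):
                new_path = path + (eid, other)
                # cyclicPath(): some vertex or edge occurs more than once
                is_cyclic = len(set(new_path)) < len(new_path)
                # filter(simplePath().or().where(eq('a')))
                if not is_cyclic or other == new_path[0]:
                    results.append(new_path)
                # until(loops().is(4).or().cyclicPath()): stop extending here
                if not is_cyclic and hop < max_hops:
                    next_frontier.append(new_path)
        frontier = next_frontier
    return results


def dedup_by_sorted_elements(paths):
    """Model of dedup().by(unfold().order().by(id).fold())."""
    seen, unique = set(), []
    for path in paths:
        key = tuple(sorted(path))  # repeated elements stay in the key
        if key not in seen:
            seen.add(key)
            unique.append(path)
    return unique


unique_paths = dedup_by_sorted_elements(paths_allowing_closing_cycle(4))
print(len(unique_paths))  # 25
```

Because the sorted key keeps repeated elements, the same triangle entered from three different start vertices yields three different keys, which matches the three 3-hop cycles in the output above.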
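UPDATE (based on the latest comments)
Since JanusGraph has non-comparable edge identifiers, you need a unique, comparable property on all edges. This can be as simple as a random UUID.
This is how I would update your sample graph: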
g.addV('Group').property('text', 'BNP Paribas').as('a').
addV('Group').property('text', 'BNP PARIBAS').as('b').
addV('Company').property('text', 'JP Morgan Chase').as('c').
addV('Location').property('text', 'France').as('d').
addV('Location').property('text', 'United States').as('e').
addV('Location').property('text', 'Europe').as('f').
addE('partOf').from('c').to('a').
property('uuid', UUID.randomUUID().toString()).
addE('doesBusinessIn').from('c').to('e').
property('uuid', UUID.randomUUID().toString()).
addE('doesBusinessIn').from('a').to('e').
property('uuid', UUID.randomUUID().toString()).
addE('partOf').from('a').to('d').
property('uuid', UUID.randomUUID().toString()).
addE('partOf').from('b').to('f').
property('uuid', UUID.randomUUID().toString()).
iterate()
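Now that we have a property that uniquely identifies each edge, we also need a unique property (of the same data type) on every vertex. Luckily, the existing text property seems to be good enough for that (otherwise it would be the same as with the edges: just add a random UUID). The updated queries now look like this: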
g.V().hasLabel('Location', 'Company', 'Group').
repeat(bothE('doesBusinessIn', 'partOf').otherV().simplePath()).
emit().times(2).
path().
dedup().
by(unfold().values('text','uuid').order().fold())
g.V().hasLabel('Location', 'Company', 'Group').as('a').
repeat(bothE('doesBusinessIn', 'partOf').otherV()).
emit().
until(loops().is(4).or().cyclicPath()).
filter(simplePath().or().where(eq('a'))).
path().
dedup().
by(unfold().values('text','uuid').order().fold())
The results are, of course, the same as above.
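The role of the comparable property can be illustrated outside of Gremlin. The Python sketch below is only an analogy: Vertex, Edge and path_key are hypothetical names, and Python's TypeError stands in for the ClassCastException that JanusGraph raised. Sorting the elements themselves fails because they define no ordering, while sorting their string properties works and gives both directions of a path the same key.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass(frozen=True)
class Vertex:
    text: str  # assumed unique per vertex, as in the sample graph


@dataclass(frozen=True)
class Edge:
    label: str
    # Random unique string, playing the role of the 'uuid' edge property.
    uuid: str = field(default_factory=lambda: str(uuid4()))


def path_key(path):
    """Model of by(unfold().values('text','uuid').order().fold())."""
    return tuple(sorted(e.text if isinstance(e, Vertex) else e.uuid
                        for e in path))


jpm, usa = Vertex("JP Morgan Chase"), Vertex("United States")
edge = Edge("doesBusinessIn")
forward = [jpm, edge, usa]
backward = [usa, edge, jpm]

try:
    sorted(forward)  # the elements themselves cannot be ordered
    sortable = True
except TypeError:
    sortable = False

same_key = path_key(forward) == path_key(backward)
print(sortable, same_key)  # False True
```

Both properties are strings here, which is why they can be mixed in a single ordering; mixing, say, numeric vertex keys with string edge keys would fail to sort in this model as well.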