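I want to find nodes that do not have a particular relationship (a relationship with specific property values).
The graph contains entity nodes (n) that occur on specific lines (line_nr) of files (f).
My current query is: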
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( n4.text? =~ "nonreachablenodestextregex" AND (p4 = null OR left.line_nr < right4.line_nr - 0 OR left.line_nr > right4.line_nr + 0 OR ID(left) = ID(right4) ) ) )
AND (
( n7.text? =~ "othernonreachablenodestextregex" AND (p7 = null OR left.line_nr < right7.line_nr - 0 OR left.line_nr > right7.line_nr + 0 OR ID(left) = ID(right7) ) ) ) )
WITH n, left, f, count(*) as group_by_cause
RETURN ID(left) as occ_id,
n.text as ent_text,
substring(f.text, ABS(left.file_offset-1), 2 + LENGTH(n.text) ) as occ_text,
f.path as file_path,
left.line_nr as occ_line_nr,
ID(f) as file_id
Instead of adding new paths to the MATCH clause, I think I could also use:
NOT ( (f)<-[right4:OCCURS]-(n4) )
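However, I do not want to exclude rows based on the existence of any path at all, only on the existence of specific paths.
As an alternative, I tried adding extra start nodes (I have an index on the non-reachable nodes) so that I could drop the text comparison from the WHERE clause. However, if no node in Neo4j matches the wildcard, the query returns nothing at all.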
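To make the intended filter concrete, here is a minimal Python sketch of the semantics I am after, run against an in-memory list instead of Neo4j. The `Occurrence` class, the `keep_occurrences` function and the sample data are hypothetical names made up for illustration. It assumes that `=~` has to match the whole string and that "the same occurrence" is decided by relationship id, as in `ID(left) = ID(right4)`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One (entity)-[:OCCURS]->(file) relationship (hypothetical model)."""
    occ_id: int
    ent_text: str
    file_id: int
    line_nr: int


def keep_occurrences(occurrences, blocked_patterns, window=0):
    """Keep an occurrence only if no *other* occurrence in the same file
    has entity text matching a blocked pattern within `window` lines."""
    compiled = [re.compile(p) for p in blocked_patterns]
    kept = []
    for left in occurrences:
        blocked = any(
            right.occ_id != left.occ_id            # not the occurrence itself
            and right.file_id == left.file_id      # same target file f
            and abs(right.line_nr - left.line_nr) <= window
            and any(rx.fullmatch(right.ent_text) for rx in compiled)
            for right in occurrences
        )
        if not blocked:
            kept.append(left)
    return kept


occs = [
    Occurrence(1, "alpha", 10, 0),
    Occurrence(2, "beta", 10, 0),   # same line as occurrence 1
    Occurrence(3, "alpha", 10, 2),
    Occurrence(4, "gamma", 10, 3),
]
print([o.occ_id for o in keep_occurrences(occs, ["bet.*"])])  # [2, 3, 4]
```

Occurrence 1 is dropped because a "beta" occurrence sits on the same line of the same file; occurrence 2 survives because the only match on its line is itself.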
start n=node:entities("text:*")
, n4=node:entities("text:nonreachablenodestextwildcard")
, n7=node:entities("text:othernonreachablenodestextwildcard")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE ( (
( (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4) ) ) )
AND (
( (p7 = null
OR left.line_nr < right7.line_nr - 0
OR left.line_nr > right7.line_nr + 0
OR ID(left) = ID(right7) ) )
) )
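My understanding of why this returns nothing is that the START items are combined like a cross product, so a single empty index lookup leaves no rows for MATCH to work on. The Python sketch below only illustrates that arithmetic; `start_rows` and the hit lists are hypothetical, and this is an assumption about how the START clause behaves rather than a description of Neo4j internals.

```python
from itertools import product


def start_rows(*lookups):
    """Combine the results of several index lookups into candidate rows."""
    return list(product(*lookups))


n_hits = ["n1", "n2"]   # text:* matches two entities
n4_hits = []            # the wildcard for n4 matches nothing
n7_hits = ["n7a"]

print(len(start_rows(n_hits, n7_hits)))           # 2
print(len(start_rows(n_hits, n4_hits, n7_hits)))  # 0
```

With one empty lookup the product is empty, so every row for n is lost even though n itself matched.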
Old update: As mentioned in the answer, I can use predicate functions to build an inner query. So I updated my query to:
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE ( (
(NONE(path in (f)<-[:OCCURS]-(n4)
WHERE
(LAST(nodes(path))).text =~ "nonreachablenodestextRegex"
AND FIRST(r4 in rels(p)).line_nr <= left.line_nr
AND FIRST(r4 in rels(p)).line_nr >= left.line_nr
)
) )
AND (
(NONE(path in (f)<-[:OCCURS]-(n7)
WHERE
(LAST(nodes(path))).text =~ "othernonreachablenodestextRegex"
AND FIRST(r7 in rels(p)).line_nr <= left.line_nr
AND FIRST(r7 in rels(p)).line_nr >= left.line_nr
)
) )
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....
This gives me a java.lang.OutOfMemoryError:
java.lang.OutOfMemoryError: Java heap space
at java.util.regex.Pattern.compile(Pattern.java:1432)
at java.util.regex.Pattern.<init>(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:823)
at scala.util.matching.Regex.<init>(Regex.scala:38)
at scala.collection.immutable.StringLike$class.r(StringLike.scala:226)
at scala.collection.immutable.StringOps.r(StringOps.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCase(Base.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCases(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at scala.util.parsing.combinator.Parsers$Parser.p$3(Parsers.scala:209)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:183)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
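(the last 6 lines repeat several times)
Solution: The previous update probably contained a syntax error somewhere. A slightly different, corrected version follows: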
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
WHERE (
(NONE ( path in (f)<-[:OCCURS]-()
WHERE
ANY(n4 in nodes(path)
WHERE ID(n4) <> ID(n)
AND n4.type = 'ENTITY'
AND n4.text =~ "a regex expr"
)
AND ALL(r4 in rels(path)
WHERE r4.line_nr <= left.line_nr + 0
AND r4.line_nr >= left.line_nr - 0
)
)
) )
AND
NONE ( ...... )
WITH n, left, f, count(*) as group_by_cause
RETURN ...
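However, it is slow: it takes more than 10 seconds on a small graph with 4 entity nodes in total and 6 :OCCURS relationships, all pointing to a single target f node, with line_nr between 0 and 3.
Performance update: The following is roughly twice as fast: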
start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f)
, p4=(f)<-[right4?:OCCURS]-(n4)
, p7=(f)<-[right7?:OCCURS]-(n7)
WHERE
( n4.text? =~ "regex1"
AND (p4 = null
OR left.line_nr < right4.line_nr - 0
OR left.line_nr > right4.line_nr + 0
OR ID(left) = ID(right4)
)
)
AND
( n7.text? =~ "regex2"
AND (p7 = null .....)
)
WITH n, left, f, count(*) as group_by_cause
RETURN ....
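Answer 0 (score: 0)
I think you should use a pattern predicate in WHERE instead of optional relationships. A pattern expression actually returns a collection of paths, so you can apply collection predicates (ALL, NONE, ANY, SINGLE) to it: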
WHERE NONE(path in (f)<-[:OCCURS]-(n4) WHERE
ALL(r in rels(path) WHERE r.line_nr = 42 ))
See: http://docs.neo4j.org/chunked/milestone/query-function.html#_predicates
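As a rough analogy only, the four collection predicates behave like the following Python helpers over a list of paths. The helper names and the sample `paths` data are hypothetical, and each path is modelled simply as a list of relationship property dicts.

```python
def all_of(items, pred):
    """ALL: the predicate holds for every item (true for an empty list)."""
    return all(pred(x) for x in items)


def any_of(items, pred):
    """ANY: the predicate holds for at least one item."""
    return any(pred(x) for x in items)


def none_of(items, pred):
    """NONE: the predicate holds for no item."""
    return not any(pred(x) for x in items)


def single(items, pred):
    """SINGLE: the predicate holds for exactly one item."""
    return sum(1 for x in items if pred(x)) == 1


def no_path_on_line(paths, line_nr):
    """NONE(path IN paths WHERE ALL(r IN rels(path) WHERE r.line_nr = line_nr))"""
    return none_of(
        paths,
        lambda path: all_of(path, lambda r: r["line_nr"] == line_nr),
    )


# Two one-relationship paths into the same file, on lines 41 and 43.
paths = [[{"line_nr": 41}], [{"line_nr": 43}]]
print(no_path_on_line(paths, 42))  # True
print(no_path_on_line(paths, 41))  # False
```

Note that ALL is true for an empty collection, so a path without relationships would count as a match in the inner predicate.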