How to verify that a specific path does not exist in a Cypher query

Asked: 2013-03-26 11:49:21

Tags: neo4j cypher

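I want to retrieve nodes that do not have a specific relationship (that is, a relationship with specific properties).

The graph contains entity nodes (n) that occur on specific lines (line_nr) of files (f).

My current query is as follows: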

start n=node:entities("text:*")
MATCH p=(n)-[left:OCCURS]->(f) 
    , p4=(f)<-[right4?:OCCURS]-(n4) 
    , p7=(f)<-[right7?:OCCURS]-(n7)
WHERE  (  (  
   ( n4.text? =~ "nonreachablenodestextregex" AND (p4 = null OR left.line_nr < right4.line_nr - 0 OR left.line_nr > right4.line_nr + 0 OR ID(left) = ID(right4)  )  )  )  
   AND  (  
   ( n7.text? =~ "othernonreachablenodestextregex" AND (p7 = null OR left.line_nr < right7.line_nr - 0 OR left.line_nr > right7.line_nr + 0 OR ID(left) = ID(right7)  )  )  )  )  
WITH n, left, f, count(*) as group_by_cause 
RETURN ID(left) as occ_id, 
       n.text as ent_text, 
       substring(f.text, ABS(left.file_offset-1), 2 + LENGTH(n.text) ) as occ_text, 
       f.path as file_path, 
       left.line_nr as occ_line_nr, 
       ID(f) as file_id

Instead of adding new paths to the MATCH clause, I think I could also write:

NOT ( (f)<-[right4:OCCURS]-(n4) ) 

However, I do not want to exclude the existence of any path at all, only that of specific paths.
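To make the intended semantics concrete, here is a minimal plain-Python sketch of the filter (it is not Neo4j code and does not use any Neo4j API). The `Occurrence` record, the `keep_occurrences` helper, and the `window` parameter are hypothetical names introduced only for illustration; `window=0` corresponds to the `- 0` / `+ 0` terms in the query, and the sketch assumes that `=~` has to match the whole string.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical stand-in for one (entity)-[:OCCURS]->(file) relationship."""
    occ_id: int
    ent_text: str
    file_id: int
    line_nr: int


def keep_occurrences(occurrences, blocked_patterns, window=0):
    """Keep each occurrence that has no *other* occurrence in the same file,
    within `window` lines, whose entity text matches a blocked regex."""
    compiled = [re.compile(p) for p in blocked_patterns]
    kept = []
    for left in occurrences:
        conflict = any(
            right.occ_id != left.occ_id                        # ID(left) <> ID(right)
            and right.file_id == left.file_id                  # same file node f
            and abs(right.line_nr - left.line_nr) <= window    # inside the line window
            and any(rx.fullmatch(right.ent_text) for rx in compiled)
            for right in occurrences
        )
        if not conflict:
            kept.append(left)
    return kept


if __name__ == "__main__":
    occs = [
        Occurrence(1, "foo", 10, 0),
        Occurrence(2, "bar", 10, 0),
        Occurrence(3, "foo", 10, 3),
    ]
    # Occurrence 1 shares line 0 with a "bar" occurrence, so it is dropped.
    print([o.occ_id for o in keep_occurrences(occs, ["bar"])])  # [2, 3]
```

The point of the sketch is that the condition is a "none of the neighbouring occurrences" test per `left` relationship, rather than a row-by-row filter over every combination of optional matches.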

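As an alternative, I considered adding extra start nodes (since I have an index on the non-reachable nodes) in order to remove the text comparisons from the WHERE clause. However, this query returns nothing at all if no node in neo4j matches one of the wildcards.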

start n=node:entities("text:*")
    , n4=node:entities("text:nonreachablenodestextwildcard")
    , n7=node:entities("text:othernonreachablenodestextwildcard")
MATCH p=(n)-[left:OCCURS]->(f) 
    , p4=(f)<-[right4?:OCCURS]-(n4) 
    , p7=(f)<-[right7?:OCCURS]-(n7)
WHERE  (  (  
   ( (p4 = null 
      OR left.line_nr < right4.line_nr - 0 
      OR left.line_nr > right4.line_nr + 0 
      OR ID(left) = ID(right4)  )  )  )  
   AND  (  
   ( (p7 = null 
      OR left.line_nr < right7.line_nr - 0 
      OR left.line_nr > right7.line_nr + 0 
      OR ID(left) = ID(right7)  )  ) 
)  )  
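The empty result is consistent with how multiple START items appear to combine: each combination of starting nodes yields one candidate row (a cartesian product), so a single index lookup without hits leaves no rows at all. A small plain-Python sketch of that effect, using made-up node names that are not part of the original data:

```python
from itertools import product

# Hypothetical index lookup results; the names are illustrative only.
n_hits = ["entity_a", "entity_b"]   # text:*
n4_hits = ["blocked_x"]             # first wildcard found one node
n7_hits = []                        # second wildcard matched nothing

# One candidate row per combination of starting nodes.
rows_all = list(product(n_hits, n4_hits))
rows_with_empty = list(product(n_hits, n4_hits, n7_hits))

print(len(rows_all))         # 2
print(len(rows_with_empty))  # 0
```

Because the product with an empty list is empty, the optional relationships in MATCH never get a chance to produce a row for `n`.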

Old update: As mentioned in the answer, I can use predicate functions to build an inner query. I therefore changed the query to:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
WHERE  (  (  
    (NONE(path in (f)<-[:OCCURS]-(n4) 
         WHERE  
           (LAST(nodes(path))).text =~ "nonreachablenodestextRegex"  
           AND FIRST(r4 in rels(p)).line_nr <= left.line_nr  
           AND FIRST(r4 in rels(p)).line_nr >= left.line_nr 
    )
    ) )  
    AND  (  
   (NONE(path in (f)<-[:OCCURS]-(n7) 
         WHERE  
           (LAST(nodes(path))).text =~ "othernonreachablenodestextRegex"  
           AND FIRST(r7 in rels(p)).line_nr <= left.line_nr   
           AND FIRST(r7 in rels(p)).line_nr >= left.line_nr 
    )
    ) ) 
    )
WITH n, left, f, count(*) as group_by_cause 
RETURN ....

This gives me a java.lang.OutOfMemoryError:

java.lang.OutOfMemoryError: Java heap space
at java.util.regex.Pattern.compile(Pattern.java:1432)
at java.util.regex.Pattern.<init>(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:823)
at scala.util.matching.Regex.<init>(Regex.scala:38)
at scala.collection.immutable.StringLike$class.r(StringLike.scala:226)
at scala.collection.immutable.StringOps.r(StringOps.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCase(Base.scala:31)
at org.neo4j.cypher.internal.parser.v1_9.Base.ignoreCases(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at org.neo4j.cypher.internal.parser.v1_9.Base$$anonfun$ignoreCases$1.apply(Base.scala:49)
at scala.util.parsing.combinator.Parsers$Parser.p$3(Parsers.scala:209)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:183)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$1.apply(Parsers.scala:210)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:163)

(the last 6 lines repeat several times)

Solution: The previous update probably contained a syntax error somewhere. A slightly different, corrected version follows:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
WHERE  (    
(NONE ( path in (f)<-[:OCCURS]-() 
    WHERE  
      ANY(n4 in nodes(path) 
        WHERE ID(n4) <> ID(n) 
          AND n4.type = 'ENTITY' 
          AND n4.text =~ "a regex expr" 
        ) 
      AND ALL(r4 in rels(path)  
        WHERE r4.line_nr <= left.line_nr + 0 
          AND r4.line_nr >= left.line_nr - 0 
        ) 
     )
  ) )
AND  
 NONE ( ......  )  
WITH n, left, f, count(*) as group_by_cause 
RETURN ...

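However, it is slow: it takes seconds (> 10) on a small graph with 4 entity nodes and 6 :OCCURS relationships in total, all pointing to a single target node f, with line_nr between 0 and 3.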

Performance update: The following is about twice as fast:

start n=node:entities("text:*") 
MATCH p=(n)-[left:OCCURS]->(f) 
, p4=(f)<-[right4?:OCCURS]-(n4) 
, p7=(f)<-[right7?:OCCURS]-(n7) 
WHERE  
 ( n4.text? =~ "regex1" 
  AND (p4 = null 
       OR left.line_nr < right4.line_nr - 0 
       OR left.line_nr > right4.line_nr + 0 
       OR ID(left) = ID(right4)  
       )
 ) 
 AND  
( n7.text? =~ "regex2" 
  AND (p7 = null .....)
)
WITH n, left, f, count(*) as group_by_cause 
RETURN ....

1 Answer:

Answer 0 (score: 0)

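I think you should use pattern predicates in WHERE instead of optional relationships. A pattern expression actually returns a collection of paths, so you can apply collection predicates (ALL, NONE, ANY, SINGLE) to it: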

WHERE NONE(path in (f)<-[:OCCURS]-(n4) WHERE
           ALL(r in rels(path) : r.line_nr = 42 ))

See: http://docs.neo4j.org/chunked/milestone/query-function.html#_predicates
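As a rough illustration of what the four collection predicates mean, here is a plain-Python sketch (not Cypher, and not a Neo4j API). The helper names `all_of`, `any_of`, `none_of`, and `single_of` are hypothetical, and each path is reduced to the list of `line_nr` values of its relationships.

```python
def all_of(items, pred):
    """True if pred holds for every item (also true for an empty collection)."""
    return all(pred(x) for x in items)


def any_of(items, pred):
    """True if pred holds for at least one item."""
    return any(pred(x) for x in items)


def none_of(items, pred):
    """True if pred holds for no item."""
    return not any(pred(x) for x in items)


def single_of(items, pred):
    """True if pred holds for exactly one item."""
    return sum(1 for x in items if pred(x)) == 1


# Three hypothetical paths into f, each with one relationship on a given line.
paths = [[41], [42], [43]]

# Mirrors NONE(path in ... WHERE ALL(r in rels(path) : r.line_nr = 42)).
no_path_on_line_42 = none_of(paths, lambda rels: all_of(rels, lambda nr: nr == 42))
print(no_path_on_line_42)  # False, because one path lies entirely on line 42
```

Nesting an inner ALL or ANY inside an outer NONE is what lets the filter reject only specific paths instead of every path that matches the pattern.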