在我的Neo4j / SDN 4应用程序中,我的所有Cypher查询都基于内部Neo4j ID。
这是一个问题,因为我无法在我的网络应用网址上依赖这些ID。 Neo4j可以重复使用这些ID,因此很有可能在将来的某个时间,我们可以找到相同的ID。
我尝试根据以下解决方案重新实现此逻辑:Using the graph to control unique id generation但注意到查询性能下降。
从理论的角度来看,如果Cypher根据@Index(unique = true, primary = true
的属性查询
例如:
@Index(unique = true, primary = true)
private Long uid;
entity.uid = {someId}
与Cypher查询具有相同的性能,该查询基于内部Neo4j ID:
id(entity) = {someId}
已更新
这是:schema
输出:
Indexes
ON :BaseEntity(uid) ONLINE
ON :Characteristic(lowerName) ONLINE
ON :CharacteristicGroup(lowerName) ONLINE
ON :Criterion(lowerName) ONLINE
ON :CriterionGroup(lowerName) ONLINE
ON :Decision(lowerName) ONLINE
ON :FlagType(name) ONLINE (for uniqueness constraint)
ON :HAS_VALUE_ON(value) ONLINE
ON :HistoryValue(originalValue) ONLINE
ON :Permission(code) ONLINE (for uniqueness constraint)
ON :Role(name) ONLINE (for uniqueness constraint)
ON :User(email) ONLINE (for uniqueness constraint)
ON :User(username) ONLINE (for uniqueness constraint)
ON :Value(value) ONLINE
Constraints
ON ( flagtype:FlagType ) ASSERT flagtype.name IS UNIQUE
ON ( permission:Permission ) ASSERT permission.code IS UNIQUE
ON ( role:Role ) ASSERT role.name IS UNIQUE
ON ( user:User ) ASSERT user.email IS UNIQUE
ON ( user:User ) ASSERT user.username IS UNIQUE
正如您所看到的,我在:BaseEntity(uid)
BaseEntity
是我的实体层次结构中的基类,例如:
@NodeEntity
public abstract class BaseEntity {
@GraphId
private Long id;
@Index(unique = false)
private Long uid;
private Date createDate;
private Date updateDate;
...
}
@NodeEntity
public class Commentable extends BaseEntity {
...
}
@NodeEntity
public class Decision extends Commentable {
private String name;
}
在我寻找uid
的示例时,会使用此(d:Decision) WHERE d.uid = {uid}
索引吗?
PROFILE结果 - 内部ID与索引属性
基于内部ID的查询
PROFILE MATCH (parentD)-[:CONTAINS]->(childD:Decision)
WHERE id(parentD) = 1474333
MATCH (childD)-[relationshipValueRel1475199:HAS_VALUE_ON]-(filterCharacteristic1475199)
WHERE id(filterCharacteristic1475199) = 1475199
WITH relationshipValueRel1475199, childD
WHERE ([1, 19][0] <= relationshipValueRel1475199.value <= [1, 19][1] )
WITH childD
MATCH (childD)-[relationshipValueRel1474358:HAS_VALUE_ON]-(filterCharacteristic1474358)
WHERE id(filterCharacteristic1474358) = 1474358
WITH relationshipValueRel1474358, childD
WHERE (ANY (id IN ['Compact'] WHERE id IN relationshipValueRel1474358.value ))
WITH childD
MATCH (childD)-[relationshipValueRel1475193:HAS_VALUE_ON]-(filterCharacteristic1475193)
WHERE id(filterCharacteristic1475193) = 1475193
WITH relationshipValueRel1475193, childD
WHERE (ANY (id IN ['16:9', '3:2', '4:3', '1:1']
WHERE id IN relationshipValueRel1475193.value ))
WITH childD
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c)
WHERE id(c) IN [1474342, 1474343, 1474340, 1474339, 1474336, 1474352, 1474353, 1474350, 1474351, 1474348, 1474346, 1474344]
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH * MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) | {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) | {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) | {characteristicId: id(ch1), value: v1.value, totalHistoryValues: toInt(v1.totalHistoryValues), description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
PROFILE输出:
Cypher版本:CYPHER 3.1,规划师:COST,运行时间:解释。在238毫秒内350554总db命中率。
基于索引属性uid的查询
PROFILE MATCH (parentD)-[:CONTAINS]->(childD:Decision)
WHERE parentD.uid = 61
MATCH (childD)-[relationshipValueRel1475199:HAS_VALUE_ON]-(filterCharacteristic1475199)
WHERE filterCharacteristic1475199.uid = 15
WITH relationshipValueRel1475199, childD
WHERE ([1, 19][0] <= relationshipValueRel1475199.value <= [1, 19][1] )
WITH childD
MATCH (childD)-[relationshipValueRel1474358:HAS_VALUE_ON]-(filterCharacteristic1474358)
WHERE filterCharacteristic1474358.uid = 10
WITH relationshipValueRel1474358, childD
WHERE (ANY (id IN ['Compact'] WHERE id IN relationshipValueRel1474358.value ))
WITH childD
MATCH (childD)-[relationshipValueRel1475193:HAS_VALUE_ON]-(filterCharacteristic1475193)
WHERE filterCharacteristic1475193.uid = 14
WITH relationshipValueRel1475193, childD
WHERE (ANY (id IN ['16:9', '3:2', '4:3', '1:1']
WHERE id IN relationshipValueRel1475193.value ))
WITH childD
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c)
WHERE c.uid IN [26, 27, 24, 23, 20, 36, 37, 34, 35, 32, 30, 28]
WITH childD, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH * MATCH (childD)-[ru:CREATED_BY]->(u:User)
WITH ru, u, childD , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
RETURN ru, u, childD AS decision, weight, totalVotes,
[ (parentD)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) | {entityId: id(entity), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (parentD)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) | {criterionId: id(c1), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria,
[ (parentD)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) | {characteristicId: id(ch1), value: v1.value, totalHistoryValues: toInt(v1.totalHistoryValues), description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
Cypher版本:CYPHER 3.1,规划师:COST,运行时间:解释。 671326总db命中率为426 ms。
有没有机会根据uid提高性能?
答案 0 :(得分:5)
您不应在网址中使用Neo4j内部ID,因为删除节点后可以重复使用它们。
从性能的角度来看,内部id尽可能快 - 它实际上是带有节点/关系记录的文件中的偏移量(您可能已经注意到这些是2个独立的id序列,您可以使用id = z和相同的id = x的关系。
任何索引的使用都必须更慢,因为数据库首先进行索引查找,获取内部标识,然后读取节点记录。
然而,绝大多数应用程序性能差异可以忽略不计 - 可能比网络延迟或一般OGM开销小得多。
如果你看到明显的差异
:schema
)info
设置org.neo4j.ogm
级别)PROFILE
检查查询计划<强>已更新强>
是的,索引将用于以下查询:
MATCH (d:Decision) WHERE d.uid = {uid} ...
应该由
生成session.load(Decision.class, uid)
如果您的索引是主要的,或findByUid
上的DecisionRepository
。
请注意,当where子句出现在查询中间时,可能不会使用该索引:
...
WITH x
MATCH (x)-[...]-(d) WHERE d.uid = {uid} ...
这取决于查询计划,您应该使用PROFILE
来调查此问题。