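I wanted to test insert speed with the newest spring-data neo4j 4. I modified the movies example to keep things simple and comparable.
Try running the test class movies.spring.data.neo4j.repositories.PersonRepositoryTest from the demo project linked below.
In this example, adding 400 nodes takes 5 seconds. https://github.com/fodon/neo4j-spring-data-speed-demo
This is the same speed test using the old-style neo4j: https://github.com/fodon/gs-accessing-data-neo4j-speed
For the same job, the hello.Application class is about 40 times faster than spring-data-neo4j-4.
Why is spring-data-neo4j-4 so much slower than the old version, and how can it be sped up?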
Answer 0 (score: 5)
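A call to save() is effectively a direct persistence request to the database. There is currently no notion of deferring save() calls.
Enable query logging by adding a logback-test.xml file to the test resources: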
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<appender name="console" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<pattern>%d %5p %40.40c:%4L - %m%n</pattern>
</encoder>
</appender>
<logger name="org.neo4j.ogm" level="info" />
<root level="info">
<appender-ref ref="console" />
</root>
</configuration>
You can see that each Person.save() actually issues 3 requests:
2016-07-25 05:27:51,093 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row CREATE (n:`Car`) SET n=row.props RETURN row.nodeRef as nodeRef, ID(n) as nodeId with params {rows=[{nodeRef=-590487524, props={type=f27dc1bac12a480}}, {nodeRef=-1760792732, props={type=41ff5d3a69b4a5b4}}, {nodeRef=-637840556, props={type=3e7e77ca5e406a21}}]}
2016-07-25 05:27:54,117 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row CREATE (n:`Person`) SET n=row.props RETURN row.nodeRef as nodeRef, ID(n) as nodeId with params {rows=[{nodeRef=-1446435394, props={name=bafd7ad2721516f8}}]}
2016-07-25 05:27:54,178 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row MATCH (startNode) WHERE ID(startNode) = row.startNodeId MATCH (endNode) WHERE ID(endNode) = row.endNodeId MERGE (startNode)-[rel:`HAS`]->(endNode) RETURN row.relRef as relRefId, ID(rel) as relId with params {rows=[{startNodeId=3, relRef=-712176789, endNodeId=0}, {startNodeId=3, relRef=-821487247, endNodeId=1}, {startNodeId=3, relRef=-31523689, endNodeId=2}]}
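If the Person CREATE statement were instead issued only once, with all 100 persons passed as parameters, and likewise for the Car objects, performance would be better.
As of now, the OGM has no native out-of-the-box support for this (open issue: https://github.com/neo4j/neo4j-ogm/issues/208).
However, you can batch them yourself by saving collections instead of saving entities one by one: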
@Test
@DirtiesContext
public void speedTest2() {
    SessionFactory sessionFactory = new SessionFactory("hello.neo.domain");
    Session session = sessionFactory.openSession();
    Random rand = new Random(10);
    System.out.println("Before linking up with Neo4j...");
    long start = System.currentTimeMillis();
    long mark = start;
    for (int j = 0; j < 10; j++) {
        // Collect 100 Person objects, then persist them with a single save() call.
        List<Person> batch = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            Person greg = new Person(rand);
            batch.add(greg);
        }
        session.save(batch);
        // Report the elapsed time for this batch in milliseconds.
        long now = System.currentTimeMillis();
        System.out.format("%d : Time:%d\n", j, now - mark);
        mark = now;
    }
}
The difference in the results is impressive (times in milliseconds):
Not initialzing DB.
Before linking up with Neo4j...
0 : Time:7318
1 : Time:1731
2 : Time:1555
3 : Time:1481
4 : Time:1237
5 : Time:1176
6 : Time:1101
7 : Time:1094
8 : Time:1114
9 : Time:1015
Not initialzing DB.
Before linking up with Neo4j...
0 : Time:494
1 : Time:272
2 : Time:230
3 : Time:442
4 : Time:320
5 : Time:247
6 : Time:284
7 : Time:288
8 : Time:366
9 : Time:222
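If the entities already sit in one large list, the same batching idea can be sketched as a small generic helper. This is only an illustration: BatchSaver and saveInBatches are hypothetical names, not part of Neo4j OGM or Spring Data Neo4j, and the saver callback is an assumption standing in for whatever persists a collection (in the test above, session.save(batch)).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of Neo4j OGM or Spring Data Neo4j. */
final class BatchSaver {

    private BatchSaver() {}

    /**
     * Splits items into consecutive chunks of at most batchSize elements and
     * hands each chunk to saver. Returns the number of chunks handed over.
     */
    static <T> int saveInBatches(List<T> items, int batchSize, Consumer<List<T>> saver) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the chunk so the saver never sees a view of the caller's list.
            saver.accept(new ArrayList<>(items.subList(from, to)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            items.add(i);
        }
        // Stand-in saver that only reports the chunk size. With the OGM session
        // from the test above it would be: batch -> session.save(batch)
        int batches = saveInBatches(items, 100,
                batch -> System.out.println("saving " + batch.size() + " items"));
        System.out.println(batches + " batches");
    }
}
```

The batch size of 100 simply mirrors the test above. A suitable value for a real workload would have to be measured, since larger batches mean fewer requests but bigger parameter lists per request.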