I'm trying to import data from a relational database into neo4j. The process looks like this (simplified a bit):
    while (!sarBatchService.isFinished()) {
        logger.info("New batch started");
        Date loadDeklFrom = sarBatchService.getStartDateForNewBatch();
        Date loadDeklTo = sarBatchService
                .getEndDateForNewBatch(loadDeklFrom); // = loadDeklFrom + 2 hours
        logger.info("Dates calculated");
        Date startTime = new Date();
        List<Dek> deks = dekLoadManager
                .loadAllDeks(loadDeklFrom, loadDeklTo); // loading data from the relational database (POINT A)
        Date endLoadTime = new Date();
        logger.info("Deks loaded");
        GraphDatabase gdb = template.getGraphDatabase();
        Transaction tx = gdb.beginTx();
        logger.info("Transaction started!");
        try {
            for (Dek dek : deks) {
                /* transform dek into nodes, and save
                   these nodes with Neo4jTemplate.save() */
            }
            logger.info("Deks saved");
            Date endImportTime = new Date();
            int aff = sarBatchService.insertBatchData(loadDeklFrom,
                    loadDeklTo, startTime, endLoadTime, endImportTime,
                    deks.size()); // (POINT B)
            if (aff != 1) {
                String msg = "Something went wrong";
                throw new RuntimeException(msg);
            }
            logger.info("Batch data saved into relational database");
            tx.success();
            logger.info("Transaction marked as success.");
        } catch (NoSuchFieldException | SecurityException
                | IllegalArgumentException | IllegalAccessException
                | NoSuchMethodException | InstantiationException
                | InvocationTargetException e1) {
            // Pass the exception itself so the logger prints the full stack trace
            logger.error("Something bad happened :(", e1);
        } finally {
            logger.info("Closing transaction...");
            tx.close(); // (POINT C)
            logger.info("Transaction closed!");
            logger.info("Need more work? " + !sarBatchService.isFinished());
        }
    }
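So, the data in the relational database has a timestamp indicating when it was stored, and I load it in two-hour intervals (POINT A in the code). After that, I iterate over the loaded data, transform it into nodes (spring-data-neo4j nodes), store them in neo4j, and store information about the current batch in the relational database (POINT B). I log almost every step to make debugging easier.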
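To make the windowing concrete, here is a minimal sketch of how a time range could be cut into consecutive two-hour batches. The class `BatchWindows` and its `split` method are hypothetical names used only for illustration; the real `sarBatchService` is not shown in the question and presumably tracks its progress in the relational database instead.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class BatchWindows {
    // Hypothetical helper: splits [from, to) into consecutive windows of `hours` hours.
    // Each element is a {start, end} pair; the last window is clipped to `to`.
    static List<Date[]> split(Date from, Date to, int hours) {
        long step = TimeUnit.HOURS.toMillis(hours);
        List<Date[]> windows = new ArrayList<>();
        for (long start = from.getTime(); start < to.getTime(); start += step) {
            long end = Math.min(start + step, to.getTime());
            windows.add(new Date[] { new Date(start), new Date(end) });
        }
        return windows;
    }

    public static void main(String[] args) {
        Date from = new Date(0L);
        Date to = new Date(TimeUnit.HOURS.toMillis(5));
        // A 5-hour range with 2-hour windows yields three batches, the last one shorter
        for (Date[] w : split(from, to, 2)) {
            System.out.println(w[0].getTime() + " -> " + w[1].getTime());
        }
    }
}
```

With a fixed window size, the batch number where a problem appears maps directly to a time range, which is why changing the interval changes which batch blocks.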
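The program completes 158 batches successfully. The problem starts with batch 159: the program stops at POINT C in the code (tx.close()) and waits there for 4 hours (normally this takes a few seconds). After that it continues normally.
I tried running it on Tomcat 7 with a 10 GB heap and with a 4 GB heap. The result is the same (it blocks on batch 159). The maximum number of nodes in one transaction is between 10k and 15k (7k on average), and batch 159 has fewer than 10k nodes.
Interestingly, if the data is loaded in 4-hour or 12-hour intervals, everything goes fine. Also, if I restart Tomcat, or run only batch 159, there is no problem.
I'm using Spring 3.2.8 and spring-data-neo4j 3.0.2.
This is neo4j's message.log: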
...
2014-11-24 15:21:38.080+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 418ms [total block time: 150.973s]
2014-11-24 15:21:45.722+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 377ms [total block time: 151.35s]
...
2014-11-24 15:23:57.381+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 392ms [total block time: 156.593s]
2014-11-24 15:24:06.758+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotating [/home/pravila/data/neo4j/nioneo_logical.log.1] @ version=22 to /home/pravila/data/neo4j/nioneo_logical.log.2 from position 26214444
2014-11-24 15:24:06.763+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=24149878 out of [339=Start[339,xid=GlobalId[NEOKERNL|5889317606667601380|364|-1], BranchId[ 52 49 52 49 52 49 ],master=-1,me=-1,time=2014-11-24 15:23:13.021+0000/1416842593021,lastCommittedTxWhenTransactionStarted=267]]
2014-11-24 15:24:07.401+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=2064582
2014-11-24 15:24:07.402+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=2064582, version 23 and last tx 267
2014-11-24 15:24:07.684+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotating [/home/pravila/data/neo4j/index/lucene.log.1] @ version=6 to /home/pravila/data/neo4j/index/lucene.log.2 from position 26214408
2014-11-24 15:24:07.772+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=25902494 out of [134=Start[134,xid=GlobalId[NEOKERNL|5889317606667601380|364|-1], BranchId[ 49 54 50 51 55 52 ],master=-1,me=-1,time=2014-11-24 15:23:13.023+0000/1416842593023,lastCommittedTxWhenTransactionStarted=133]]
2014-11-24 15:24:07.871+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=311930
2014-11-24 15:24:07.878+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=311930, version 7 and last tx 133
2014-11-24 15:24:10.919+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 214ms [total block time: 156.807s]
2014-11-24 15:24:17.486+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 405ms [total block time: 157.212s]
...
2014-11-24 15:25:28.692+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 195ms [total block time: 159.316s]
2014-11-24 15:25:33.238+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotating [/home/pravila/data/neo4j/nioneo_logical.log.2] @ version=23 to /home/pravila/data/neo4j/nioneo_logical.log.1 from position 26214459
2014-11-24 15:25:33.242+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate log first start entry @ pos=24835943 out of [349=Start[349,xid=GlobalId[NEOKERNL|-6436474643536791121|374|-1], BranchId[ 52 49 52 49 52 49 ],master=-1,me=-1,time=2014-11-24 15:25:27.038+0000/1416842727038,lastCommittedTxWhenTransactionStarted=277]]
2014-11-24 15:25:33.761+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Rotate: old log scanned, newLog @ pos=1378532
2014-11-24 15:25:33.763+0000 INFO [o.n.k.i.t.x.XaLogicalLog]: Log rotated, newLog @ pos=1378532, version 24 and last tx 277
2014-11-24 15:25:37.031+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 148ms [total block time: 159.464s]
2014-11-24 15:25:45.891+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 153ms [total block time: 159.617s]
....
2014-11-24 15:26:48.447+0000 WARN [o.n.k.EmbeddedGraphDatabase]: GC Monitor: Application threads blocked for an additional 221ms [total block time: 161.641s]
I have no idea what is going on here...
Please help.
Answer 0 (score: 0)
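It looks like you have a leaking outer transaction there.
So the inner transaction you show actually finishes, but the outer transaction keeps accumulating state. Since Neo doesn't suspend outer transactions but purely nests them, nothing is really committed until you hit the outer tx.success(); tx.close();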
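The effect described above can be illustrated with a toy model. This is not the Neo4j API: `NestedTxModel` and its methods are made up for this sketch, and it only mimics the claim that nested transactions share state with the outer one and that the real commit happens on the outermost close.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedTxModel {
    // Toy stand-in for a thread-bound transaction; NOT the Neo4j API.
    private int depth = 0;
    final List<String> pending = new ArrayList<>();
    final List<String> committed = new ArrayList<>();

    void begin() { depth++; }

    void write(String item) { pending.add(item); }

    // Only the outermost close flushes the accumulated state.
    void close() {
        depth--;
        if (depth == 0) {
            committed.addAll(pending);
            pending.clear();
        }
    }

    public static void main(String[] args) {
        NestedTxModel tx = new NestedTxModel();
        tx.begin();                          // leaked outer transaction
        for (int batch = 1; batch <= 3; batch++) {
            tx.begin();                      // the "inner" per-batch transaction
            tx.write("batch-" + batch);
            tx.close();                      // returns, but nothing is committed yet
        }
        System.out.println("pending=" + tx.pending.size() + " committed=" + tx.committed.size());
        tx.close();                          // outer close: everything commits at once
        System.out.println("pending=" + tx.pending.size() + " committed=" + tx.committed.size());
    }
}
```

In this model every per-batch close returns quickly while the pending list keeps growing, which is the kind of accumulation a leaked outer transaction would cause.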
If you take a thread dump while it is blocked, you should be able to see whether it is actually stuck in the commit.
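The usual way to get a thread dump is to run `jstack <pid>` against the Tomcat process. If it is easier to trigger one from inside the application, a rough equivalent can be sketched with the standard `Thread.getAllStackTraces()` call. The class name `ThreadDumpSketch` is made up for this example, and the output format is simplified compared to a real dump.

```java
import java.util.Map;

public class ThreadDumpSketch {
    // Builds a simplified dump: one header line per thread, then its stack frames.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In the output, look for the thread running the import loop: the frames below tx.close() show whether it is waiting inside the commit or somewhere else.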
Answer 1 (score: 0)
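After hours of searching and testing, I tried rerunning the whole import in 4-hour intervals. It also stopped, this time after batch (transaction) 145. The difference was that it threw an error (Too many open files). I set the ulimit for open files to unlimited, and now it works. The only mystery is why the program didn't throw an error the first time.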
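For anyone wanting to check whether they are approaching the same limit, a minimal sketch follows that reads the JVM's open and maximum file descriptor counts. It assumes a HotSpot/OpenJDK JVM on a Unix-like system, where the platform bean implements `com.sun.management.UnixOperatingSystemMXBean`. The class name `FdUsage` is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdUsage {
    // Returns {open, max}, or null when the JVM does not expose the Unix-specific bean.
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Logging these two numbers once per batch would show whether the open count climbs from batch to batch toward the limit.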