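What went wrong in the relationship import stage?

Asked: 2015-03-16 14:42:11

Tags: neo4j

I have finally got past the node import stage. Now I am trying to import the relationships. There may be as many as 1B of them.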

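#!/bin/bash
cd /home/luning/neo4j-enterprise-2.2.0-RC01-unix/neo4j-enterprise-2.2.0-RC01/bin
# Build a comma-separated list: the header file first, then every user data file.
users="/data/weibo/user-header.csv"
for i in /data/weibo/users/*
do
    users="$users,$i"
done
# The relationship input is likewise a header file followed by the data file.
edges=/data/weibo/edge-header.csv,/data/weibo/ego/000000_0
# TAB-delimited input, single quote as the quote character, string IDs.
./neo4j-import --stacktrace --into ../data/weibo_bak.db --nodes:User "$users" --relationships:Follow "$edges" --delimiter TAB --quote \' --bad-tolerance 50000 --id-type STRING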

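But it always reports that nodes are missing. What I cannot understand is that two runs over the same files reported different missing nodes.

1. First run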

   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
    at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
    at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:56)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
    at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)

2. Second run

source: /data/weibo/ego/000000_0:1844245
startNode: 3492922617
endNode: 1589699375
type: Follow
 refering to missing node 1589699375
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1844245
   startNode: 3492922617
   endNode: 1589699375
   type: Follow
 refering to missing node 1589699375
    at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
    at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1844245
   startNode: 3492922617
   endNode: 1589699375
   type: Follow
 refering to missing node 1589699375
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:59)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
    at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)

But I am sure that these two nodes, 1587438071 and 2765561213, are in my files, because I can find them:

[luning@pinnacle data]$ grep 1587438071 /data/weibo/users/*
/data/weibo/users/000024_0:1587438071   琬童沛胜    浙江 杭州           http://tp4.sinaimg.cn/1587438071/50/40024579617/0   f   147 60  272     false       LV2 31  一举成名|   正常  80                      2014-02-17 04:17:38


[luning@pinnacle data]$ grep 1589699375 /data/weibo/users/*
/data/weibo/users/000010_0:1589699375   在行动Isabella 吉林          http://tp4.sinaimg.cn/1589699375/50/5633181098/0    女   297 438 4729    1981-01-17  false       LV7            2014-08-13 21:43:34                      2014-01-28 10:18:52
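
One caveat with this check: a plain grep matches the ID anywhere on the line, and in the rows above the same number also appears inside the avatar URL. A stricter test compares only the ID column. The following is a minimal sketch, assuming the files are TAB-delimited with the ID in the first column; `has_exact_id` is a hypothetical helper name, not part of any Neo4j tooling.

```shell
#!/bin/bash
# Hypothetical helper: report file:line for rows whose FIRST column equals
# the given ID exactly. Assumes TAB-delimited input (as with --delimiter TAB).
has_exact_id() {
    local id="$1"
    shift
    # Appending "" forces a string comparison, so "01" and "1" are distinct.
    awk -F'\t' -v id="$id" '
        ($1 "") == (id "") { print FILENAME ":" FNR; found = 1 }
        END { exit !found }   # exit status 0 only if at least one row matched
    ' "$@"
}

# Usage (paths taken from the question):
# has_exact_id 1587438071 /data/weibo/users/*
```

If grep finds the ID but this check does not, the first column probably carries stray characters (for example a trailing carriage return or a quote), which would be worth ruling out before looking elsewhere.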

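So, can anyone figure out how this could happen?

1 Answer:

Answer 0 (score: 1)

It may be that your node input files contain fields whose quotes are not properly closed. Such a field "eats" the lines that follow it, so the nodes on those lines are never actually imported (provided the fields still happen to line up afterwards; otherwise an exception is thrown). Alternatively, the parser may have a problem with these Chinese characters.

Is there any chance you could share your input data with me (the main author of the parser and the import tool) so that I can investigate?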
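
One rough way to test the unclosed-quote hypothesis is to list the lines that contain an odd number of quote characters. The sketch below assumes the quote character is the single quote passed via `--quote \'` in the question; `find_unbalanced_quotes` is a hypothetical helper name. It is only a heuristic: a legitimately quoted field that spans several lines would also be flagged.

```shell
#!/bin/bash
# Hypothetical helper: print file:line: content for every line whose number
# of single-quote characters is odd (a candidate for an unclosed quote).
find_unbalanced_quotes() {
    awk -v q="'" '
        {
            line = $0
            n = gsub(q, "", line)   # gsub returns how many quotes it removed
            if (n % 2 == 1) print FILENAME ":" FNR ": " $0
        }
    ' "$@"
}

# Usage (paths taken from the question):
# find_unbalanced_quotes /data/weibo/users/*
```

Any line it reports, for instance a nickname containing an apostrophe, is a place where the parser might start a quoted field and swallow the rows after it.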