I'm trying to bulk import two node CSV files, as follows:
neo4j-admin import --database graph.db --nodes:A A.csv --nodes:B B.csv --id-type INTEGER --ignore-missing-nodes
A.csv contains about 1.4 million nodes and B.csv contains about 250 thousand nodes. The import gets stuck about halfway through, at this screen:
Available resources:
Total machine memory: 31.48 GB
Free machine memory: 19.78 GB
Max heap memory : 7.00 GB
Processors: 8
Configured max memory: 22.03 GB
High-IO: false
Import starting 2019-02-05 15:08:13.436+0200
Estimated number of nodes: 1.67 M
Estimated number of node properties: 10.25 M
Estimated number of relationships: 0.00
Estimated number of relationship properties: 0.00
Estimated disk space usage: 239.38 MB
Estimated required memory usage: 1.02 GB
InteractiveReporterInteractions command list (end with ENTER):
c: Print more detailed information about current stage
i: Print more detailed information
(1/4) Node import 2019-02-05 15:08:13.523+0200
Estimated number of nodes: 1.67 M
Estimated disk space usage: 239.38 MB
Estimated required memory usage: 1.02 GB
.......... .......... .......... .......... .......... 5% ∆2s 37ms
.......... .......... .......... .......... .......... 10% ∆2ms
.......... .......... .......... .......... .......... 15% ∆1s 53ms
.......... .......... .......... .......... .......... 20% ∆1ms
.......... .......... .......... .......... .......... 25% ∆1s 259ms
.......... .......... .......... .......... .......... 30% ∆401ms
.......... ....-..... .......... .......... .......... 35% ∆159ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆0ms
.......... .......... .......... .......... .......... 50% ∆601ms
.........c
******** DETAILS 2019-02-05 13:09:37.589+0000 ********
Prepare node index
[*SORT----------------------------------------------------------------------------------------]1.95M
Memory usage: 1.02 GB
Duration: 1m 19s 270ms
Done batches: 195
.......... .......... .......... .......... .......... 5% ∆1m 18s 677ms
.......... .......... .......... .......... .......... 10% ∆1ms
.......... .......... .......... .......... .......... 15% ∆1ms
.......... .......... .......... .......... .......... 20% ∆0ms
.......... .......... .......... .......... .......... 25% ∆1ms
.......... .......... .......... .......... .......... 30% ∆1ms
.......... .......... .......... .......... .......... 35% ∆0ms
.......... .......... .......... .......... .......... 40% ∆1ms
.......... .......... .......... .......... .......... 45% ∆1ms
.......... .......... .......... .......... .......... 50% ∆1ms
.........
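It eventually completed successfully, after 14 hours!
From what I've found on the web, an import of this size should take about a minute.
Here is the beginning of each file I'm importing:
File A: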
FldA1:string,FldA2:string,FldA3:string,FldA4:long,FldA5:long,FldA6:long
REGULAR,10.68.15.224,10.68.15.255,172232672,172232703,8
REGULAR,0.0.0.0,10.93.20.15,0,173872143,180
File B:
FldB1:ID(B),FldB2:string,FldB3:string,FldB4:string,FldB5:long,FldB6:long,FldB7:long
4,10.232.31.197,,,2,1,0
5,10.232.31.189,aa_99,david_33,3,2,0
I'm using Neo4j 3.5.2 on CentOS 6 with 32 GB of RAM and the default Neo4j configuration.
What is wrong with my import?