Question

I am trying to bulk import two node CSV files, as follows:

neo4j-admin import --database graph.db --nodes:A A.csv --nodes:B B.csv --id-type INTEGER --ignore-missing-nodes

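A.csv contains about 1.4 million nodes and B.csv contains about 250,000 nodes. The import gets stuck about halfway through, with the following output on screen: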

Available resources:
  Total machine memory: 31.48 GB
  Free machine memory: 19.78 GB
  Max heap memory : 7.00 GB
  Processors: 8
  Configured max memory: 22.03 GB
  High-IO: false

Import starting 2019-02-05 15:08:13.436+0200
  Estimated number of nodes: 1.67 M
  Estimated number of node properties: 10.25 M
  Estimated number of relationships: 0.00
  Estimated number of relationship properties: 0.00
  Estimated disk space usage: 239.38 MB
  Estimated required memory usage: 1.02 GB

InteractiveReporterInteractions command list (end with ENTER):
  c: Print more detailed information about current stage
  i: Print more detailed information

(1/4) Node import 2019-02-05 15:08:13.523+0200
  Estimated number of nodes: 1.67 M
  Estimated disk space usage: 239.38 MB
  Estimated required memory usage: 1.02 GB
.......... .......... .......... .......... ..........   5% ∆2s 37ms
.......... .......... .......... .......... ..........  10% ∆2ms
.......... .......... .......... .......... ..........  15% ∆1s 53ms
.......... .......... .......... .......... ..........  20% ∆1ms
.......... .......... .......... .......... ..........  25% ∆1s 259ms
.......... .......... .......... .......... ..........  30% ∆401ms
.......... ....-..... .......... .......... ..........  35% ∆159ms
.......... .......... .......... .......... ..........  40% ∆1ms
.......... .......... .......... .......... ..........  45% ∆0ms
.......... .......... .......... .......... ..........  50% ∆601ms
.........c


        ******** DETAILS 2019-02-05 13:09:37.589+0000 ********

        Prepare node index
        [*SORT----------------------------------------------------------------------------------------]1.95M
        Memory usage: 1.02 GB
        Duration: 1m 19s 270ms
        Done batches: 195

.......... .......... .......... .......... ..........   5% ∆1m 18s 677ms
.......... .......... .......... .......... ..........  10% ∆1ms
.......... .......... .......... .......... ..........  15% ∆1ms
.......... .......... .......... .......... ..........  20% ∆0ms
.......... .......... .......... .......... ..........  25% ∆1ms
.......... .......... .......... .......... ..........  30% ∆1ms
.......... .......... .......... .......... ..........  35% ∆0ms
.......... .......... .......... .......... ..........  40% ∆1ms
.......... .......... .......... .......... ..........  45% ∆1ms
.......... .......... .......... .......... ..........  50% ∆1ms
.........

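It eventually completed successfully, after 14 hours!

From what I can find online, an import like this should take about a minute.

Here is the beginning of each file I am importing:

File A: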

FldA1:string,FldA2:string,FldA3:string,FldA4:long,FldA5:long,FldA6:long
REGULAR,10.68.15.224,10.68.15.255,172232672,172232703,8
REGULAR,0.0.0.0,10.93.20.15,0,173872143,180

File B:

FldB1:ID(B),FldB2:string,FldB3:string,FldB4:string,FldB5:long,FldB6:long,FldB7:long
4,10.232.31.197,,,2,1,0
5,10.232.31.189,aa_99,david_33,3,2,0
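
For readers comparing the two headers, here is a minimal Python sketch that lists which columns each header declares as a node ID. The helper names `parse_header` and `id_columns` are hypothetical and not part of Neo4j. The sketch assumes the `name:type` header convention shown above and treats a field with no type as a string. It only inspects the header text and does not diagnose the slowdown.

```python
import re

def parse_header(header_line):
    """Split an import header line into (name, type, id_space) tuples."""
    fields = []
    for raw in header_line.strip().split(","):
        name, _, spec = raw.partition(":")
        # Matches "ID" or "ID(<space>)", capturing the optional ID space name.
        match = re.fullmatch(r"ID(?:\((\w+)\))?", spec)
        if match:
            fields.append((name, "ID", match.group(1)))
        else:
            # Assumption: a field without an explicit type is a string.
            fields.append((name, spec or "string", None))
    return fields

def id_columns(header_line):
    """Return the names of the columns declared as node IDs."""
    return [name for name, kind, _ in parse_header(header_line) if kind == "ID"]

header_a = "FldA1:string,FldA2:string,FldA3:string,FldA4:long,FldA5:long,FldA6:long"
header_b = "FldB1:ID(B),FldB2:string,FldB3:string,FldB4:string,FldB5:long,FldB6:long,FldB7:long"

print("A:", id_columns(header_a))
print("B:", id_columns(header_b))
```

For the headers in this question, file A declares no ID column, while file B declares `FldB1` as an ID in the ID space `B`.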

I am using Neo4j 3.5.2 on CentOS 6 with 32 GB of RAM and the default Neo4j configuration.

What is wrong with my import?

Neo4j CSV bulk import takes a very long time

0 answers: