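Cannot merge node using null property value for sourceID (ICIJ Panama Papers data)

Time: 2017-03-09 11:26:44

Tags: neo4j

I am trying to import a CSV file into a Neo4j database. The file is at http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv

This is the query I use to import the CSV data: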

USING PERIODIC COMMIT 10000
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE (i:Intermediateries{name:line[1],internal_id:line[2],address:line[3],valid_until:line[4],country_codes:line[5],countries:line[6],status:line[7],node_id:line[8],sourceID:line[9]})

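The query fails with the following error: Cannot merge node using null property value for sourceID

I found some useful resources, but given the size of the ICIJ Panama Papers files, the import takes several hours to run. Is there a way to avoid the check for NULL values and optimize the query?

2 answers:

Answer 0 (score: 2)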

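You are using the wrong indexes. List indexes start at 0, not 1, so line[9] is the column after sourceID, not sourceID itself. The given CSV also has a header row, so load it WITH HEADERS. With headers, each row is a map keyed by column name rather than a list, so reference the columns by name instead of by position.

Change the query to the following:

USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE
(i:Intermediateries{name:line.name,internal_id:line.internal_id,address:line.address,valid_until:line.valid_until,country_codes:line.country_codes,countries:line.countries,status:line.status,node_id:line.node_id,sourceID:line.sourceID})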

For details, see the official documentation: http://neo4j.com/docs/developer-manual/current/cypher/clauses/load-csv/#load-csv-import-data-from-a-csv-file
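To check which position or header each column actually has before running the full import, it can help to preview a few rows first. This is only a minimal sketch: it loads the file without headers, so the first returned row should be the header row itself, and the row limit of 3 is an arbitrary choice.

```cypher
// Preview the first rows as plain lists (no WITH HEADERS),
// so that index 0 is visibly the first column of each row.
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
RETURN line
LIMIT 3
```

Nothing is written to the database by this query, so it is safe to run repeatedly while adjusting the column mapping.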

Answer 1 (score: 1)

Yes, you can easily filter out the rows where sourceID is null:

USING PERIODIC COMMIT 10000 LOAD CSV FROM  "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line 
WITH line where line[9] is not null
MERGE (i:Intermediateries{name:line[1],internal_id:line[2],address:line[3],
valid_until:line[4],country_codes:line[5],countries:line[6],status:line[7],node_id:line[8],sourceID:line[9]})

If you want to import those nodes even though they have no sourceID, you can use coalesce instead:

USING PERIODIC COMMIT 10000 LOAD CSV FROM  "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line 
WITH line,coalesce(line[9],"NoId") as sourceID
MERGE (i:Intermediateries{name:line[1],internal_id:line[2],address:line[3],
valid_until:line[4],country_codes:line[5],countries:line[6],status:line[7],node_id:line[8],sourceID:sourceID})
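Before choosing between filtering and coalesce, it may be worth counting how many rows are affected. The following is a rough sketch that reuses the same line[9] index as the queries above; the aliases total_rows and rows_with_null are made up for illustration, and the total includes the header row because the file is loaded without WITH HEADERS.

```cypher
// Count all rows and the rows whose tenth field (index 9) is null.
// Read-only: no nodes are created.
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
RETURN count(*) AS total_rows,
       sum(CASE WHEN line[9] IS NULL THEN 1 ELSE 0 END) AS rows_with_null
```

If rows_with_null is close to total_rows, filtering would drop most of the data, and the coalesce variant is probably the better fit.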

If I had the APOC plugin installed, I would write the query like this:
USING PERIODIC COMMIT 10000 LOAD CSV WITH HEADERS FROM  "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line 
MERGE (i:Intermediateries{internal_id:line.internal_id})
ON CREATE SET i += apoc.map.clean(line,['internal_id'],[])

If you do not have the APOC plugin, you have to specify the properties manually:

USING PERIODIC COMMIT 10000 LOAD CSV WITH HEADERS FROM  "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line 
MERGE (i:Intermediateries{internal_id:line.internal_id})
ON CREATE SET i.name = line.name,i.address = line.address, i.valid_until = line.valid_until,
i.country_codes = line.country_codes, i.countries = line.countries,i.status = line.status,i.node_id = line.node_id,
i.sourceID = line.sourceID,i.note = line.note
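On the performance part of the question: a MERGE on a single property can look the node up through an index, whereas without one it has to scan every node with that label for each CSV row. As a sketch, assuming the label and key used in the queries above, an index could be created once before the import. The syntax shown is the Neo4j 3.x form; newer versions use CREATE INDEX FOR (i:Intermediateries) ON (i.internal_id) instead.

```cypher
// Run once, before the LOAD CSV import, so that
// MERGE (i:Intermediateries{internal_id: ...}) becomes an index lookup.
// Neo4j 3.x syntax; adjust for the version in use.
CREATE INDEX ON :Intermediateries(internal_id)
```

A plain index is used here rather than a uniqueness constraint, because the thread does not establish that internal_id is unique across the file.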