I am trying to import a CSV file into a Neo4j database. The CSV file is available at http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv
The query I am using to import the CSV data is:
USING PERIODIC COMMIT 10000
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE (i:Intermediateries{name:line[1],internal_id:line[2],address:line[3],valid_until:line[4],country_codes:line[5],countries:line[6],status:line[7],node_id:line[8],sourceID:line[9]})
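The query fails with the following error: Cannot merge node using null property value for sourceID
I found some useful resources, but given the size of the ICIJ Panama Papers files, execution takes hours. Is there a way to eliminate the check for NULL values and optimize the query?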
Answer 0 (score: 2)
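The indexes you are using are wrong. Indexes start at 0, not 1. The given CSV also has a header row, so load it with WITH HEADERS. Each row then becomes a map keyed by column name, so read the fields by name rather than by position.
Change the query to the following:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE (i:Intermediateries {name: line.name, internal_id: line.internal_id, address: line.address, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, status: line.status, node_id: line.node_id, sourceID: line.sourceID})
For details, see the official documentation: http://neo4j.com/docs/developer-manual/current/cypher/clauses/load-csv/#load-csv-import-data-from-a-csv-file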
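One way to check both points (zero-based positions and the header row) before importing is to return a few rows instead of merging them. This is only a sketch: it reuses the URL from the question, writes nothing to the database, and assumes the file has a sourceID column, as the queries in this thread imply. Run the two statements one at a time.

```cypher
// Without WITH HEADERS each row is a list, indexed from 0.
// The first row returned is the header row itself.
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
RETURN line[0] AS first_field, line[8] AS ninth_field
LIMIT 3;

// With WITH HEADERS each row is a map keyed by column name,
// and the header row is consumed rather than returned as data.
LOAD CSV WITH HEADERS FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
RETURN keys(line) AS columns, line.sourceID AS sourceID
LIMIT 3;
```

If the second statement returns null for sourceID on some rows, those rows are the ones a MERGE on that property would reject.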
Answer 1 (score: 1)
Yes, you can easily filter out the rows where sourceID is null:
USING PERIODIC COMMIT 10000
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
WITH line WHERE line[9] IS NOT NULL
MERGE (i:Intermediateries {name: line[1], internal_id: line[2], address: line[3],
  valid_until: line[4], country_codes: line[5], countries: line[6], status: line[7], node_id: line[8], sourceID: line[9]})
If you want to import those nodes even when they have no source ID, you can use coalesce instead:
USING PERIODIC COMMIT 10000
LOAD CSV FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
WITH line, coalesce(line[9], "NoId") AS sourceID
MERGE (i:Intermediateries {name: line[1], internal_id: line[2], address: line[3],
  valid_until: line[4], country_codes: line[5], countries: line[6], status: line[7], node_id: line[8], sourceID: sourceID})
If I had the APOC plugin installed, I would write the query like this:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE (i:Intermediateries {internal_id: line.internal_id})
ON CREATE SET i += apoc.map.clean(line, ['internal_id'], [])
If you do not have the APOC plugin, you have to specify the properties manually:
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "http://apps.dealopia.com/offshoreleaks/offshore_leaks_csvs-20170104/Intermediaries.csv" AS line
MERGE (i:Intermediateries {internal_id: line.internal_id})
ON CREATE SET i.name = line.name, i.address = line.address, i.valid_until = line.valid_until,
  i.country_codes = line.country_codes, i.countries = line.countries, i.status = line.status, i.node_id = line.node_id,
  i.sourceID = line.sourceID, i.note = line.note
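On the speed side of the question, merging on a single key lets Neo4j look up existing nodes through an index, provided one exists. A minimal sketch, assuming internal_id is unique per intermediary (the source data does not guarantee this) and a Neo4j version that still accepts USING PERIODIC COMMIT:

```cypher
// Run once, as its own statement, before the LOAD CSV import.
// The constraint also creates an index, so MERGE on internal_id
// no longer has to scan every :Intermediateries node.
CREATE CONSTRAINT ON (i:Intermediateries) ASSERT i.internal_id IS UNIQUE;

// Newer Neo4j releases use this form instead:
// CREATE CONSTRAINT FOR (i:Intermediateries) REQUIRE i.internal_id IS UNIQUE;
```

Creating the constraint fails if duplicate internal_id values are already in the database, so it is easiest to add it before any rows are loaded.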