I know this question has been asked a few times before, but the answers did not solve my problem. I am trying to execute this query:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///C:/Users/Zona5/Documents/Neo4j/checkIntel/import/personaldata.csv' AS line1
MERGE (a:Address1 {address_name1:line1.address1})
But I get the error:
Cannot merge node using null property value for address_name1
Others suggested using:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
'file:///C:/Users/Zona5/Documents/Neo4j/checkIntel/import/personaldata.csv' AS line1
MERGE (a:Address1)
ON CREATE SET a.address_name1=line1.address1
ON MATCH SET a.address_name1=line1.address1
But that solution only works if the node has more than one property. In my case it has only the address_name1 property.
Is there a way to solve this, such as replacing the null value with a word in the query before the MERGE, or some other solution?
Answer 0 (score: 6)
Do you really need to create an Address node when there is no address?
You can use WITH / WHERE to skip the rows that have no address:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///C:/Users/Zona5/Documents/Neo4j/checkIntel/import/personaldata.csv' AS line1
WITH line1
WHERE NOT line1.address1 IS NULL
MERGE (a:Address1 {address_name1:line1.address1})
Otherwise, if you want to create a node that represents an "unknown" address, you can use coalesce() to substitute a default value:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM 'file:///C:/Users/Zona5/Documents/Neo4j/checkIntel/import/personaldata.csv' AS line1
MERGE (a:Address1 {address_name1: coalesce(line1.address1, "Unknown")})
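To see how coalesce() treats a missing value without loading the CSV file, a small standalone query can help. This is only a sketch: the two inline rows are made-up stand-ins for what LOAD CSV would produce, not data from the question.

```cypher
// Hypothetical inline rows standing in for CSV lines; the second has no address.
UNWIND [{address1: '12 Main St'}, {address1: null}] AS line1
// coalesce() returns its first non-null argument.
RETURN line1.address1 AS raw_value,
       coalesce(line1.address1, 'Unknown') AS merge_value;
```

The second row comes back with merge_value 'Unknown', which is the value MERGE would then receive instead of null.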
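Answer 1 (score: 2)
Hi: I am posting this rather extensive answer because I recently had surprising difficulty handling the NULL (missing) values in my CSV file when trying to load that data into Neo4j (neo4j 3.3.4).
I present three solutions.
I am using the Cycli (cycli 0.7.6) CLI, installed via pip in a Python 3.5 venv on an Arch Linux x86_64 system.
My CSV file (glycolysis_metabolites.csv) is: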
name,abbreviation,kegg_entry
α-D-glucose,GLC,C00267
glucose 6-phosphate,G6P,C00668
fructose 6-phosphate,F6P,C05345
"fructose 1,6-bisphosphate",FBP,C05378
dihydroxyacetone phosphate,DHAP,C00111
D-glyceraldehyde 3-phosphate,,C00118
"1,3-bisphosphoglycerate","1,3-BPG",C00236
3-phosphoglycerate,3PG,C00197
2-phosphoglycerate,2PG,C00631
phosphoenolpyruvate,PEP,C00074
pyruvate,,C00022
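Those data, copied from a PostgreSQL table via the psql / COPY ... command, have a "UNIQUE NOT NULL" constraint on the "name" field.
After searching Google and elsewhere, I ran the three experiments shown below. Experiments 2 and 3 are basically the same.
I believe the approach shown in Experiment 2 is the best solution, because the COALESCE statements are contained within the MERGE statement.
My reason for that conclusion is that Experiment 2 uses "local" variables rather than returning "global" variables (Experiment 3), which minimizes unintended consequences from reused variable names.
I load the Cypher script as follows: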
cat glycolysis_script.cypher | cypher-shell -u victoria -p <your_password>
Experiment 1
Reference: http://markhneedham.com/blog/2014/08/22/neo4j-load-csv-handling-empty-columns/
This solution (Mark Needham's) is very clever: it creates nodes that contain only the non-NULL properties, e.g.
<id>: 0 abbreviation: GLC kegg_entry: C00267 name: α-D-glucose
<id>: 10 kegg_entry: C00022 name: pyruvate
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/mnt/Vancouver/Programming/data/metabolism/pg2neo4j/glycolysis_metabolites.csv" AS row
MERGE (a:GlycolysisMetabolites {name: row.name})
FOREACH(ignoreMe IN CASE WHEN row.abbreviation <> "" THEN [1] ELSE [] END | SET a.abbreviation = row.abbreviation)
FOREACH(ignoreMe IN CASE WHEN row.kegg_entry <> "" THEN [1] ELSE [] END | SET a.kegg_entry = row.kegg_entry)
// With "USING PERIODIC COMMIT",
// RETURN a;
// throws this error: "Unknown value type: STRUCT"
// ... so, use this:
RETURN a.name, a.abbreviation, a.kegg_entry;
Output:
$ cat glycolysis.cypher | cypher-shell -u victoria -p <your_password>
a.name, a.abbreviation, a.kegg_entry
"α-D-glucose", "GLC", "C00267"
"glucose 6-phosphate", "G6P", "C00668"
"fructose 6-phosphate", "F6P", "C05345"
"fructose 1,6-bisphosphate", "FBP", "C05378"
"dihydroxyacetone phosphate", "DHAP", "C00111"
"D-glyceraldehyde 3-phosphate", NULL, "C00118"
"1,3-bisphosphoglycerate", "1,3-BPG", "C00236"
"3-phosphoglycerate", "3PG", "C00197"
"2-phosphoglycerate", "2PG", "C00631"
"phosphoenolpyruvate", "PEP", "C00074"
"pyruvate", NULL, "C00022"
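However, you cannot put a property that contains NULL values (here: "abbreviation") into your own MERGE specification, because you cannot merge on a NULL property value.
This works: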
MERGE (a:GlycolysisMetabolites {name: row.name})
These fail ("Cannot merge node using null property value for abbreviation"):
MERGE (a:GlycolysisMetabolites {name: row.name, abbreviation:row.abbreviation})
MERGE (a:GlycolysisMetabolites {name: row.name, abbreviation:row.abbreviation, kegg_entry:row.kegg_entry})
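A simpler variant of the same idea, offered only as a sketch: MERGE on the non-NULL key (name) and assign the remaining properties with a plain SET. In Cypher, setting a property to null removes that property rather than raising an error, so the FOREACH guard is not strictly required. The inline rows below are stand-ins for two rows of the CSV file.

```cypher
// Inline stand-ins for two CSV rows; pyruvate has no abbreviation.
UNWIND [
  {name: 'phosphoenolpyruvate', abbreviation: 'PEP', kegg_entry: 'C00074'},
  {name: 'pyruvate',            abbreviation: null,  kegg_entry: 'C00022'}
] AS row
// Merge only on the property that is never null.
MERGE (a:GlycolysisMetabolites {name: row.name})
// Assigning null here simply leaves the property unset on the node.
SET a.abbreviation = row.abbreviation,
    a.kegg_entry   = row.kegg_entry
RETURN a.name, a.abbreviation, a.kegg_entry;
```

One behavioural difference to keep in mind: when the node already exists, a null value in this version removes a previously stored property, whereas the FOREACH version leaves the existing value untouched.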
Experiment 2
Reference: Neo4j use MERGE with null values
Here I use an empty string ('') as the substitute for the NULL values in the CSV file; you can use anything you like, for example 'Undefined', 'null', ...
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/mnt/Vancouver/Programming/data/metabolism/pg2neo4j/glycolysis_metabolites.csv" AS row
// MERGE (a:GlycolysisMetabolites {name: row.name})
MERGE (a:GlycolysisMetabolites {name: row.name, abbreviation:COALESCE(row.abbreviation, ''), kegg_entry:COALESCE(row.kegg_entry, '')})
// With "USING PERIODIC COMMIT",
// RETURN a;
// throws this error: "Unknown value type: STRUCT"
// ... so, use this:
RETURN a.name, a.abbreviation, a.kegg_entry;
Output:
$ cat glycolysis.cypher | cypher-shell -u victoria -p <your_password>
a.name, a.abbreviation, a.kegg_entry
"α-D-glucose", "GLC", "C00267"
"glucose 6-phosphate", "G6P", "C00668"
"fructose 6-phosphate", "F6P", "C05345"
"fructose 1,6-bisphosphate", "FBP", "C05378"
"dihydroxyacetone phosphate", "DHAP", "C00111"
"D-glyceraldehyde 3-phosphate", "", "C00118"
"1,3-bisphosphoglycerate", "1,3-BPG", "C00236"
"3-phosphoglycerate", "3PG", "C00197"
"2-phosphoglycerate", "2PG", "C00631"
"phosphoenolpyruvate", "PEP", "C00074"
"pyruvate", "", "C00022"
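One consequence of placing every property inside the MERGE pattern is that all of them together act as the match key. The sketch below illustrates this under two assumptions: the database holds no GlycolysisMetabolites nodes beforehand, and there is no uniqueness constraint on name. The abbreviation 'PYR' is made up purely for the illustration.

```cypher
// Two hypothetical rows for the same metabolite: one without an abbreviation,
// one with the made-up abbreviation 'PYR'.
UNWIND [
  {name: 'pyruvate', abbreviation: null,  kegg_entry: 'C00022'},
  {name: 'pyruvate', abbreviation: 'PYR', kegg_entry: 'C00022'}
] AS row
// All three properties must match for MERGE to reuse an existing node.
MERGE (a:GlycolysisMetabolites {name: row.name,
                                abbreviation: COALESCE(row.abbreviation, ''),
                                kegg_entry: COALESCE(row.kegg_entry, '')})
RETURN count(DISTINCT a) AS nodes_for_pyruvate;
```

Under those assumptions the query reports 2, because '' and 'PYR' do not match each other. If a later load might supply a value that was previously missing, merging on name alone and setting the other properties afterwards avoids the second node.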
Experiment 3
References:
Neo4j use MERGE with null values
https://github.com/neo4j/neo4j/issues/2521
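This also works, but because the COALESCE statements are outside the MERGE statement, I worry that the data returned by the RETURN statement could cause problems if those variable names are reused elsewhere. As a workaround, I added a prefix (a_) as a quasi-UID, but I think the solution in Experiment 2 above is the better approach.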
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/mnt/Vancouver/Programming/data/metabolism/pg2neo4j/glycolysis_metabolites.csv" AS row
WITH
COALESCE(CASE row.name WHEN '' THEN null ELSE row.name END, '') AS a_name,
COALESCE(CASE row.abbreviation WHEN '' THEN null ELSE row.abbreviation END, '') AS a_abbreviation,
COALESCE(CASE row.kegg_entry WHEN '' THEN null ELSE row.kegg_entry END, '') AS a_kegg_entry
MERGE (a:GlycolysisMetabolites {name:a_name, abbreviation:a_abbreviation, kegg_entry:a_kegg_entry})
// Note: RETURN can only be used at the end of the query
RETURN a_name, a_abbreviation, a_kegg_entry;
Output:
$ cat glycolysis.cypher | cypher-shell -u victoria -p <your_password>
a_name, a_abbreviation, a_kegg_entry
"α-D-glucose", "GLC", "C00267"
"glucose 6-phosphate", "G6P", "C00668"
"fructose 6-phosphate", "F6P", "C05345"
"fructose 1,6-bisphosphate", "FBP", "C05378"
"dihydroxyacetone phosphate", "DHAP", "C00111"
"D-glyceraldehyde 3-phosphate", "", "C00118"
"1,3-bisphosphoglycerate", "1,3-BPG", "C00236"
"3-phosphoglycerate", "3PG", "C00197"
"2-phosphoglycerate", "2PG", "C00631"
"phosphoenolpyruvate", "PEP", "C00074"
"pyruvate", "", "C00022"
Other StackOverflow discussions of this topic/issue: https://stackoverflow.com/search?tab=votes&q=Neo4j%20use%20MERGE%20with%20null%20value
Addendum
Reference (e.g.): Neo4j CSV file load with empty cells
This "works", but it SKIPS creating a node if any field contains a NULL value:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/mnt/Vancouver/Programming/data/metabolism/pg2neo4j/glycolysis_metabolites.csv" AS row
FOREACH (
x IN CASE WHEN row.abbreviation IS NULL OR row.kegg_entry IS NULL THEN [] ELSE [1] END |
MERGE (a:GlycolysisMetabolites {name: row.name, abbreviation: row.abbreviation, kegg_entry: row.kegg_entry})
)
RETURN row.name, row.abbreviation, row.kegg_entry;
Output:
$ cat glycolysis.cypher | cypher-shell -u victoria -p <password>
row.name, row.abbreviation, row.kegg_entry
"α-D-glucose", "GLC", "C00267"
"glucose 6-phosphate", "G6P", "C00668"
"fructose 6-phosphate", "F6P", "C05345"
"fructose 1,6-bisphosphate", "FBP", "C05378"
"dihydroxyacetone phosphate", "DHAP", "C00111"
"D-glyceraldehyde 3-phosphate", NULL, "C00118"
"1,3-bisphosphoglycerate", "1,3-BPG", "C00236"
"3-phosphoglycerate", "3PG", "C00197"
"2-phosphoglycerate", "2PG", "C00631"
"phosphoenolpyruvate", "PEP", "C00074"
"pyruvate", NULL, "C00022"
Note that in the Neo4j Browser only 9 nodes were created (not 11): the nodes for "D-glyceraldehyde 3-phosphate" and "pyruvate" were not created.
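To check which rows the guard skipped, the CSV file can be read again with the same condition, and the created nodes can be counted. This is a sketch that assumes the same file path as above and that no GlycolysisMetabolites nodes existed before the load.

```cypher
// List the rows that the FOREACH guard would skip (any NULL field).
LOAD CSV WITH HEADERS FROM "file:/mnt/Vancouver/Programming/data/metabolism/pg2neo4j/glycolysis_metabolites.csv" AS row
WITH row
WHERE row.abbreviation IS NULL OR row.kegg_entry IS NULL
RETURN row.name AS skipped_row;

// Count the nodes that were actually created.
MATCH (a:GlycolysisMetabolites)
RETURN count(a) AS node_count;
```

For the CSV file shown earlier, the first query should list "D-glyceraldehyde 3-phosphate" and "pyruvate", and the second should agree with the 9 nodes seen in the Neo4j Browser.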