Question

I am trying to apply changes to a graph data model using APOC periodic iterate. I use the LOAD CSV command to parse text data and load articles into neo4j 3.4.4.

USING PERIODIC COMMIT 5000 LOAD CSV WITH HEADERS 
FROM 'file:///article.txt' as r FIELDTERMINATOR '\t' 
MATCH (a:Article {PMID: toInt(r.PMID)}) 
WITH a, toLower(r.ArticleTitle) as text 
WITH a, reduce(t=text, delim in [",",".","!","?",'"',":",";","'","(",")","[","]","{","}"] | replace(t,delim," ")) as text 
WITH a, reduce(t=text, delim in ["/", "\\"] | replace(t, delim, " ")) as text 
WITH a, filter(w in split(text, " ") where length(w) > 2) as words 
SET a.words = words;
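To see what the tokenization steps above produce, the same logic can be run on a literal string without touching any Article node. This is only a sketch: the sample title is made up, the delimiter list is shortened, and a list comprehension with size() stands in for filter() and length(), which should behave the same way on strings in 3.4.

```cypher
// Hypothetical sample title; no Article node is needed for this check.
WITH toLower('Hello, World! (A Test)') AS text
// Replace each delimiter with a space, as in the LOAD CSV query.
WITH reduce(t = text, delim IN [',', '.', '!', '?', '(', ')'] | replace(t, delim, ' ')) AS text
// Splitting on spaces leaves empty strings behind; the size filter drops them
// together with one- and two-character words.
RETURN [w IN split(text, ' ') WHERE size(w) > 2] AS words;
```

For this input the query should return ["hello", "world", "test"].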

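I can create the Word nodes with the following command, which is very sensitive to the amount of data being loaded. The database currently holds 83,000 articles, and the query runs fine in a few minutes.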

MATCH (a:Article) where exists(a.words) 
WITH a  
FOREACH (word in a.words| 
  MERGE (w:Word {Name: word}) 
  MERGE (a) -[r:contains]-> (w) 
  ON CREATE SET r.f = 1 
  ON MATCH SET r.f = r.f + 1
)
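Since the query above MERGEs on Word.Name once per word of every article, its run time probably depends heavily on whether that property is indexed. The following is a sketch under the assumption that no index or constraint exists yet on :Word(Name); the question does not say either way.

```cypher
// Assumption: no index or constraint exists yet on :Word(Name).
// Neo4j 3.x syntax; the uniqueness constraint is backed by an index,
// so MERGE (w:Word {Name: word}) can use a lookup instead of a label scan.
CREATE CONSTRAINT ON (w:Word) ASSERT w.Name IS UNIQUE;
```

Creating the constraint fails if duplicate Word nodes already exist, so it is easiest to run before the Word nodes are generated.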

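So I tried using the apoc.periodic.iterate procedure to process the data in smaller batches. Because the APOC procedure does not allow quotes inside the query, I first create the filtered word array (as shown above) so that it can be used to generate the nodes and relationships.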

CALL apoc.periodic.iterate('MATCH (a:Article) WHERE EXISTS(a.words) RETURN a as art','WITH {art} as a FOREACH (word in a.words | MERGE (w:Word {Name: word}) MERGE (a) -[r:contains]-> (w) ON CREATE SET r.f = 1 ON MATCH SET r.f = r.f + 1)', {batchSize:1000, parallel:true})
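One variant that may be worth testing, offered as an untested assumption rather than a diagnosis: many articles share the same words, so batches running in parallel all MERGE the same Word nodes and compete for locks on them. The sketch below is the same call with parallel set to false. It also assumes an APOC release that passes the returned column into the second statement automatically; if yours does not, keep the explicit WITH {art} as a prefix from the call above.

```cypher
// Same logic as the call above, but batches run one after another,
// so no two transactions MERGE the same Word node at the same time.
CALL apoc.periodic.iterate(
  'MATCH (a:Article) WHERE exists(a.words) RETURN a',
  'FOREACH (word IN a.words |
     MERGE (w:Word {Name: word})
     MERGE (a)-[r:contains]->(w)
     ON CREATE SET r.f = 1
     ON MATCH SET r.f = r.f + 1)',
  {batchSize: 1000, parallel: false});
```

Sequential batches are slower than parallel ones, but each batch is still committed in its own transaction, which is the main reason for using periodic iterate here.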

The above query fails on some batches with the following message:

batches: 83
total: 1233
timeTaken: 2
committedOperations: 573
failedOperations: 82
failedBatches: 82
retries: 0
errorMessages: {}
batch: {
  "total": 83,
  "committed": 1,
  "failed": 82,
  "errors": {
    "java.lang.NullPointerException": 82
  }
}
operations: {
  "total": 1233,
  "committed": 573,
  "failed": 82,
  "errors": {}
}
wasTerminated: false

The data is public but too large to share here, so I cannot work out which entry fails. Moreover, all entries work with the plain Cypher query.
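One way to narrow down the failing entries without sharing the data might be to list articles that have words but ended up with no contains relationship. This is a sketch that assumes the run started from a database with no Word nodes, so that any article without such a relationship belongs to a batch that was rolled back.

```cypher
// Articles with a non-empty words list but no outgoing contains relationship.
MATCH (a:Article)
WHERE exists(a.words) AND size(a.words) > 0 AND NOT (a)-[:contains]->()
RETURN a.PMID AS pmid, size(a.words) AS wordCount
ORDER BY pmid
LIMIT 25;
```

Running the plain FOREACH query on a few of the returned PMIDs would show whether the data itself or the batched execution is at fault.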

Periodic iterate APOC fails with NULL Java pointer exception

0 answers: