Neo4j LOAD CSV - some parts not working

Time: 2015-05-18 16:56:16

Tags: neo4j cypher load-csv

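I have a problem when importing from CSV.

I run the following in the shell, and the last part (MERGE (e1)-[:NEXT]->(hit)) never happens. A bit frustrating...

Each session has x hits. I want to find the last hit inserted for the session and connect it to the new hit with a NEXT relationship.

PSV sample: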

SESSION_ID | DATE_TIME
XXX | 2015-01-01T01:00:00
XXX | 2015-02-02T09:00:00
YYY | 2015-03-03T06:00:44

Code:

USING PERIODIC COMMIT 100
 LOAD CSV WITH HEADERS FROM 'file:///home/xxx.csv' AS line FIELDTERMINATOR '|'


 MERGE (session :Session { session_id:line.session_id })
 MERGE (hit:Hit{date:line.date_time})

// ........ more merges ......

// relationships

CREATE (hit)-[:IN_SESSION]->(session) 
 CREATE ....//more relations

 WITH session

 MATCH (prev_hit:Hit)-[:IN_SESSION]->(session)
 WITH prev_hit ORDER BY prev_hit.date_time DESC LIMIT 2
 WITH collect(prev_hit) as entries

 FOREACH(i in RANGE(0, length(entries)-1) | 
   FOREACH(e1 in [entries[i]] | 
        MERGE (e1)-[:NEXT]->(hit)))

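1 answer:

Answer 0: (score: 3)

I don't see what you are trying to achieve with the nested FOREACH loops.

If you really have both the hit node and the session node at that point, a simple MERGE should be fine. I think you have to include hit in the WITH clause; otherwise hit is no longer in scope when the final MERGE runs.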

MERGE (session :Session { id: "xxx" })
MERGE (hit:Hit { date_time:"2015-04-03T06:00:44" })
CREATE (hit)-[:IN_SESSION]->(session)
WITH session, hit
MATCH (prev_hit:Hit)-[:IN_SESSION]->(session)
WHERE prev_hit <> hit // make sure that you only match other hits
WITH hit, prev_hit 
ORDER BY prev_hit.date_time DESC LIMIT 1
MERGE (prev_hit)-[:NEXT]->(hit) // create relationship between the two 

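Update

I updated the query so that prev_hit only matches hits other than the current one. The query above works the way you want, i.e. it creates a single NEXT relationship from one Hit node that belongs to the same Session. See here: http://console.neo4j.org/?id=ov7mer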

There might be a problem with date_time. I think you store it as a string, so the ordering may not always give you the expected result.
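To make the ordering concern concrete, here is a small plain-Python sketch (not Cypher). It compares lexicographic string ordering with ordering by parsed timestamp. The zero-padded values come from the sample above; the non-padded values are invented purely to show the failure mode.

```python
from datetime import datetime

# Zero-padded ISO 8601 strings, as in the PSV sample.
iso_values = ["2015-03-03T06:00:44", "2015-01-01T01:00:00", "2015-02-02T09:00:00"]

# Hypothetical non-padded values, invented for illustration only.
loose_values = ["2015-10-1 9:00:00", "2015-2-2 10:00:00"]


def sorted_as_strings(values):
    """Sort lexicographically, the way plain string values compare."""
    return sorted(values)


def sorted_as_datetimes(values, fmt):
    """Sort by the parsed timestamp instead of the raw text."""
    return sorted(values, key=lambda v: datetime.strptime(v, fmt))


iso_string_order = sorted_as_strings(iso_values)
iso_time_order = sorted_as_datetimes(iso_values, "%Y-%m-%dT%H:%M:%S")
loose_string_order = sorted_as_strings(loose_values)
loose_time_order = sorted_as_datetimes(loose_values, "%Y-%m-%d %H:%M:%S")
```

With consistently zero-padded ISO 8601 strings the two orderings agree, so the concern mainly applies if the format in the file is not uniform.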

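Update 2

Regarding your second comment: if you go through the file line by line and add Hit nodes, you can only add relationships to Hit nodes that have already been added. If you want a consecutive chain of NEXT relationships between the Hit nodes, you can only do this in a single query if you make sure the entries of the CSV file are sorted by date_time in ascending order.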
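The effect of row order can be simulated in plain Python. This is only a sketch of the idea, not how Neo4j executes the import: each row is linked from the latest hit already seen for its session, mirroring ORDER BY ... DESC LIMIT 1. The third timestamp is invented, because the sample contains only two hits for session XXX.

```python
def link_incrementally(rows):
    """Process rows in file order and link each new hit from the latest
    hit already stored for the same session."""
    hits_by_session = {}
    next_rels = []
    for session_id, date_time in rows:
        seen = hits_by_session.setdefault(session_id, [])
        if seen:
            prev = max(seen)  # latest hit inserted so far for this session
            next_rels.append((prev, date_time))
        seen.append(date_time)
    return next_rels


JAN = "2015-01-01T01:00:00"
FEB = "2015-02-02T09:00:00"
MAR = "2015-03-03T06:00:00"  # invented third hit for session XXX

sorted_rows = [("XXX", JAN), ("XXX", FEB), ("XXX", MAR)]
unsorted_rows = [("XXX", JAN), ("XXX", MAR), ("XXX", FEB)]

chain_sorted = link_incrementally(sorted_rows)
chain_unsorted = link_incrementally(unsorted_rows)
```

With sorted input the links follow time order (JAN to FEB, FEB to MAR). With unsorted input the late-arriving FEB hit is attached after MAR, so the chain no longer reflects chronology.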

You can add the NEXT relationships between the Hit nodes later, as described here: http://www.markhneedham.com/blog/2014/04/19/neo4j-cypher-creating-relationships-between-a-collection-of-nodes-invalid-input/

Start your query with:

MATCH (s:Session)--(hit:Hit)
// first order by hit.date_time
WITH DISTINCT s, hit ORDER BY hit.date_time DESC
// this will return one row per session with the hits in a collection
WITH s, collect(hit) AS this_session_hits
// try this to check the ordering:
// RETURN s.session_id, this_session_hits
// the following queries will be done on each row, this is like iterating over the sessions
FOREACH(i in RANGE(0, length(this_session_hits)-2) |
  FOREACH(e1 in [this_session_hits[i]] |
    FOREACH(e2 in [this_session_hits[i+1]] |
      MERGE (e1)-[:NEXT]->(e2))))

Final answer ;)

This query works for the data set in the neo4j console (http://console.neo4j.org/?id=mginka). It links all Hit nodes of a session with NEXT relationships:

MATCH (s:Session)<--(hit:Hit)
WITH DISTINCT s, hit ORDER BY hit.date_time ASC
WITH s, collect(hit) AS this_session_hits
FOREACH (i IN RANGE(0, length(this_session_hits)-2)|
  FOREACH (e1 IN [this_session_hits[i]]|
    FOREACH (e2 IN [this_session_hits[i+1]]|
      MERGE (e1)-[:NEXT]->(e2))))
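The pairing logic of the final query can be sketched in plain Python for readers who want to check it outside Neo4j. The function name next_pairs and the tuple layout are assumptions made for this illustration: hits are grouped per session, de-duplicated and sorted ascending, then each element at index i is paired with the one at i+1.

```python
from collections import defaultdict


def next_pairs(hits):
    """hits: iterable of (session_id, date_time) tuples.
    Returns (session_id, earlier, later) tuples, one per NEXT link."""
    by_session = defaultdict(list)
    for session_id, date_time in hits:
        by_session[session_id].append(date_time)

    pairs = []
    for session_id in sorted(by_session):
        # Analogous to WITH DISTINCT ... ORDER BY hit.date_time ASC
        ordered = sorted(set(by_session[session_id]))
        # Indices 0 .. len-2, each paired with i+1
        for i in range(len(ordered) - 1):
            pairs.append((session_id, ordered[i], ordered[i + 1]))
    return pairs


# Rows from the PSV sample, deliberately out of order.
sample = [
    ("XXX", "2015-02-02T09:00:00"),
    ("YYY", "2015-03-03T06:00:44"),
    ("XXX", "2015-01-01T01:00:00"),
]
pairs = next_pairs(sample)
```

Session YYY has a single hit, so it produces no pair; session XXX produces one link from the earlier hit to the later one, regardless of the order in which the rows arrived.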
