Is this normal query performance?

Date: 2015-01-28 03:08:05

Tags: performance neo4j cypher

I have the following graph model:

(:PageView {Number:int, Page:string}), (:Page {Name:string})
(:PageView)-[:At]->(:Page)
(:PageView)-[:Next]->(:PageView)

Schema:

Indexes
  ON :Page(Name)            ONLINE (for uniqueness constraint)
  ON :PageView(Page)        ONLINE
  ON :PageView(Revision)    ONLINE (for uniqueness constraint)

Constraints
  ON (pageview:PageView) ASSERT pageview.Number IS UNIQUE
  ON (page:Page) ASSERT page.Name IS UNIQUE

I want to do something similar to what is described in this post.

I am trying to find the most popular paths of this structure that contain no loops:

(:PageView)-[:Next*2]->(:PageView)

My attempts:

1. Nicole White's approach from this post:
MATCH p = (:PageView)-[:Next*2]->(:PageView)
WITH p, EXTRACT(v IN NODES(p) | v.Page) AS pages 
UNWIND pages AS views 
WITH p, COUNT(DISTINCT views) AS distinct_views 
WHERE distinct_views = LENGTH(NODES(p)) 
RETURN EXTRACT(v in NODES(p) | v.Page), count(p)
ORDER BY count(p) DESC
LIMIT 10;

Profile output:

10 rows
177270 ms

Compiler CYPHER 2.2-rule

ColumnFilter(0)
  |
  +Extract(0)
    |
    +ColumnFilter(1)
      |
      +Top
        |
        +EagerAggregation(0)
          |
          +Extract(1)
            |
            +ColumnFilter(2)
              |
              +Filter(0)
                |
                +Extract(2)
                  |
                  +ColumnFilter(3)
                    |
                    +EagerAggregation(1)
                      |
                      +UNWIND
                        |
                        +ColumnFilter(4)
                          |
                          +Extract(3)
                            |
                            +ExtractPath
                              |
                              +Filter(1)
                                |
                                +TraversalMatcher

+---------------------+---------+----------+------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
|            Operator |    Rows |   DbHits |                                                            Identifiers |                                                                                          Other |
+---------------------+---------+----------+------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+
|     ColumnFilter(0) |      10 |        0 |                              EXTRACT(v in NODES(p) | v.Page), count(p) |                                         keep columns EXTRACT(v in NODES(p) | v.Page), count(p) |
|          Extract(0) |      10 |        0 |    FRESHID225,   FRESHID258, EXTRACT(v in NODES(p) | v.Page), count(p) |                                                      EXTRACT(v in NODES(p) | v.Page), count(p) |
|     ColumnFilter(1) |      10 |        0 |                                               FRESHID225,   FRESHID258 |                                                                                keep columns ,  |
|                 Top |      10 |        0 |   FRESHID225,   INTERNAL_AGGREGATEf7fa022b-cdb5-4ef2-bec5-a2f4f10706b6 | {  AUTOINT0}; Cached(  INTERNAL_AGGREGATEf7fa022b-cdb5-4ef2-bec5-a2f4f10706b6 of type Integer) |
| EagerAggregation(0) |  212828 |        0 |   FRESHID225,   INTERNAL_AGGREGATEf7fa022b-cdb5-4ef2-bec5-a2f4f10706b6 |                                                                                                |
|          Extract(1) | 1749120 | 10494720 |                                          FRESHID225, distinct_views, p |                                                                                                |
|     ColumnFilter(2) | 1749120 |        0 |                                                      distinct_views, p |                                                                 keep columns distinct_views, p |
|           Filter(0) | 1749120 |        0 |                                          FRESHID196, distinct_views, p |                                                                    CoercedPredicate(anon[196]) |
|          Extract(2) | 2115766 |        0 |                                          FRESHID196, distinct_views, p |                                                                                                |
|     ColumnFilter(3) | 2115766 |        0 |                                                      distinct_views, p |                                                                 keep columns p, distinct_views |
| EagerAggregation(1) | 2115766 |        0 |              INTERNAL_AGGREGATEb0939c81-a40c-4012-afd6-4852b17cf2e4, p |                                                                                              p |
|              UNWIND | 6347298 |        0 |                                                        p, pages, views |                                                                                                |
|     ColumnFilter(4) | 2115766 |        0 |                                                               p, pages |                                                                          keep columns p, pages |
|          Extract(3) | 2115766 | 12694596 |                                                               p, pages |                                                                                          pages |
|         ExtractPath | 2115766 |        0 |                                                                      p |                                                                                                |
|           Filter(1) | 2115766 |  2115766 |                                                                        |                                                                 hasLabel(anon[34]:PageView(0)) |
|    TraversalMatcher | 2115766 | 16926150 |                                                                        |                                                                                         , , ,  |
+---------------------+---------+----------+------------------------------------------------------------------------+------------------------------------------------------------------------------------------------+

Total database accesses: 42231232


2.

match (p1:PageView)-[:Next]->(p2:PageView)-[:Next]->(p3:PageView)
where p1.Page<>p2.Page and p1.Page<>p3.Page and p2.Page<>p3.Page
RETURN [p1.Page,p2.Page,p3.Page], count(*) as count
ORDER BY count DESC
LIMIT 10;

Profile output:

10 rows
28660 ms

Compiler CYPHER 2.2-cost

Projection(0)
  |
  +Top
    |
    +EagerAggregation
      |
      +Projection(1)
        |
        +Filter(0)
          |
          +Expand(0)
            |
            +Filter(1)
              |
              +Expand(1)
                |
                +NodeByLabelScan

+------------------+---------------+---------+----------+------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|         Operator | EstimatedRows |    Rows |   DbHits |                                    Identifiers |                                                                                                                                                                    Other |
+------------------+---------------+---------+----------+------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|    Projection(0) |          1241 |      10 |        0 |   FRESHID146, [p1.Page,p2.Page,p3.Page], count |                                                                                                                                         [p1.Page,p2.Page,p3.Page], count |
|              Top |          1241 |      10 |        0 |                              FRESHID146, count |                                                                                                                                                      {  AUTOINT0}; count |
| EagerAggregation |          1241 |  212828 |        0 |                              FRESHID146, count |                                                                                                                                                                          |
|    Projection(1) |       1542393 | 1749120 | 10494720 |                         FRESHID146, p1, p2, p3 |                                                                                                                                                                          |
|        Filter(0) |       1542393 | 1749120 | 17872173 |                                     p1, p2, p3 | (((hasLabel(p3:PageView(0)) AND NOT(Property(p1,Page(3)) == Property(p3,Page(3)))) AND NOT(anon[20] == anon[43])) AND NOT(Property(p2,Page(3)) == Property(p3,Page(3)))) |
|        Expand(0) |       1904189 | 1985797 |  3971596 |                                     p1, p2, p3 |                                                                                                                                                       (p2)-[:Next]->(p3) |
|        Filter(1) |       1904191 | 1985799 | 10578840 |                                         p1, p2 |                                                                                         (NOT(Property(p1,Page(3)) == Property(p2,Page(3))) AND hasLabel(p2:PageView(0))) |
|        Expand(1) |       2115767 | 2115768 |  4231538 |                                         p1, p2 |                                                                                                                                                       (p1)-[:Next]->(p2) |
|  NodeByLabelScan |       2115770 | 2115770 |  2115771 |                                             p1 |                                                                                                                                                                :PageView |
+------------------+---------------+---------+----------+------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

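3. (This one returns paths with loops!? I don't know why. I assumed that if the identifiers are different, then the nodes are different.)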

match (pv1:PageView)-[:Next]->(pv2:PageView)-[:Next]->(pv3:PageView),
(pv1)-[:At]->(p1),(pv2)-[:At]->(p2),(pv3)-[:At]->(p3)
RETURN [p1.Name,p2.Name,p3.Name], count(*) as count
ORDER BY count DESC
LIMIT 10;

Profile output:

10 rows
27678 ms

Compiler CYPHER 2.2-cost

Projection(0)
  |
  +Top
    |
    +EagerAggregation
      |
      +Projection(1)
        |
        +Filter(0)
          |
          +Expand(0)
            |
            +Filter(1)
              |
              +Expand(1)
                |
                +Filter(2)
                  |
                  +Expand(2)
                    |
                    +Filter(3)
                      |
                      +Expand(3)
                        |
                        +Expand(4)
                          |
                          +NodeByLabelScan

+------------------+---------------+---------+----------+------------------------------------------------+------------------------------------------------------------+
|         Operator | EstimatedRows |    Rows |   DbHits |                                    Identifiers |                                                      Other |
+------------------+---------------+---------+----------+------------------------------------------------+------------------------------------------------------------+
|    Projection(0) |          1454 |      10 |        0 |   FRESHID139, [p1.Name,p2.Name,p3.Name], count |                           [p1.Name,p2.Name,p3.Name], count |
|              Top |          1454 |      10 |        0 |                              FRESHID139, count |                                        {  AUTOINT0}; count |
| EagerAggregation |          1454 |  223557 |        0 |                              FRESHID139, count |                                                            |
|    Projection(1) |       2115760 | 2115764 | 12694584 |          FRESHID139, p1, p2, p3, pv1, pv2, pv3 |                                                            |
|        Filter(0) |       2115760 | 2115764 |        0 |                      p1, p2, p3, pv1, pv2, pv3 | (NOT(anon[116] == anon[80]) AND NOT(anon[80] == anon[98])) |
|        Expand(0) |       2115760 | 2115764 |  4231530 |                      p1, p2, p3, pv1, pv2, pv3 |                                          (pv1)-[:At]->(p1) |
|        Filter(1) |       2115762 | 2115766 |  2115766 |                          p2, p3, pv1, pv2, pv3 |  (hasLabel(pv1:PageView(0)) AND NOT(anon[21] == anon[45])) |
|        Expand(1) |       2115762 | 2115766 |  4231532 |                          p2, p3, pv1, pv2, pv3 |                                       (pv2)<-[:Next]-(pv1) |
|        Filter(2) |       2115764 | 2115766 |        0 |                               p2, p3, pv2, pv3 |                                 NOT(anon[116] == anon[98]) |
|        Expand(2) |       2115764 | 2115766 |  4231534 |                               p2, p3, pv2, pv3 |                                          (pv2)-[:At]->(p2) |
|        Filter(3) |       2115766 | 2115768 |  2115768 |                                   p3, pv2, pv3 |                                  hasLabel(pv2:PageView(0)) |
|        Expand(3) |       2115765 | 2115768 |  4231536 |                                   p3, pv2, pv3 |                                       (pv3)<-[:Next]-(pv2) |
|        Expand(4) |       2115767 | 2115768 |  4231538 |                                        p3, pv3 |                                          (pv3)-[:At]->(p3) |
|  NodeByLabelScan |       2115770 | 2115770 |  2115771 |                                            pv3 |                                                  :PageView |
+------------------+---------------+---------+----------+------------------------------------------------+------------------------------------------------------------+

System info:
Windows 8.1
250G SSD
Neo4j Enterprise 2.2.0-M02
cache: hpc
RAM: 8G
JVM heap size: 4G
memory mapping: 50%
149 (:Page) nodes
2115770 (:PageView) nodes

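Why is even the fastest of these three approaches so slow? (I believe all my data fits in RAM.)
What is the best way to filter out paths that contain loops?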

1 Answer:

Answer 0: (score: 0)

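By specifying a label on every identifier, you force Cypher to open each node's header and check its labels.

This is where your relationship types matter. Relationships are there to guide you through the graph, so for performance you do not need to specify labels on every node. If you are sure that all nodes along the path have the PageView label, simply omit it everywhere except at the start of the query: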

match (p1:PageView)-[:Next]->(p2)-[:Next]->(p3)
where p1.Page<>p2.Page and p1.Page<>p3.Page and p2.Page<>p3.Page
RETURN [p1.Page,p2.Page,p3.Page], count(*) as count
ORDER BY count DESC
LIMIT 10;

I posted some query plan results in an answer related to your question: Neo4j: label vs. indexed property?
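
On the loops in attempt 3: within a single MATCH pattern, Cypher guarantees only that the same relationship is not traversed twice; different identifiers may still bind to the same node. The Filter operators in that profile (for example NOT(anon[116] == anon[80])) appear to compare the anonymous relationships, not the Page nodes. The following is a minimal sketch of one way to exclude such paths, not a measured optimization. It assumes, as the model in the question suggests, that :Next only connects PageView nodes and that every PageView has exactly one :At relationship; the `pages` alias is introduced here for illustration.

```cypher
// Sketch: attempt 3 with explicit inequality between the Page nodes.
// The label is kept only on the starting node; this assumes :Next
// never leads to a node that is not a PageView.
MATCH (pv1:PageView)-[:Next]->(pv2)-[:Next]->(pv3),
      (pv1)-[:At]->(p1), (pv2)-[:At]->(p2), (pv3)-[:At]->(p3)
// Distinct identifiers can bind to the same node, so require
// three different pages explicitly.
WHERE p1 <> p2 AND p1 <> p3 AND p2 <> p3
RETURN [p1.Name, p2.Name, p3.Name] AS pages, count(*) AS count
ORDER BY count DESC
LIMIT 10;
```

Comparing nodes directly avoids reading the Page property from each PageView, but whether it is actually faster on this data set would need to be confirmed by prefixing the query with PROFILE, as was done for the attempts in the question.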