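Limit and sort collected nodes by their total number of unique occurrences

Time: 2019-01-24 20:21:17

Tags: neo4j cypher

I have been playing with neo4j for a few weeks. It's amazing, I'm slowly getting the hang of it, and now I have a proper use case.

I have a horticulture exam in a few weeks. I need to know a fair number of plants by their characteristics or uses, and the purpose of the graph is to find the smallest set of plants to learn (some plants fit more than one question).

In my graph, these characteristics are (Feature) nodes.

The exam has 30 (Question) nodes.

Each (Question) has a [FIND]-> relationship to a (FeatureSet), where {number} says how many examples of different plants are required.

A (FeatureSet) [INCLUDES]-> one or more (Feature) nodes: things like "evergreen", "tree", "shrub", "bulb", "winter flowering", "patio plant".

(For example: name 10 shrubs, 5 evergreen shrubs, 5 winter pot plants.)

For every characteristic a plant in the database has, I create a (Plant)-[FEATURES]->(Feature) relationship.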
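To make the model concrete, here is a minimal sketch of a query that lists what each question asks for. It uses only the labels, relationship types and properties that appear in the CREATE script further down (`Question.id`, `Question.q`, `FIND.number`, `Feature.feature`); the column aliases are illustrative.

```cypher
// One row per question: how many plants it needs and which features they must have
MATCH (q:Question)-[find:FIND]->(fs:FeatureSet)-[:INCLUDES]->(f:Feature)
RETURN q.id AS id,
       q.q AS question,
       find.number AS needed,
       collect(f.feature) AS features
ORDER BY id
```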

I can run a query that shows me all the plants matching every feature in the FeatureSet specified by a question:

//SIMPLE RESULTS
MATCH (q:Question)-[find:FIND]->(fs:FeatureSet)

WITH fs, q, find.number as qNum
//GET ALL THE FEATURE SETS
MATCH (fInSet:Feature)<-[inc:INCLUDES]-(fs)
WITH fs, q , qNum, COLLECT(fInSet.feature) as fInSetList
MATCH (p:Plant)-[:FEATURES]->(fOfPlant:Feature)
WHERE fOfPlant.feature IN fInSetList
//Only plants matching all features
WITH fs, q , qNum, p, fInSetList, size(fInSetList) as inputCnt, count(DISTINCT fOfPlant) as cnt
WHERE cnt = inputCnt

RETURN
q.id,
'Give ' + qNum + ' examples of ' + q.q as Question,
fInSetList as plantFeaturesReqd,
COLLECT(
p.name
)
as plantsFound
ORDER BY q.id
q.id    Question    plantFeaturesReqd   plantsFound
"1" "Give 1 examples of Evergreen Shrubs"   ["evergreen", "shrub"]  ["Rhododendron"]
"2" "Give 3 examples of Shrubs" ["shrub"]   ["Cornus", "Wych Hazel", "Buddleja", "Rhododendron"]
"3" "Give 1 examples of Deciduous Shrubs"   ["deciduous", "shrub"]  ["Cornus", "Wych Hazel"]

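I can change the RETURN statement to find how many times a plant occurs in the total result set, which gives it a weight when deciding whether to learn it: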

RETURN
DISTINCT p.name as dName,
count(p.name) as dCount
ORDER BY dCount DESC
"Wych Hazel"    2
"Cornus"    2
"Rhododendron"  2
"Buddleja"  1

From the results I can see that I can get away without learning Buddleja.

I would like to achieve the following:

q.id    Question    plantFeaturesReqd   plantsFound
"1" "Give 1 examples of Evergreen Shrubs"   ["evergreen", "shrub"]  [{p:"Rhododendron", w:2}]
"2" "Give 3 examples of Shrubs" ["shrub"]   [{p:"Cornus",w:2}, {p:"Wych Hazel",w:2}, {p:"Rhododendron", w:2}, {p:"Buddleja",w:1}]
"3" "Give 1 examples of Deciduous Shrubs"   ["deciduous", "shrub"]  [{p:"Cornus",w:2}, {p:"Wych Hazel",w:2}]

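and then, for each question, take the required number of plants (the {number} on [FIND]), ordered by weight {w}.

For q3 (1 deciduous shrub) I may have several options of the same weight, but I would only need to learn that the plant is deciduous, because I would already know it is a shrub from q2 (2 shrubs).

Similarly, if I only needed to learn 2 shrubs, it would be a toss-up over whether to drop Cornus or Wych Hazel, which have the same weight.

So I would like (at a later stage) to adjust this weight for other reasons (e.g. I can remember the Latin name more easily, or I just like the plant!). Incorporating these preferences must also avoid "breaking" the selection for other questions.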
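One possible way to fold in such preferences without disturbing the weight ranking is to use them only as a tie-breaker, that is, as a secondary sort key. The sketch below works on literal rows rather than the graph, and `preference` is a hypothetical value that does not exist in the data shown here.

```cypher
// Literal rows stand in for (plant, weight) results; `preference` is hypothetical
UNWIND [
  {plant: 'Cornus',     w: 2, preference: 0},
  {plant: 'Wych Hazel', w: 2, preference: 1},
  {plant: 'Buddleja',   w: 1, preference: 0}
] AS row
RETURN row.plant AS plant, row.w AS w, row.preference AS preference
// Weight stays the primary key; preference only reorders plants of equal weight
ORDER BY w DESC, preference DESC, plant
```

With this input Wych Hazel sorts ahead of Cornus, and Buddleja stays last. Whether a tie-breaker alone is enough to avoid "breaking" other questions on a larger data set is not something this sketch establishes.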

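Obviously this makes more sense in practice, with many more questions (FeatureSets) and thousands of plants to choose from.

SVG of graph (sorry, I can't upload the SVG here, and a PNG doesn't come out well)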

CREATE.cypher

CREATE 
  (`0` :Feature {feature:'deciduous'}) ,
  (`1` :FeatureSet ) ,
  (`2` :Feature {feature:'shrub'}) ,
  (`3` :Feature {feature:'evergreen'}) ,
  (`5` :FeatureSet ) ,
  (`9` :Plant {name:'Rhododendron'}) ,
  (`13` :Question {id:'1',q:'Evergreen Shrubs'}) ,
  (`14` :Feature {feature:'semi-evergreen'}) ,
  (`15` :Plant {name:'Buddleja'}) ,
  (`16` :Question {id:'2',q:'Shrubs'}) ,
  (`17` :FeatureSet ) ,
  (`18` :Question {id:'3',q:'Deciduous Shrubs'}) ,
  (`19` :Plant {name:'Cornus'}) ,
  (`29` :Feature {feature:'winter-interest-flowers'}) ,
  (`30` :Plant {name:'Wych Hazel'}) ,
  (`1`)-[:`INCLUDES` ]->(`0`),
  (`1`)-[:`INCLUDES` ]->(`2`),
  (`5`)-[:`INCLUDES` ]->(`3`),
  (`5`)-[:`INCLUDES` ]->(`2`),
  (`9`)-[:`FEATURES` ]->(`2`),
  (`9`)-[:`FEATURES` ]->(`3`),
  (`13`)-[:`FIND` {number:1}]->(`5`),
  (`15`)-[:`FEATURES` ]->(`14`),
  (`15`)-[:`FEATURES` ]->(`2`),
  (`17`)-[:`INCLUDES` ]->(`2`),
  (`16`)-[:`FIND` {number:3}]->(`17`),
  (`18`)-[:`FIND` {number:1}]->(`1`),
  (`19`)-[:`FEATURES` ]->(`0`),
  (`19`)-[:`FEATURES` ]->(`2`),
  (`30`)-[:`FEATURES` ]->(`29`),
  (`30`)-[:`FEATURES` ]->(`0`),
  (`30`)-[:`FEATURES` ]->(`2`)

ARROWS.html

<ul class="graph-diagram-markup" data-internal-scale="1" data-external-scale="1">
  <li class="node" data-node-id="0" data-x="-162.05996704101562" data-y="-608.8465919494629">
    <span class="caption">Feature</span><dl class="properties"><dt>feature</dt><dd>deciduous</dd></dl></li>
  <li class="node" data-node-id="1" data-x="-547.4232482910156" data-y="75.07541275024414">
    <span class="caption">FeatureSet</span>
  </li>
  <li class="node" data-node-id="2" data-x="-1049.3427429199219" data-y="-608.8465919494629">
    <span class="caption">Feature</span><dl class="properties"><dt>feature</dt><dd>shrub</dd></dl></li>
  <li class="node" data-node-id="3" data-x="-2028.5903301239014" data-y="-608.8465919494629">
    <span class="caption">Feature</span><dl class="properties"><dt>feature</dt><dd>evergreen</dd></dl></li>
  <li class="node" data-node-id="5" data-x="-1586.3723907470703" data-y="44.78205490112305">
    <span class="caption">FeatureSet</span>
  </li>
  <li class="node" data-node-id="9" data-x="-1549.8740997314453" data-y="-1406.8265972137451">
    <span class="caption">Plant</span><dl class="properties"><dt>name</dt><dd>Rhododendron</dd></dl></li>
  <li class="node" data-node-id="13" data-x="-1586.3723907470703" data-y="582.2714309692383">
    <span class="caption">Question</span><dl class="properties"><dt>id</dt><dd>1</dd><dt>q</dt><dd>Evergreen Shrubs</dd></dl></li>
  <li class="node" data-node-id="14" data-x="-2757.7551736831665" data-y="-641.2539367675781">
    <span class="caption">Feature</span><dl class="properties"><dt>feature</dt><dd>semi-evergreen</dd></dl></li>
  <li class="node" data-node-id="15" data-x="-2083.2289657592773" data-y="-1406.8265972137451">
    <span class="caption">Plant</span><dl class="properties"><dt>name</dt><dd>Buddleja</dd></dl></li>
  <li class="node" data-node-id="16" data-x="-1106.6406211853027" data-y="561.1691665649414">
    <span class="caption">Question</span><dl class="properties"><dt>id</dt><dd>2</dd><dt>q</dt><dd>Shrubs</dd></dl></li>
  <li class="node" data-node-id="17" data-x="-1106.6406211853027" data-y="44.78205490112305">
    <span class="caption">FeatureSet</span>
  </li>
  <li class="node" data-node-id="18" data-x="-547.4232482910156" data-y="561.1691665649414">
    <span class="caption">Question</span><dl class="properties"><dt>id</dt><dd>3</dd><dt>q</dt><dd>Deciduous Shrubs</dd></dl></li>
  <li class="node" data-node-id="19" data-x="-677.7754373550415" data-y="-1343.9512340724468">
    <span class="caption">Plant</span><dl class="properties"><dt>name</dt><dd>Cornus</dd></dl></li>
  <li class="node" data-node-id="29" data-x="573.2279720306396" data-y="-608.8465919494629">
    <span class="caption">Feature</span><dl class="properties"><dt>feature</dt><dd>winter-interest-flowers</dd></dl></li>
  <li class="node" data-node-id="30" data-x="164.77948760986328" data-y="-1343.9512340724468">
    <span class="caption">Plant</span><dl class="properties"><dt>name</dt><dd>Wych Hazel</dd></dl></li>
  <li class="relationship" data-from="1" data-to="0">
    <span class="type">INCLUDES</span>
  </li>
  <li class="relationship" data-from="1" data-to="2">
    <span class="type">INCLUDES</span>
  </li>
  <li class="relationship" data-from="5" data-to="3">
    <span class="type">INCLUDES</span>
  </li>
  <li class="relationship" data-from="5" data-to="2">
    <span class="type">INCLUDES</span>
  </li>
  <li class="relationship" data-from="9" data-to="2">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="9" data-to="3">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="13" data-to="5">
    <span class="type">FIND</span><dl class="properties"><dt>number</dt><dd>1</dd></dl></li>
  <li class="relationship" data-from="15" data-to="14">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="15" data-to="2">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="17" data-to="2">
    <span class="type">INCLUDES</span>
  </li>
  <li class="relationship" data-from="16" data-to="17">
    <span class="type">FIND</span><dl class="properties"><dt>number</dt><dd>3</dd></dl></li>
  <li class="relationship" data-from="18" data-to="1">
    <span class="type">FIND</span><dl class="properties"><dt>number</dt><dd>1</dd></dl></li>
  <li class="relationship" data-from="19" data-to="0">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="19" data-to="2">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="30" data-to="29">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="30" data-to="0">
    <span class="type">FEATURES</span>
  </li>
  <li class="relationship" data-from="30" data-to="2">
    <span class="type">FEATURES</span>
  </li>
</ul>

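Update

I have updated this with a simpler example; it now includes a limited create script, the arrows markup and a link to the SVG.

2 Answers:

Answer 0: (score: 0)

If I understand correctly, you are almost there; only two things are missing.

  1. Order by weight, so that the order is kept after grouping
  2. Create and collect a map per question: collect({name: p.name, weight: weight})

    • Why is the number for the question stored on FIND?

Please try the following query: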

MATCH (q:Question)-[find:FIND]->(fs:FeatureSet)

WITH fs, q, find.number as qNum
//GET ALL THE FEATURE SETS
MATCH (fInSet:Feature)<-[inc:INCLUDES]-(fs)
WITH fs, q , qNum, COLLECT(fInSet.feature) as fInSetList
MATCH (p:Plant)-[:FEATURES]->(fOfPlant:Feature)
     WHERE fOfPlant.feature IN fInSetList
//Only plants matching all features
WITH fs, q , qNum, p, fInSetList, 
     size(fInSetList) as inputCnt, count(DISTINCT fOfPlant) as cnt
     WHERE cnt = inputCnt
WITH *
// order by cnt == weight
ORDER BY cnt DESC
RETURN
q.id,
'Give ' + qNum + ' examples of ' + q.q as Question,
fInSetList as plantFeaturesReqd,
// collect maps
COLLECT({plant: p.name, weight: cnt}) as plantsFound
ORDER BY q.id
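Note that after the `WHERE cnt = inputCnt` filter, `cnt` is the number of matched features and is therefore the same for every plant of a given question, so ordering by it may not rank plants by how many questions they answer. Below is a sketch of one way to compute that cross-question weight, assuming the data model from the question. It collects each plant's question hits, takes the size of that list as the weight `w`, then unwinds and regroups per question. The names `hits`, `w` and `ranked` are illustrative, and the query has only been reasoned through against the small sample data. It also relies on `collect()` keeping the order set by the preceding `ORDER BY`.

```cypher
// 1. Features required by each question
MATCH (q:Question)-[find:FIND]->(fs:FeatureSet)-[:INCLUDES]->(f:Feature)
WITH q, find.number AS qNum, collect(f.feature) AS required
// 2. Plants that have every required feature
MATCH (p:Plant)-[:FEATURES]->(pf:Feature)
WHERE pf.feature IN required
WITH q, qNum, required, p, count(DISTINCT pf) AS matched
WHERE matched = size(required)
// 3. Weight = number of questions the plant can answer
WITH p, collect({q: q, qNum: qNum, required: required}) AS hits
WITH p, hits, size(hits) AS w
UNWIND hits AS hit
// 4. Rank within each question by weight (name as a deterministic tie-break)
WITH hit.q AS q, hit.qNum AS qNum, hit.required AS required, p, w
ORDER BY w DESC, p.name
WITH q, qNum, required, collect({p: p.name, w: w}) AS ranked
RETURN q.id AS id,
       'Give ' + toString(qNum) + ' examples of ' + q.q AS Question,
       required AS plantFeaturesReqd,
       ranked[0..qNum] AS plantsFound   // keep only the number the question needs
ORDER BY id
```

On the sample data, Buddleja (weight 1) should be the plant that drops out of the three required for question 2.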

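Answer 1: (score: 0)

OK, a few hours later...

This query gives me results that basically answer the question.

Edit: unfortunately, after further testing on a wider data set, the sorting still does not work as expected, so this answer is incomplete.

The next step is to customise the choice between alternative plants of equal weight according to my own preferences (so far I have only found approaches that do not do this correctly, so these cases are not handled yet).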

//SIMPLE RESULTS
MATCH (q:Question)-[find:FIND]->(fs:FeatureSet)

WITH fs, q, find.number as qNum
//GET ALL THE FEATURE SETS
MATCH (fInSet:Feature)<-[inc:INCLUDES]-(fs)
WITH fs, q , qNum, COLLECT(fInSet.feature) as fInSetList
MATCH (p:Plant)-[:FEATURES]->(fOfPlant:Feature)
WHERE fOfPlant.feature IN fInSetList
//Only plants matching all features
WITH fs, q , qNum, p, fInSetList, size(fInSetList) as inputCnt, count(DISTINCT fOfPlant) as cnt, 
collect(p.name) as pALL //** THIS IS IMPORTANT, BUT I DON'T UNDERSTAND QUITE WHY **//
WHERE cnt = inputCnt
WITH *, count(p) as pCnt
ORDER BY pCnt //** THIS RANKS BY WEIGHT BASED ON TOTAL OCCURRENCES **//
RETURN
q.id,
'Give ' + qNum + ' examples of ' + q.q as Question,
fInSetList as plantFeaturesReqd,
COLLECT(
p.name
)[0..qNum]  //** THIS LIMITS TO NUMBER REQUIRED PER QUESTION **//
as plantsFound
ORDER BY q.id
q.id    Question    plantFeaturesReqd   plantsFound
"1" "Give 1 examples of Evergreen Shrubs"   ["evergreen", "shrub"]  ["Rhododendron"]
"2" "Give 3 examples of Shrubs" ["shrub"]   ["Cornus", "Rhododendron", "Wych Hazel"]
"3" "Give 1 examples of Deciduous Shrubs"   ["deciduous", "shrub"]  ["Cornus"]
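
A possible explanation for the ordering problem: in `WITH *, count(p) as pCnt`, the `*` carries `p` and `q` through as grouping keys, so each group holds exactly one row and `pCnt` is 1 everywhere. The sketch below reproduces that effect with literal rows (no graph needed); the input pairs are made up for illustration.

```cypher
// Literal (question, plant) pairs standing in for the matched rows
UNWIND [
  {q: '1', p: 'Rhododendron'},
  {q: '2', p: 'Rhododendron'},
  {q: '2', p: 'Cornus'}
] AS row
WITH row.q AS q, row.p AS p
// Grouping keys here are q and p, so each group contains a single row
WITH *, count(p) AS pCnt
RETURN q, p, pCnt
ORDER BY q, p
```

Every row comes back with `pCnt` equal to 1. Grouping by the plant alone, for example `WITH p, count(*) AS w`, would instead give Rhododendron a count of 2 for this toy input.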