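Does anyone know whether ArangoDB supports faceted search, and how its performance compares with other products that support it (e.g. Solr, MarkLogic) or that do not (e.g. Mongo)?
After searching the website, reading the documentation, and searching the Google Group, I could not find it discussed anywhere.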
Answer 0 (score: 12)
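ArangoDB has a query language (AQL) that supports group-by queries, and this lets you implement a faceted search. To make sure we have the same understanding of faceted search, let me explain what I think it means. Say you have a list of products. Each product has some attributes (e.g. name, model) and some categories (e.g. manufacturer). I could then search for names, or for words contained in names. The result would list all matching products together with an indication of how many of them fall into each category. Is that what you mean?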
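In plain JavaScript, without any database, that idea might look roughly like the sketch below. The product names and manufacturers are made up for illustration, and `searchWithFacet` is a hypothetical helper, not an ArangoDB API.

```javascript
// Hypothetical sample data: names, models and manufacturers are invented.
const products = [
  { name: "Alpha Phone", model: "A1", manufacturer: "Acme" },
  { name: "Alpha Tablet", model: "A2", manufacturer: "Acme" },
  { name: "Beta Phone", model: "B1", manufacturer: "Globex" },
  { name: "Gamma Phone", model: "G1", manufacturer: "Acme" },
];

// Return the products whose name contains `word` (case-insensitive),
// plus the number of hits per manufacturer (the "facet").
function searchWithFacet(items, word) {
  const needle = word.toLowerCase();
  const hits = items.filter((p) => p.name.toLowerCase().includes(needle));
  const facet = {};
  for (const p of hits) {
    facet[p.manufacturer] = (facet[p.manufacturer] || 0) + 1;
  }
  return { hits, facet };
}

const result = searchWithFacet(products, "phone");
console.log(result.hits.map((p) => p.name)); // [ 'Alpha Phone', 'Beta Phone', 'Gamma Phone' ]
console.log(result.facet);                   // { Acme: 2, Globex: 1 }
```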
So, as an example: assume you have documents with three attributes (name, attribute1, attribute2) and two categories (category1, category2):
> for (i = 0; i < 10000; i++) db.products.save({category1: i % 5, category2: i % 7, attribute1: i % 13, attribute2: i % 17, name: "Lore Ipsum " + i, productId: i})
A typical document then looks like this:
> db.products.any()
{
"_id" : "products/8788564659",
"_rev" : "8788564659",
"_key" : "8788564659",
"productId" : 9291,
"category1" : 1,
"category2" : 2,
"attribute1" : 9,
"attribute2" : 9,
"name" : "Lore Ipsum 9291"
}
If you want to search for all documents where attribute1 is between 2 and 3 (inclusive), you can use
> db._query("FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name LIMIT 3 RETURN p").toArray();
[
{
"_id" : "products/7159077555",
"_rev" : "7159077555",
"_key" : "7159077555",
"productId" : 1003,
"category1" : 3,
"category2" : 2,
"attribute1" : 2,
"attribute2" : 0,
"name" : "Lore Ipsum 1003"
},
{
"_id" : "products/7159274163",
"_rev" : "7159274163",
"_key" : "7159274163",
"productId" : 1004,
"category1" : 4,
"category2" : 3,
"attribute1" : 3,
"attribute2" : 1,
"name" : "Lore Ipsum 1004"
},
{
"_id" : "products/7161633459",
"_rev" : "7161633459",
"_key" : "7161633459",
"productId" : 1016,
"category1" : 1,
"category2" : 1,
"attribute1" : 2,
"attribute2" : 13,
"name" : "Lore Ipsum 1016"
}
]
or, if you are only interested in the product identifiers,
> db._query("FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name LIMIT 3 RETURN p.productId").toArray();
[
1003,
1004,
1016
]
Now, to get the facets for category1:
> db._query("LET l = (FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name RETURN p) return [ slice(l,@skip,@count), (FOR p in l collect c1 = p.category1 INTO g return { category1: c1, count: length(g[*].p)}) ]", { skip: 0, count: 3 }).toArray()
[
[
[
{
"_id" : "products/7159077555",
"_rev" : "7159077555",
"_key" : "7159077555",
"productId" : 1003,
"category1" : 3,
"category2" : 2,
"attribute1" : 2,
"attribute2" : 0,
"name" : "Lore Ipsum 1003"
},
{
"_id" : "products/7159274163",
"_rev" : "7159274163",
"_key" : "7159274163",
"productId" : 1004,
"category1" : 4,
"category2" : 3,
"attribute1" : 3,
"attribute2" : 1,
"name" : "Lore Ipsum 1004"
},
{
"_id" : "products/7161633459",
"_rev" : "7161633459",
"_key" : "7161633459",
"productId" : 1016,
"category1" : 1,
"category2" : 1,
"attribute1" : 2,
"attribute2" : 13,
"name" : "Lore Ipsum 1016"
}
],
[
{
"category1" : 0,
"count" : 307
},
{
"category1" : 1,
"count" : 308
},
{
"category1" : 2,
"count" : 308
},
{
"category1" : 3,
"count" : 308
},
{
"category1" : 4,
"count" : 308
}
]
]
]
To drill down into category1 (here category1 == 1) and facet on category2:
> db._query("LET l = (FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 && p.category1 == 1 SORT p.name RETURN p) return [ slice(l,@skip,@count), (FOR p in l collect c2 = p.category2 INTO g return { category2: c2, count: length(g[*].p)}) ]", { skip: 0, count: 3 }).toArray()
[
[
[
{
"_id" : "products/7161633459",
"_rev" : "7161633459",
"_key" : "7161633459",
"productId" : 1016,
"category1" : 1,
"category2" : 1,
"attribute1" : 2,
"attribute2" : 13,
"name" : "Lore Ipsum 1016"
},
{
"_id" : "products/7169497779",
"_rev" : "7169497779",
"_key" : "7169497779",
"productId" : 1056,
"category1" : 1,
"category2" : 6,
"attribute1" : 3,
"attribute2" : 2,
"name" : "Lore Ipsum 1056"
},
{
"_id" : "products/6982720179",
"_rev" : "6982720179",
"_key" : "6982720179",
"productId" : 106,
"category1" : 1,
"category2" : 1,
"attribute1" : 2,
"attribute2" : 4,
"name" : "Lore Ipsum 106"
}
],
[
{
"category2" : 0,
"count" : 44
},
{
"category2" : 1,
"count" : 44
},
{
"category2" : 2,
"count" : 44
},
{
"category2" : 3,
"count" : 44
},
{
"category2" : 4,
"count" : 44
},
{
"category2" : 5,
"count" : 44
},
{
"category2" : 6,
"count" : 44
}
]
]
]
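Since the test data is generated by a simple formula, the two facet results above can be cross-checked without a server. The following is only a sketch in plain JavaScript: `facetSearch` is a hypothetical helper that mimics what the two AQL queries compute, and it sorts names with plain JavaScript string comparison, which is an assumption (ArangoDB uses its own collation, although for these names the order comes out the same).

```javascript
// Rebuild the 10000 synthetic documents in memory (same formula as the save() loop).
const docs = [];
for (let i = 0; i < 10000; i++) {
  docs.push({
    category1: i % 5, category2: i % 7,
    attribute1: i % 13, attribute2: i % 17,
    name: "Lore Ipsum " + i, productId: i,
  });
}

// Filter, sort by name, then return one page of hits plus counts per facet value.
function facetSearch(items, predicate, facetAttr, skip, count) {
  const matches = items
    .filter(predicate)
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  const counts = new Map();
  for (const p of matches) {
    counts.set(p[facetAttr], (counts.get(p[facetAttr]) || 0) + 1);
  }
  // Order the facet entries by facet value, like the grouped AQL result.
  const facet = [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([value, n]) => ({ [facetAttr]: value, count: n }));
  return [matches.slice(skip, skip + count), facet];
}

const inRange = (p) => p.attribute1 >= 2 && p.attribute1 <= 3;

// Facet on category1 for the range filter.
const [page1, byCategory1] = facetSearch(docs, inRange, "category1", 0, 3);

// Drill down into category1 == 1 and facet on category2.
const [page2, byCategory2] = facetSearch(
  docs, (p) => inRange(p) && p.category1 === 1, "category2", 0, 3);

console.log(page1.map((p) => p.productId), byCategory1);
console.log(page2.map((p) => p.productId), byCategory2);
```

The first page of hits and the per-category counts come out the same as in the arangosh output above (307/308 per category1 value, 44 per category2 value after the drill-down).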
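To make the search strings more user-friendly, you would need to write a few small helper functions in JavaScript. I think the support group https://groups.google.com/forum/#!forum/arangodb is the right place to discuss your requirements.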
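One possible shape for such a helper is sketched below. `buildFacetQuery` and its option names are hypothetical, not part of ArangoDB; the function only assembles an AQL string and a bind-variable object in the same pattern as the queries above, using `@@collection` for the collection and `p[@name]` for attribute names. Note that it returns facet entries as `{ value, count }` rather than naming the key after the category.

```javascript
// Build the AQL text and bind variables for a range filter, optional
// drill-down selections, and one facet attribute.
// Hypothetical helper: the name and option layout are assumptions.
function buildFacetQuery({ collection, range, drillDown = {}, facet, skip = 0, count = 10 }) {
  const bindVars = {
    "@collection": collection, // bound as @@collection in the query
    rangeAttr: range.attribute,
    min: range.min,
    max: range.max,
    facetAttr: facet,
    skip,
    count,
  };
  const filters = ["p[@rangeAttr] >= @min", "p[@rangeAttr] <= @max"];

  // Each drill-down selection becomes an extra equality filter.
  Object.keys(drillDown).forEach((attr, i) => {
    bindVars["ddAttr" + i] = attr;
    bindVars["ddValue" + i] = drillDown[attr];
    filters.push("p[@ddAttr" + i + "] == @ddValue" + i);
  });

  const query =
    "LET l = (FOR p IN @@collection FILTER " + filters.join(" && ") +
    " SORT p.name RETURN p) " +
    "RETURN [ SLICE(l, @skip, @count), " +
    "(FOR p IN l COLLECT v = p[@facetAttr] INTO g " +
    "RETURN { value: v, count: LENGTH(g[*].p) }) ]";
  return { query, bindVars };
}

const q = buildFacetQuery({
  collection: "products",
  range: { attribute: "attribute1", min: 2, max: 3 },
  drillDown: { category1: 1 },
  facet: "category2",
  skip: 0,
  count: 3,
});
console.log(q.query);
console.log(q.bindVars);
// In arangosh this would be run as: db._query(q.query, q.bindVars).toArray()
```

Passing every value through bind variables instead of concatenating it into the query text keeps user input out of the AQL string itself.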