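Does ArangoDB have faceted search?

Time: 2014-03-13 11:49:47

Tags: arangodb

Does anyone know whether ArangoDB supports faceted search, and how its performance compares with other products that support it (e.g. Solr, MarkLogic) or that do not (e.g. Mongo)?

After searching the website, reading the documentation, and searching the Google group, I could not find it discussed anywhere.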

1 answer:

Answer 0: (score: 12)

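ArangoDB has a query language that supports group-by queries, and this lets you implement a faceted search. To make sure we have the same understanding of faceted search, let me explain what I think it means. Say you have a list of products. Each product has some attributes (e.g. name, model) and some categories (e.g. manufacturer). I can then search for a name, or for a word contained in a name. The result lists all matching products, together with an indication of how many of them fall into each category. Is that what you meant?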
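To make that definition concrete, here is a minimal plain-JavaScript sketch of what a faceted search computes, independent of ArangoDB. The sample products, the `facetedSearch` function, and its parameter names are all made up for illustration; it assumes a modern JavaScript runtime.

```javascript
// Hypothetical sample data: each product has a name and one category (manufacturer).
const products = [
  { name: "Laser Printer A", manufacturer: "Acme" },
  { name: "Laser Printer B", manufacturer: "Acme" },
  { name: "Inkjet Printer",  manufacturer: "Bolt" },
  { name: "Laser Pointer",   manufacturer: "Bolt" },
  { name: "Scanner",         manufacturer: "Core" }
];

// Return the products whose name contains `word`, plus the number of
// matches per value of `categoryField` (the facet counts).
function facetedSearch(items, word, categoryField) {
  const needle = word.toLowerCase();
  const hits = items.filter(p => p.name.toLowerCase().includes(needle));
  const facets = {};
  for (const p of hits) {
    const value = p[categoryField];
    facets[value] = (facets[value] || 0) + 1;
  }
  return { hits, facets };
}

const result = facetedSearch(products, "laser", "manufacturer");
console.log(result.hits.length, result.facets);
```

The search result and the per-category counts are derived from the same filtered set, which is the property the AQL queries in this answer rely on as well.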

As an example, assume you have documents with three attributes (name, attribute1, attribute2) and two categories (category1, category2):

> for (i = 0; i < 10000; i++) db.products.save({category1: i % 5, category2: i % 7, attribute1: i % 13, attribute2: i % 17, name: "Lore Ipsum " + i, productId: i})

A typical document then looks like this:

> db.products.any()
{
  "_id" : "products/8788564659",
  "_rev" : "8788564659",
  "_key" : "8788564659",
  "productId" : 9291,
  "category1" : 1,
  "category2" : 2,
  "attribute1" : 9,
  "attribute2" : 9,
  "name" : "Lore Ipsum 9291"
}

If you want to search for all documents whose attribute1 is between 2 and 3 (inclusive), you can use

> db._query("FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name LIMIT 3 RETURN p").toArray();
[
  {
    "_id" : "products/7159077555",
    "_rev" : "7159077555",
    "_key" : "7159077555",
    "productId" : 1003,
    "category1" : 3,
    "category2" : 2,
    "attribute1" : 2,
    "attribute2" : 0,
    "name" : "Lore Ipsum 1003"
  },
  {
    "_id" : "products/7159274163",
    "_rev" : "7159274163",
    "_key" : "7159274163",
    "productId" : 1004,
    "category1" : 4,
    "category2" : 3,
    "attribute1" : 3,
    "attribute2" : 1,
    "name" : "Lore Ipsum 1004"
  },
  {
    "_id" : "products/7161633459",
    "_rev" : "7161633459",
    "_key" : "7161633459",
    "productId" : 1016,
    "category1" : 1,
    "category2" : 1,
    "attribute1" : 2,
    "attribute2" : 13,
    "name" : "Lore Ipsum 1016"
  }
]

Or, if you are only interested in the product identifiers:

> db._query("FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name LIMIT 3 RETURN p.productId").toArray();
[
  1003,
  1004,
  1016
]

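Now, to get the facets for category1 as well, collect the filtered documents once and return both one page of results and the per-category counts:
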
>  db._query("LET l = (FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 SORT p.name RETURN p) return [ slice(l,@skip,@count), (FOR p in l collect c1 = p.category1 INTO g return { category1: c1, count: length(g[*].p)}) ]", { skip: 0, count: 3 }).toArray()
[
  [
    [
      {
        "_id" : "products/7159077555",
        "_rev" : "7159077555",
        "_key" : "7159077555",
        "productId" : 1003,
        "category1" : 3,
        "category2" : 2,
        "attribute1" : 2,
        "attribute2" : 0,
        "name" : "Lore Ipsum 1003"
      },
      {
        "_id" : "products/7159274163",
        "_rev" : "7159274163",
        "_key" : "7159274163",
        "productId" : 1004,
        "category1" : 4,
        "category2" : 3,
        "attribute1" : 3,
        "attribute2" : 1,
        "name" : "Lore Ipsum 1004"
      },
      {
        "_id" : "products/7161633459",
        "_rev" : "7161633459",
        "_key" : "7161633459",
        "productId" : 1016,
        "category1" : 1,
        "category2" : 1,
        "attribute1" : 2,
        "attribute2" : 13,
        "name" : "Lore Ipsum 1016"
      }
    ],
    [
      {
        "category1" : 0,
        "count" : 307
      },
      {
        "category1" : 1,
        "count" : 308
      },
      {
        "category1" : 2,
        "count" : 308
      },
      {
        "category1" : 3,
        "count" : 308
      },
      {
        "category1" : 4,
        "count" : 308
      }
    ]
  ]
]
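
As a cross-check, the category1 counts can be reproduced without a database. The following is a sketch in plain JavaScript (assuming a modern runtime) that rebuilds the test documents with the same formulas as the `save` loop and counts them in memory; `countBy` is a hypothetical helper name, not an ArangoDB function.

```javascript
// Rebuild the 10000 test documents in memory, using the same formulas as the save() loop.
const docs = [];
for (let i = 0; i < 10000; i++) {
  docs.push({
    category1: i % 5, category2: i % 7,
    attribute1: i % 13, attribute2: i % 17,
    name: "Lore Ipsum " + i, productId: i
  });
}

// Same condition as the AQL FILTER: attribute1 between 2 and 3 (inclusive).
const matches = docs.filter(p => p.attribute1 >= 2 && p.attribute1 <= 3);

// Count the items per value of `field`, ordered by that value.
function countBy(items, field) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item[field], (counts.get(item[field]) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([value, count]) => ({ [field]: value, count }));
}

const category1Facet = countBy(matches, "category1");
console.log(category1Facet);
```

This yields 307 matches for category1 0 and 308 for each of the values 1 to 4, the same numbers the query returned.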

To drill down into category1 (here, category1 == 1) and get the facets for category2:

>  db._query("LET l = (FOR p IN products FILTER p.attribute1 >= 2 && p.attribute1 <= 3 && p.category1 == 1 SORT p.name RETURN p) return [ slice(l,@skip,@count), (FOR p in l collect c2 = p.category2 INTO g return { category2: c2, count: length(g[*].p)}) ]", { skip: 0, count: 3 }).toArray()
[
  [
    [
      {
        "_id" : "products/7161633459",
        "_rev" : "7161633459",
        "_key" : "7161633459",
        "productId" : 1016,
        "category1" : 1,
        "category2" : 1,
        "attribute1" : 2,
        "attribute2" : 13,
        "name" : "Lore Ipsum 1016"
      },
      {
        "_id" : "products/7169497779",
        "_rev" : "7169497779",
        "_key" : "7169497779",
        "productId" : 1056,
        "category1" : 1,
        "category2" : 6,
        "attribute1" : 3,
        "attribute2" : 2,
        "name" : "Lore Ipsum 1056"
      },
      {
        "_id" : "products/6982720179",
        "_rev" : "6982720179",
        "_key" : "6982720179",
        "productId" : 106,
        "category1" : 1,
        "category2" : 1,
        "attribute1" : 2,
        "attribute2" : 4,
        "name" : "Lore Ipsum 106"
      }
    ],
    [
      {
        "category2" : 0,
        "count" : 44
      },
      {
        "category2" : 1,
        "count" : 44
      },
      {
        "category2" : 2,
        "count" : 44
      },
      {
        "category2" : 3,
        "count" : 44
      },
      {
        "category2" : 4,
        "count" : 44
      },
      {
        "category2" : 5,
        "count" : 44
      },
      {
        "category2" : 6,
        "count" : 44
      }
    ]
  ]
]

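To make the search string more user-friendly, you would need to write a few small helper functions in JavaScript. I think the support group https://groups.google.com/forum/#!forum/arangodb is the right place to discuss your requirements.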
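
One possible shape for such a helper, as a sketch only: `buildFacetQuery` and its parameters are hypothetical names, not part of ArangoDB. It assembles a query of the same form as the ones above and passes values as bind parameters; attribute and collection names are spliced into the query text, so it restricts them to plain identifiers. It assumes a shell or runtime with ES2015 syntax.

```javascript
// Hypothetical helper (not part of ArangoDB): builds the AQL string and bind
// parameters for a facet query shaped like the ones shown above.
//   ranges:   { attribute: [min, max] }  inclusive range filters
//   selected: { category: value }        categories already drilled into
//   facet:    the category to count
function buildFacetQuery(collection, ranges, selected, facet) {
  // Names end up in the query text, so only plain identifiers are accepted.
  const checkName = (s) => {
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(s)) throw new Error("invalid name: " + s);
    return s;
  };
  checkName(collection);
  checkName(facet);

  const bindVars = {};
  const conditions = [];
  Object.entries(ranges).forEach(([attr, [min, max]], i) => {
    checkName(attr);
    bindVars["min" + i] = min;
    bindVars["max" + i] = max;
    conditions.push(`p.${attr} >= @min${i} && p.${attr} <= @max${i}`);
  });
  Object.entries(selected).forEach(([cat, value], i) => {
    checkName(cat);
    bindVars["sel" + i] = value;
    conditions.push(`p.${cat} == @sel${i}`);
  });

  const filter = conditions.length ? "FILTER " + conditions.join(" && ") + " " : "";
  const query =
    `LET l = (FOR p IN ${collection} ${filter}SORT p.name RETURN p) ` +
    `RETURN [ SLICE(l, @skip, @count), ` +
    `(FOR p IN l COLLECT f = p.${facet} INTO g ` +
    `RETURN { ${facet}: f, count: LENGTH(g[*].p) }) ]`;
  return { query, bindVars };
}

// Rebuild the drill-down query: attribute1 in [2, 3], category1 == 1, facet on category2.
const q = buildFacetQuery("products", { attribute1: [2, 3] }, { category1: 1 }, "category2");
q.bindVars.skip = 0;
q.bindVars.count = 3;
console.log(q.query);
// In arangosh, the result could then be fetched with:
//   db._query(q.query, q.bindVars).toArray()
```

Keeping the values in bind parameters means user input never becomes part of the query text; only the validated names do.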