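Our data is stored in MongoDB 2.4.8 and indexed into ElasticSearch 0.90.7 using ElasticSearch MongoDB River 1.7.3.
The data is indexed correctly, and I can successfully search the fields we want to search. But I also need to filter on permissions: we only want to return results that the calling user is actually allowed to read.
In the code on our server, I have the calling user's authorizations as an array, for example: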
[ "Role:REGISTERED_USER", "Account:52c74b25da06f102c90d52f4", "Role:USER", "Group:52cb057cda06ca463e78f0d7" ]
Here is an example of the unit data we are searching:
{
  "_id" : ObjectId("52dffbd6da06422559386f7d"),
  "content" : "various stuff",
  "ownerId" : ObjectId("52d96bfada0695fcbdb41daf"),
  "acls" : [
    {
      "accessMap" : {},
      "sourceClass" : "com.bulb.learn.domain.units.PublishedPageUnit",
      "sourceId" : ObjectId("52dffbd6da06422559386f7d")
    },
    {
      "accessMap" : {
        "Role:USER" : {
          "allow" : [
            "READ"
          ]
        },
        "Account:52d96bfada0695fcbdb41daf" : {
          "allow" : [
            "CREATE",
            "READ",
            "UPDATE",
            "DELETE",
            "GRANT"
          ]
        }
      },
      "sourceClass" : "com.bulb.learn.domain.units.CompositeUnit",
      "sourceId" : ObjectId("52dffb54da06422559386f57")
    }
  ]
}
In the sample data above, I have replaced all of the searchable content with
"content" : "various stuff"
The authorization data is in the "acls" array. The filter I need to write would do the following (in plain English):
pass all units where the "acls" array
contains an "accessMap" object
that contains a property whose name is one of the user's authorization strings
and whose "allow" property contains "READ"
and whose "deny" property does not contain "READ"
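In the example above, the user has the "Role:USER" authorization, and this unit has an accessMap with a "Role:USER" entry whose "allow" contains "READ" and which has no "deny" containing "READ". So this unit would pass the filter.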
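To pin down the rule before turning it into a filter, here is a minimal Python sketch of it as an in-memory check. The function name `can_read` is made up for illustration, and the unit is reduced to the fields the rule looks at; it is only a reference for what the filter is expected to return, not a way to run the search.

```python
def can_read(unit, authorizations):
    """Return True if some accessMap entry named by one of the caller's
    authorization strings allows READ and does not deny it."""
    granted = set(authorizations)
    for acl in unit.get("acls", []):
        for subject, entry in acl.get("accessMap", {}).items():
            if subject not in granted:
                continue
            allows = "READ" in entry.get("allow", [])
            denies = "READ" in entry.get("deny", [])
            if allows and not denies:
                return True
    return False


# The sample unit, reduced to its authorization data.
unit = {
    "acls": [
        {"accessMap": {}},
        {
            "accessMap": {
                "Role:USER": {"allow": ["READ"]},
                "Account:52d96bfada0695fcbdb41daf": {
                    "allow": ["CREATE", "READ", "UPDATE", "DELETE", "GRANT"]
                },
            }
        },
    ]
}

user_authorizations = [
    "Role:REGISTERED_USER",
    "Account:52c74b25da06f102c90d52f4",
    "Role:USER",
    "Group:52cb057cda06ca463e78f0d7",
]

print(can_read(unit, user_authorizations))  # True
```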
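I don't see how to write a filter for this with ElasticSearch.
My impression is that there are two ways to deal with nested arrays: "nested" or "has_child" (or "has_parent").
We are reluctant to use the "nested" filter, because it apparently requires the whole block to be reindexed whenever any of the data changes. Both the searchable content and the authorization data can change at any time in response to user actions.
It seems to me that in order to use "has_child" or "has_parent", the authorization data would have to be separated from the unit data (in a different collection?), and when a node is indexed it would have to specify its parent or child. I don't know whether ElasticSearch MongoDB River is capable of that.
Is this even possible? Or should we rearrange the authorization data?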
Answer 0: (score: 9)
You will need to restructure your data.
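Using values as keys is problematic with Elasticsearch. Each key ends up as a separate field, so you will have an ever-growing mapping, and therefore an ever-growing cluster state.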
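A rough way to see the growth is to count the distinct field names each layout produces. The Python sketch below is a simplified model of dynamic mapping (one field per leaf path), not Elasticsearch itself, and the account IDs and the helper name `field_paths` are hypothetical.

```python
def field_paths(value, prefix=""):
    """Collect the dotted leaf field names found in a JSON-like document."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            paths |= field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for item in value:
            paths |= field_paths(item, prefix)
    else:
        paths.add(prefix)
    return paths


# Hypothetical account IDs, used only for illustration.
accounts = ["Account:a1", "Account:b2", "Account:c3"]

# Subject strings as keys: every new subject adds a new field name.
keyed = [{"acls": [{"accessMap": {acct: {"allow": ["READ"]}}}]} for acct in accounts]
# Subject strings as values: the field names stay the same.
valued = [{"acls": [{"accessMap": [{"key": acct, "allow": ["READ"]}]}]} for acct in accounts]

keyed_paths = set().union(*(field_paths(doc) for doc in keyed))
valued_paths = set().union(*(field_paths(doc) for doc in valued))

print(len(keyed_paths))      # 3
print(sorted(valued_paths))  # ['acls.accessMap.allow', 'acls.accessMap.key']
```

With keys, the number of fields follows the number of distinct subjects; with values, it stays at two no matter how many accounts exist.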
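You probably want accessMap to be a list of objects, with what is currently the key stored as a value. It then has to be nested. Otherwise, you cannot know which accessMap entry a matching allow belongs to.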
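The two points can be sketched in Python. `access_map_as_list` does the key-to-value conversion, and `flatten` is a simplified model of how a plain (non-nested) object array is indexed, where each field becomes one multi-valued field. Both helper names are made up, and the permissions in the example are altered from the sample unit to make the cross-match visible.

```python
def access_map_as_list(access_map):
    """Turn {"Role:USER": {"allow": [...]}} into [{"allow": [...], "key": "Role:USER"}]."""
    return [dict(entry, key=subject) for subject, entry in access_map.items()]


def flatten(entries):
    """Model a non-nested object array: values are pooled per field name,
    so the pairing between key and allow/deny is lost."""
    flat = {}
    for entry in entries:
        for field, value in entry.items():
            values = value if isinstance(value, list) else [value]
            flat.setdefault(field, []).extend(values)
    return flat


# Made-up permissions: Role:USER may only UPDATE, the account may READ.
access_map = {
    "Role:USER": {"allow": ["UPDATE"]},
    "Account:52d96bfada0695fcbdb41daf": {"allow": ["READ"]},
}
entries = access_map_as_list(access_map)
flat = flatten(entries)

# Plain object array: "Role:USER may READ" appears to match, which is wrong.
flat_match = "Role:USER" in flat["key"] and "READ" in flat["allow"]
# Nested: both conditions must hold within the same entry.
nested_match = any(
    e["key"] == "Role:USER" and "READ" in e.get("allow", []) for e in entries
)
print(flat_match, nested_match)  # True False
```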
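Whether the ACLs should be nested (resulting in two levels of nesting) or parent-child depends on how often the various objects are updated. If you keep them as nested documents of the object, you pay the cost of the join on every update. If you go with parent-child, you pay the cost of the join on every search.
This gets complicated quickly, so I have prepared a simplified, runnable example you can play with: https://www.found.no/play/gist/8582654
Note how the nested and bool filters are nested. Putting the two nested filters in a single bool will not work.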
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
  "settings": {
    "analysis": {}
  },
  "mappings": {
    "type": {
      "properties": {
        "acls": {
          "type": "nested",
          "properties": {
            "accessMap": {
              "type": "nested",
              "properties": {
                "allow": {
                  "type": "string",
                  "index": "not_analyzed"
                },
                "deny": {
                  "type": "string",
                  "index": "not_analyzed"
                },
                "key": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            }
          }
        }
      }
    }
  }
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type","_id":1}}
{"acls":[{"accessMap":[{"key":"Role:USER","allow":["READ"]},{"key":"Account:52d96bfada0695fcbdb41daf","allow":["READ","UPDATE"]}]}]}
{"index":{"_index":"play","_type":"type","_id":2}}
{"acls":[{"accessMap":[{"key":"Role:USER","allow":["READ"]},{"key":"Account:52d96bfada0695fcbdb41daf","deny":["READ","UPDATE"]}]}]}
{"index":{"_index":"play","_type":"type","_id":3}}
{"acls":[{"accessMap":[{"key":"Role:USER","allow":["READ"]},{"key":"Account:52d96bfada0695fcbdb41daf","allow":["READ","UPDATE"]}]}]}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
  "query": {
    "filtered": {
      "filter": {
        "nested": {
          "path": "acls",
          "filter": {
            "bool": {
              "must": {
                "nested": {
                  "path": "acls.accessMap",
                  "filter": {
                    "bool": {
                      "must": [
                        {
                          "term": {
                            "allow": "READ"
                          }
                        },
                        {
                          "terms": {
                            "key": [
                              "Role:USER",
                              "Account:52d96bfada0695fcbdb41daf"
                            ]
                          }
                        }
                      ]
                    }
                  }
                }
              },
              "must_not": {
                "nested": {
                  "path": "acls.accessMap",
                  "filter": {
                    "bool": {
                      "must": [
                        {
                          "term": {
                            "deny": "READ"
                          }
                        },
                        {
                          "terms": {
                            "key": [
                              "Role:USER",
                              "Account:52d96bfada0695fcbdb41daf"
                            ]
                          }
                        }
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
'
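In practice the list of keys comes from the calling user's authorizations, so the search body would be generated rather than written by hand. The following Python sketch builds the same structure as the search above; the function names `subject_filter` and `build_acl_query` are invented for illustration, and sending the body to Elasticsearch is left out.

```python
import json


def subject_filter(authorizations, field, permission):
    """Nested filter: some accessMap entry has one of the given keys
    and lists `permission` under `field` ("allow" or "deny")."""
    return {
        "nested": {
            "path": "acls.accessMap",
            "filter": {
                "bool": {
                    "must": [
                        {"term": {field: permission}},
                        {"terms": {"key": list(authorizations)}},
                    ]
                }
            },
        }
    }


def build_acl_query(authorizations, permission="READ"):
    """Wrap the allow and deny checks in the outer nested filter on "acls"."""
    return {
        "query": {
            "filtered": {
                "filter": {
                    "nested": {
                        "path": "acls",
                        "filter": {
                            "bool": {
                                "must": subject_filter(authorizations, "allow", permission),
                                "must_not": subject_filter(authorizations, "deny", permission),
                            }
                        },
                    }
                }
            }
        }
    }


user_authorizations = ["Role:USER", "Account:52d96bfada0695fcbdb41daf"]
body = build_acl_query(user_authorizations)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the search request in place of the hand-written body.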
For completeness, here is a similar example of the parent-child approach: https://www.found.no/play/gist/8586840
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
  "settings": {
    "analysis": {}
  },
  "mappings": {
    "acl": {
      "_parent": {
        "type": "document"
      },
      "properties": {
        "acls": {
          "properties": {
            "accessMap": {
              "type": "nested",
              "properties": {
                "key": {
                  "type": "string",
                  "index": "not_analyzed"
                },
                "allow": {
                  "type": "string",
                  "index": "not_analyzed"
                },
                "deny": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            }
          }
        }
      }
    }
  }
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"document","_id":1}}
{"title":"Doc 1"}
{"index":{"_index":"play","_type":"acl","_parent":1}}
{"acls":[{"accessMap":[{"key":"Role:USER","allow":["READ"]},{"key":"Account:52d96bfada0695fcbdb41daf","allow":["READ","UPDATE"]}]}]}
{"index":{"_index":"play","_type":"document","_id":2}}
{"title":"Doc 2"}
{"index":{"_index":"play","_type":"acl","_parent":2}}
{"acls":[{"accessMap":[{"key":"Role:USER","allow":["READ"]},{"key":"Account:52d96bfada0695fcbdb41daf","deny":["READ","UPDATE"]}]}]}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
  "query": {
    "filtered": {
      "filter": {
        "has_child": {
          "type": "acl",
          "filter": {
            "bool": {
              "must": [
                {
                  "nested": {
                    "path": "acls.accessMap",
                    "filter": {
                      "bool": {
                        "must": [
                          {
                            "terms": {
                              "key": [
                                "Role:USER",
                                "Account:52d96bfada0695fcbdb41daf"
                              ]
                            }
                          },
                          {
                            "term": {
                              "allow": "READ"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              ],
              "must_not": [
                {
                  "nested": {
                    "path": "acls.accessMap",
                    "filter": {
                      "bool": {
                        "must": [
                          {
                            "terms": {
                              "key": [
                                "Role:USER",
                                "Account:52d96bfada0695fcbdb41daf"
                              ]
                            }
                          },
                          {
                            "term": {
                              "deny": "READ"
                            }
                          }
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}
'
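To check what the has_child search should return for the two sample documents, the logic can be restated in plain Python. This is an in-memory model only, under the assumption that each document has exactly one acl child as in the sample; the helper names are made up.

```python
# The parent documents and their acl children from the bulk request.
documents = {1: {"title": "Doc 1"}, 2: {"title": "Doc 2"}}
acl_children = [
    {"_parent": 1, "acls": [{"accessMap": [
        {"key": "Role:USER", "allow": ["READ"]},
        {"key": "Account:52d96bfada0695fcbdb41daf", "allow": ["READ", "UPDATE"]},
    ]}]},
    {"_parent": 2, "acls": [{"accessMap": [
        {"key": "Role:USER", "allow": ["READ"]},
        {"key": "Account:52d96bfada0695fcbdb41daf", "deny": ["READ", "UPDATE"]},
    ]}]},
]


def any_entry(child, keys, field, permission):
    """True if any accessMap entry of the child has one of the keys
    and lists the permission under the given field."""
    return any(
        entry["key"] in keys and permission in entry.get(field, [])
        for acl in child["acls"]
        for entry in acl["accessMap"]
    )


def readable_titles(keys, permission="READ"):
    """Titles of parents whose acl child passes must (allow) and must_not (deny)."""
    titles = []
    for child in acl_children:
        allowed = any_entry(child, keys, "allow", permission)
        denied = any_entry(child, keys, "deny", permission)
        if allowed and not denied:
            titles.append(documents[child["_parent"]]["title"])
    return titles


keys = {"Role:USER", "Account:52d96bfada0695fcbdb41daf"}
print(readable_titles(keys))  # ['Doc 1']
```

Doc 2 is excluded because the account entry denies READ, even though the role entry allows it: the must_not clause looks at any entry whose key is in the list, not only the one that granted access.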
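Answer 1: (score: -5)
Thanks @Alex Brasetvik. Your suggestion to make the subject IDs data rather than keys, and your explanation that nested means "join on every update" while parent-child means "join on every query", were most helpful.
I found that I would have to "un-nest" the data in order to use the parent-child approach, and we would prefer to keep the authorization data nested.
I don't understand what you meant by "putting the two nested filters in a single bool will not work."
Here is how I restructured the data: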
{
  "_id" : ObjectId("52dffbd6da06422559386f7d"),
  "content" : "various stuff",
  "ownerId" : ObjectId("52d96bfada0695fcbdb41daf"),
  "accessMaps" : [
    {
      "sourceClass" : "com.bulb.learn.domain.units.PublishedPageUnit",
      "sourceId" : ObjectId("52dffbd6da06422559386f7d")
    },
    {
      "allow" : {
        "CREATE" : [
          "Account:52d96bfada0695fcbdb41daf"
        ],
        "READ" : [
          "Account:52d96bfada0695fcbdb41daf",
          "Role:USER"
        ],
        "UPDATE" : [
          "Account:52d96bfada0695fcbdb41daf"
        ],
        "DELETE" : [
          "Account:52d96bfada0695fcbdb41daf"
        ],
        "GRANT" : [
          "Account:52d96bfada0695fcbdb41daf"
        ]
      },
      "deny" : {},
      "sourceClass" : "com.bulb.learn.domain.units.CompositeUnit",
      "sourceId" : ObjectId("52dffb54da06422559386f57")
    }
  ]
}
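The regrouping from the original per-subject accessMap to this per-permission layout can be sketched in Python as follows. The function name `invert_access_map` is made up, and this version always emits both "allow" and "deny", which differs slightly from the first entry in the sample above.

```python
def invert_access_map(access_map):
    """Regroup {subject: {"allow": [...], "deny": [...]}} as
    {"allow": {permission: [subjects]}, "deny": {permission: [subjects]}}."""
    inverted = {"allow": {}, "deny": {}}
    for subject, entry in access_map.items():
        for effect in ("allow", "deny"):
            for permission in entry.get(effect, []):
                inverted[effect].setdefault(permission, []).append(subject)
    return inverted


# The accessMap of the CompositeUnit entry from the original unit.
original = {
    "Role:USER": {"allow": ["READ"]},
    "Account:52d96bfada0695fcbdb41daf": {
        "allow": ["CREATE", "READ", "UPDATE", "DELETE", "GRANT"]
    },
}

restructured = invert_access_map(original)
print(sorted(restructured["allow"]["READ"]))
# ['Account:52d96bfada0695fcbdb41daf', 'Role:USER']
```

Because the permission names form a small fixed set, the keys in this layout no longer grow with the number of accounts, roles, or groups.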
The new mapping looks like this:
{
  "unit": {
    "properties": {
      "accessMaps": {
        "type": "nested",
        "properties": {
          "allow": {
            "type": "nested",
            "properties": {
              "CREATE": {
                "type": "string",
                "index": "not_analyzed"
              },
              "DELETE": {
                "type": "string",
                "index": "not_analyzed"
              },
              "GRANT": {
                "type": "string",
                "index": "not_analyzed"
              },
              "READ": {
                "type": "string",
                "index": "not_analyzed"
              },
              "UPDATE": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "deny": {
            "type": "nested",
            "properties": {
              "CREATE": {
                "type": "string",
                "index": "not_analyzed"
              },
              "DELETE": {
                "type": "string",
                "index": "not_analyzed"
              },
              "GRANT": {
                "type": "string",
                "index": "not_analyzed"
              },
              "READ": {
                "type": "string",
                "index": "not_analyzed"
              },
              "UPDATE": {
                "type": "string",
                "index": "not_analyzed"
              }
            }
          },
          "sourceClass": {
            "type": "string"
          },
          "sourceId": {
            "type": "string"
          }
        }
      }
    }
  }
}
The filtered query looks like this:
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "nested": {
              "path": "accessMaps.allow",
              "filter": {
                "terms": {
                  "accessMaps.allow.READ": [
                    "Role:REGISTERED_USER",
                    "Account:52e6a361da06e4eb64172519",
                    "Role:USER",
                    "Group:52cb057cda06ca463e78f0d7"
                  ]
                }
              }
            }
          },
          "must_not": {
            "nested": {
              "path": "accessMaps.deny",
              "filter": {
                "terms": {
                  "accessMaps.deny.READ": [
                    "Role:REGISTERED_USER",
                    "Account:52e6a361da06e4eb64172519",
                    "Role:USER",
                    "Group:52cb057cda06ca463e78f0d7"
                  ]
                }
              }
            }
          }
        }
      }
    }
  }
}
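Since the nested path and the field name in the terms filter have to agree, one option is to derive both from the same string when generating the query. A minimal Python sketch, with invented function names, that reproduces the structure above for any authorization list:

```python
def permission_filter(authorizations, effect, permission):
    """Nested filter on accessMaps.<effect> using the fully qualified field name."""
    path = f"accessMaps.{effect}"
    return {
        "nested": {
            "path": path,
            "filter": {"terms": {f"{path}.{permission}": list(authorizations)}},
        }
    }


def build_query(authorizations, permission="READ"):
    """match_all query filtered by the allow (must) and deny (must_not) checks."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "bool": {
                        "must": permission_filter(authorizations, "allow", permission),
                        "must_not": permission_filter(authorizations, "deny", permission),
                    }
                },
            }
        }
    }


user_authorizations = [
    "Role:REGISTERED_USER",
    "Account:52e6a361da06e4eb64172519",
    "Role:USER",
    "Group:52cb057cda06ca463e78f0d7",
]
query = build_query(user_authorizations)
print(query["query"]["filtered"]["filter"]["bool"]["must"]["nested"]["path"])
# accessMaps.allow
```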
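The biggest problems I ran into were working out how to use the "path" property in the nested filter, and that the field names in the terms filter must be fully qualified. I wish ElasticSearch would put more effort into their documentation.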