Question

Situation:

I have records in Aerospike with complex bins, such as: object_id, status, CREATE_TIME, end_at_time, status_client, assigned_to_id, created_by_id, is_s_provider, is_s_client, start_at_time, _id, END_TIME

I need to aggregate on any bin field. In SQL it would look something like:

select count(*) from table where status=13 and is_s_provider=True;

After some research, I wrote a Lua module that looks like this:

function count(stream, created_by, status, status_client, obj, client, provider, assigned_to, create_time, end_time, start_at_time, end_at_time)
    -- Build one predicate closure per argument ('*', '=N' or '!N').
    local created_by_f = created_by_filter(created_by)
    local status_f = status_filter(status)
    local status_client_f = status_client_filter(status_client)
    local obj_f = ojb_filter(obj)
    local client_f = client_filter(client)
    local provider_f = provider_filter(provider)
    local assigned_to_f = assigned_to_filter(assigned_to)
    local create_time_f = create_time_filter(create_time)
    local end_time_f = end_time_filter(end_time)
    local start_at_time_f = start_at_time_filter(start_at_time)
    local end_at_time_f = end_at_time_filter(end_at_time)

    -- Emit 1 for every record that survives the filters.
    local function mapper(rec)
        return 1
    end
    -- Sum the partial counts.
    local function reducer(v1, v2)
        return v1 + v2
    end
    return stream
        : filter(created_by_f)
        : filter(status_f)
        : filter(status_client_f)
        : filter(obj_f)
        : filter(client_f)
        : filter(provider_f)
        : filter(assigned_to_f)
        : filter(create_time_f)
        : filter(end_time_f)
        : filter(start_at_time_f)
        : filter(end_at_time_f)
        : map(mapper)
        : reduce(reducer)
end

And the filters (I have 11 of them) look like:

....
local function status_client_filter(status_client)
    -- First character is the operator ('=' or '!'); the rest is the value.
    local key = string.sub(status_client, 1, 1)
    local value = tonumber(string.sub(status_client, 2))
    return function(record)
        if status_client == '*' then
            -- Wildcard: accept every record.
            return true
        elseif key == '!' then
            return record['status_client'] ~= value
        elseif key == '=' then
            return record['status_client'] == value
        else
            -- Unknown operator: reject the record.
            return false
        end
    end
end
....

I created an index, and to check that it works I run this in aql:

aql> aggregate count.count('*','*','=13','*','*','*','*','*','*','*','*') on test.demo 
+-------+
| count |
+-------+
| 895   |
+-------+
1 row in set (0.219 secs)

aql>

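Everything works and I get what I want, except for one big problem: 0.219 seconds is a lot.

Question:

Is there a way to skip a filter when a condition is met? For example, if I call status_client_filter('*'), the stream filter function should not iterate over all the records, but pass them through before they ever reach the filter. That should improve performance a lot. Or is there another way to do dynamic filtering? Or another architecture for complex aggregations?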

Answer 1

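Your application can help here. The filter stage is not mandatory, so you can have a higher-level Lua function that has no filter stage. Your application can detect that a filter is * and call the Lua function without filters. Also, if there is no where clause at all, a scan-based aggregation is better than a secondary-index-based aggregation. A scan is much faster than a secondary index query.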
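
The same idea can also be applied inside the Lua module itself. The following is a minimal sketch, not the asker's actual code: make_filter and build_predicate are hypothetical helper names, and it assumes every filter uses the '*', '=N' and '!N' spec format from the question. Wildcard specs produce no predicate at all, so no filter stage has to run for them.

```lua
-- Hypothetical helper: parse a spec such as '*', '=13' or '!13' for one bin.
-- Returns nil for the wildcard so the caller can skip the stage entirely.
local function make_filter(bin, spec)
    if spec == '*' then
        return nil
    end
    local op = string.sub(spec, 1, 1)
    local value = tonumber(string.sub(spec, 2))
    if op == '=' then
        return function(rec) return rec[bin] == value end
    elseif op == '!' then
        return function(rec) return rec[bin] ~= value end
    end
    -- Unknown operator: reject every record, as the filters in the question do.
    return function() return false end
end

-- Hypothetical helper: combine {bin = spec} pairs into a single predicate.
-- Returns nil when every spec is '*', meaning no filter stage is needed.
local function build_predicate(specs)
    local active = {}
    for bin, spec in pairs(specs) do
        local f = make_filter(bin, spec)
        if f then
            active[#active + 1] = f
        end
    end
    if #active == 0 then
        return nil
    end
    return function(rec)
        for i = 1, #active do
            if not active[i](rec) then
                return false
            end
        end
        return true
    end
end

-- Assumed wiring inside the stream UDF (not runnable outside the server):
--   local pred = build_predicate({status = status, status_client = status_client})
--   if pred then stream = stream : filter(pred) end
--   return stream : map(mapper) : reduce(reducer)
```

With this shape, a call where all arguments are '*' goes straight to map and reduce, and any other call runs one filter stage instead of eleven.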

Answer 2

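Starting with release 3.12, you can filter before the UDF is invoked by using predicate filtering. Currently the Java, C, C#, Go and other clients can use this feature.

This allows lighter, faster UDFs that rely on the pre-filtering done by the predicate expressions.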
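
As a rough sketch of what such a lighter UDF could look like, assuming the client has already applied the predicate filter so that only matching records reach the stream: count_prefiltered, one and add are hypothetical names, not part of the original module.

```lua
-- Emit 1 for every record that reaches the UDF.
local function one(rec)
    return 1
end

-- Sum the partial counts.
local function add(a, b)
    return a + b
end

-- Hypothetical filter-less aggregation: all filtering is assumed to have
-- happened before the stream reaches this function.
function count_prefiltered(stream)
    return stream : map(one) : reduce(add)
end
```

The UDF no longer takes any filter arguments, so the eleven closures and their per-record checks disappear from the Lua side.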

Aerospike UDF Lua stream: skip filter if condition is met
