How to combine text search with other criteria using Redis?

Asked: 2016-07-06 18:23:12

Tags: lua redis

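I have successfully written an intersection of a text search with other criteria using Redis. To achieve this I am using a Lua script. The problem is that the script not only reads values but also writes them. Since Redis 3.2 this is possible by calling redis.replicate_commands(), but not in versions before 3.2.

Below is how I am storing the values.

Names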

> HSET product:name 'Cool product' 1
> HSET product:name 'Nice product' 2

Prices

> ZADD product:price 49.90 1
> ZADD product:price 54.90 2

Then, to get all products whose name matches 'ice', I call:

> HSCAN product:name 0 MATCH *ice*

However, since HSCAN uses a cursor, I have to call it multiple times to fetch all the results. This is where I use a Lua script:

local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'

repeat
    local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
    cursor = tonumber(result[1])
    fields = result[2]
    for i, id in ipairs(fields) do
        if i % 2 == 0 then
            ids[#ids + 1] = id
        end
    end
until cursor == 0
return ids

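It is not possible to use the result of a script in another call, as in SADD key EVAL(SHA) ..., and it is not possible to use global variables inside scripts either. So I changed the part inside the loop over fields so that the list of IDs becomes accessible outside the script: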

if i % 2 == 0 then
    ids[#ids + 1] = id
    redis.call('SADD', KEYS[1], id)
end

I had to add redis.replicate_commands() as the first line. With this change I can get all the IDs from the key I pass when calling the script (see KEYS[1]).
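
For reference, here is a sketch of the whole script with both changes applied. It only combines the fragments shown above; the comments are added for illustration, and it assumes Redis 3.2 or later, where redis.replicate_commands() is available.

```lua
-- Sketch: requires Redis 3.2+ (script effects replication).
-- KEYS[1]: set that receives the matching IDs (e.g. tmp:name)
-- ARGV[1]: substring to look for in the product name
redis.replicate_commands()

local cursor = 0
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'

repeat
    local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
    cursor = tonumber(result[1])
    local fields = result[2]
    -- HSCAN returns a flat list: field, value, field, value, ...
    -- The values (even positions) are the product IDs.
    for i, id in ipairs(fields) do
        if i % 2 == 0 then
            ids[#ids + 1] = id
            redis.call('SADD', KEYS[1], id)
        end
    end
until cursor == 0
return ids
```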

Finally, to get a list of 100 product IDs whose price is between 40 and 50 and whose name contains "ice", I do the following:

> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREMRANGEBYSCORE tmp:price 0 40
> ZREMRANGEBYSCORE tmp:price 50 +INF
> EVALSHA b81c2b... 1 tmp:name ice
> ZINTERSTORE tmp:result 2 tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 99

I use the ZCOUNT call to know in advance how many pages of results I have (count / 100, rounded up).

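As I said before, this works nicely with Redis 3.2. But when I tried to run the code on AWS, which only supports Redis up to 2.8, I could not get it to work anymore. I am not sure how to iterate with the HSCAN cursor without using a script, or without writing from the script. Is there a way to make this work on Redis 2.8?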

Some considerations:

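  1. I know I can do part of the processing outside Redis (such as iterating the cursor or intersecting the matches), but that would affect the overall performance of the application.
  2. I do not want to deploy a Redis instance myself just to use version 3.2.
  3. The criteria above (price range and name) are just a simple example. I have other fields and match types, not only those.
  4. I am not sure whether the way I am storing the data is the best way. I am open to suggestions about it.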

2 answers:

Answer 0: (score: 2)

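The only problem I see here is storing the values from inside the Lua script. So instead of storing them in Lua, take the values outside of Lua (return them as a string[]). Store them in a set in a separate call using sadd(key, members[]). Then proceed with the intersection and return the results.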

> ZUNIONSTORE tmp:price 1 product:price WEIGHTS 1
> ZREMRANGEBYSCORE tmp:price 0 40
> ZREMRANGEBYSCORE tmp:price 50 +INF
> nameSet[] = EVALSHA b81c2b... 0 ice
> SADD tmp:name nameSet
> ZINTERSTORE tmp:result 2 tmp:price tmp:name
> ZCOUNT tmp:result -INF +INF
> ZRANGE tmp:result 0 99

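IMO your design is the most optimal one. One suggestion is to use pipelining wherever possible, since it sends all the commands at once.

Hope this helps.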

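Update: There is no such thing as an array ([]) in Lua; you have to use a Lua table for that. Your script correctly returns ids, which is itself an array-like table, so the client receives it as an array and can pass it to SADD in a separate call.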

String[] nameSet = (String[]) evalsha b81c2b... 0 ice -> This is in Java (pseudocode)
SADD tmp:name nameSet

The corresponding Lua script is the same as your first one:

local cursor = 0
local fields = {}
local ids = {}
local key = 'product:name'
local value = '*' .. ARGV[1] .. '*'

repeat
    local result = redis.call('HSCAN', key, cursor, 'MATCH', value)
    cursor = tonumber(result[1])
    fields = result[2]
    for i, id in ipairs(fields) do
        if i % 2 == 0 then
            ids[#ids + 1] = id
        end
    end
until cursor == 0
return ids

Answer 1: (score: 1)

The problem isn't that you're writing to the database, it's that you're doing a write after a HSCAN, which is a non-deterministic command.

In my opinion there's rarely a good reason to use a SCAN command in a Lua script. The main purpose of the command is to allow you to do things in small batches so you don't lock up the server processing a huge key space (or hash key space). Since scripts are atomic, though, using HSCAN doesn't help—you're still locking up the server until the whole thing's done.

Here are the options I can see:

If you can't risk locking up the server with a lengthy command:

  1. Use HSCAN on the client. This is the safest option, but also the slowest.

If you want to do as much processing in a single atomic Lua command as possible:

  1. Use Redis 3.2 and script effects replication.
  2. Do the scanning in the script, but return the values to the client and initiate the write from there. (That is, Karthikeyan Gopall's answer.)
  3. Instead of HSCAN, do an HKEYS in the script and filter the results using Lua's pattern matching. Since HKEYS is deterministic you won't have a problem with the subsequent write. The downside, of course, is that you have to read in all of the keys first, regardless of whether they match your pattern. (Though HSCAN is also O(N) in the size of the hash.)
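
Option 3 could look roughly like the sketch below. It is not taken from either answer. The key layout follows the question (hash fields are product names, values are IDs), and passing both keys through KEYS is an assumption. Because HKEYS returns only the names, each matching name needs an extra HGET to fetch its ID. The sketch also swaps a full Lua pattern for a plain substring test (string.find with its fourth argument set to true), so special characters in the search term are matched literally.

```lua
-- Sketch of option 3: no SCAN-family command, so the write is allowed
-- without redis.replicate_commands().
-- KEYS[1]: hash mapping product name -> ID (e.g. product:name)
-- KEYS[2]: set that receives the matching IDs (e.g. tmp:name)
-- ARGV[1]: substring to look for in the product name
local names = redis.call('HKEYS', KEYS[1])
local added = 0

for _, name in ipairs(names) do
    -- Fourth argument true: treat ARGV[1] as plain text, not a Lua pattern.
    if string.find(name, ARGV[1], 1, true) then
        local id = redis.call('HGET', KEYS[1], name)
        -- SADD returns how many members were actually new to the set.
        added = added + redis.call('SADD', KEYS[2], id)
    end
end

return added
```

With the data from the question, this might be invoked as EVAL "<script>" 2 product:name tmp:name ice, after which the ZINTERSTORE step applies as before. The script still reads every field name of the hash, so it blocks the server for about as long as the HSCAN loop would.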