Question

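I want to fetch a list of Redis keys as efficiently as possible. We can model the data on the Redis server however we like, so this is as much about the right way to approach the problem as it is about the solution. Let me describe the situation.

Assume a large set of "customers" stored in Redis as strings.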

customer__100000
customer__100001
customer__100002

Each customer has many attributes, one of which is the city they live in. Each city is also stored in Redis.

city__New York
city__San Francisco
city__Washington DC

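Through a different process, I end up with a set of customer keys (the intersection of several pre-filters). Once I have these keys, I need to find out which distinct cities those customers live in. My end goal is the names of the cities, but getting the city keys, from which I can pull the names, is fine too.

To give an idea of the scale, assume we are dealing with 200-300k customers with roughly 70 attributes each (city being one of them), and each attribute having anywhere between 50 and 100,000 values. I'd like to keep this as efficient as possible.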

Answer 1

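Instead of storing customers as strings, you should store them as hashes. Redis' ziplist encoding for hashes is very space efficient. If you are storing more than 70 elements, consider raising the hash-max-ziplist-entries limit in redis.conf.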
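
As a rough illustration of the hash layout, here is a minimal Python sketch. It assumes a redis-py style client object is passed in (for example one created with `redis.Redis(decode_responses=True)`); the helper names `customer_key`, `store_customer` and `raise_compact_hash_limit` are made up for this example and are not part of any library.

```python
def customer_key(customer_id):
    """Build a key name in the style used in the question, e.g. customer__100000."""
    return f"customer__{customer_id}"


def store_customer(client, customer_id, attributes):
    """Write all attributes of one customer into a single Redis hash.

    `client` is assumed to be a redis-py style client; `attributes` is a
    non-empty dict of field name -> value (strings or numbers).
    """
    key = customer_key(customer_id)
    # HSET with a mapping sends every field in one command.
    client.hset(key, mapping=attributes)
    return key


def raise_compact_hash_limit(client, max_entries):
    """Raise the field-count limit under which hashes keep the compact encoding.

    This changes the running server only; the value chosen is up to you.
    """
    client.config_set("hash-max-ziplist-entries", max_entries)
```

Calling `store_customer(client, 100000, {"city": "New York"})` would create the hash `customer__100000` with a `city` field. Newer Redis releases also expose this setting under a listpack-based name, so check the documentation for your server version before relying on the parameter name.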

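With Redis hashes you can do interesting things with SORT. By using SORT with GET and STORE, you can fetch the cities of all your customers and store them as a list (not distinct). You can then turn the list into a set by calling LPOP and SADD over it.

Here is an example Redis Lua script: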

-- a key which holds a set of customer keys
local set_of_customer_keys = KEYS[1]
-- a maybe-existing key which will hold the set of cities
local distinct_set = ARGV[1]
-- attribute to get (defaults to city)
local attribute = ARGV[2] or 'city'
-- remove current set of distinct_cities
redis.call("DEL", distinct_set)
-- use SORT to build a list out of customer hash values for `attribute` 
local cities = redis.call("SORT", set_of_customer_keys, "BY", "nosort", "GET", "*->"..attribute)
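-- loop through all cities in the list and add them to the distinct cities set
-- (SORT ... GET yields false for customers without the attribute, so skip those)
for _, city in ipairs(cities) do
  if city then
    redis.call("SADD", distinct_set, city)
  end
end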
-- return the distinct cities
return redis.call("SMEMBERS", distinct_set)
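
If you would rather not run a server-side script, roughly the same result can be computed on the client. The sketch below is an alternative technique, not the script above: it pipelines one HGET per customer hash and deduplicates in a Python set. It assumes a redis-py style client is passed in; the function name `distinct_attribute` and the `batch_size` parameter are made up for this example.

```python
def distinct_attribute(client, customer_keys, attribute="city", batch_size=1000):
    """Return the distinct values of `attribute` across the given customer hashes.

    `client` is assumed to be a redis-py style client. Commands are sent in
    batches so that a few hundred thousand keys do not end up in one huge pipeline.
    """
    distinct = set()
    keys = list(customer_keys)
    for start in range(0, len(keys), batch_size):
        # A non-transactional pipeline only batches the round trips.
        pipe = client.pipeline(transaction=False)
        for key in keys[start:start + batch_size]:
            pipe.hget(key, attribute)
        for value in pipe.execute():
            # HGET yields None when the hash or the field does not exist.
            if value is not None:
                distinct.add(value)
    return distinct
```

Note that redis-py returns bytes unless the client was created with `decode_responses=True`. The trade-off against the Lua script is network traffic: every city value travels to the client, whereas the script returns only the distinct ones.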

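You could also keep a permanently stored customer__100000__cities set alongside each customer's attributes and run SUNION over those customer city keys to get the distinct set of cities, but this would be less memory efficient.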
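
A minimal Python sketch of this per-customer-set variant follows. It assumes a redis-py style client is passed in and that each customer's cities live in a set named after the customer key plus a `__cities` suffix, as in the example key above; the function name `distinct_cities_from_sets` is made up for this example. It uses the union of the sets, since the union contains every city that appears for at least one customer exactly once, while an intersection would keep only cities shared by all of them.

```python
def distinct_cities_from_sets(client, customer_keys, suffix="__cities"):
    """Union the per-customer city sets into one set of distinct cities.

    `client` is assumed to be a redis-py style client; set keys are derived
    from the customer keys, e.g. customer__100000 -> customer__100000__cities.
    """
    set_keys = [f"{key}{suffix}" for key in customer_keys]
    if not set_keys:
        # SUNION needs at least one key, so handle the empty case locally.
        return set()
    # SUNION treats missing keys as empty sets.
    return set(client.sunion(set_keys))
```

With hundreds of thousands of customers, a single SUNION call carries a very long argument list, so splitting the keys into batches and merging the results on the client may be worth considering.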

How to intersect and find distinct keys in Redis