StackExchange.Redis and Azure: LockRelease says the lock is released, but it seems it is not

Time: 2016-05-04 17:09:33

Tags: c# azure redis stackexchange.redis azure-redis-cache

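I am slightly at the end of my rope here :) I have a prototype (too large, and with too many dependencies, to share) that uses redis for several reasons. One of them is storing a serialized value and controlling updates to that value with a guard lock, taken and released via LockTake/LockRelease on a separate key.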

The whole application looks something like this (note: this snippet does NOT reproduce my problem!):

using Nito.AsyncEx;
using StackExchange.Redis;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace RedisAzureLockingTest
{
    class Program
    {
        static void Main(string[] args)
        {
            AsyncContext.Run(async () =>
            {

                var cm = await ConnectionMultiplexer.ConnectAsync("blah:6379,ssl=false,password=blah,defaultDatabase=1,syncTimeout=5000");
                var db = cm.GetDatabase();

                // store key
                RedisKey key = "thisisourtest";
                await db.StringSetAsync(key, "initial value", flags: CommandFlags.DemandMaster); // SET

                // acquire lock
                RedisKey lockkey = "thisisourtest.lock";
                string locktoken = Guid.NewGuid().ToString();
                bool success = await db.LockTakeAsync(lockkey, locktoken, TimeSpan.FromDays(1), CommandFlags.DemandMaster);
                if (!success) throw new InvalidOperationException("Sure ok - lock couldn't be taken");

                try
                {
                    // do some stuff whilst the lock is taken

                    var oldval = await db.StringGetAsync(key, CommandFlags.DemandMaster);
                    if (oldval.IsNullOrEmpty) throw new InvalidOperationException("Key doesn't exist");

                    // persist an update
                    var newval = Guid.NewGuid().ToString();
                    await db.StringSetAsync(key, newval, flags: CommandFlags.DemandMaster); // SET
                }
                finally
                {
                    // release lock
                    if (!await db.LockReleaseAsync(lockkey, locktoken, CommandFlags.DemandMaster))
                        throw new InvalidOperationException("Should never occur - we couldn't release our own lock; the key is now locked forever!");

                    // double check that the lock has been released
                    var locktok2 = await db.LockQueryAsync(lockkey, CommandFlags.DemandMaster);
                    if (locktok2.HasValue) throw new InvalidOperationException("Should never occur - the lock is still held and the key is now locked forever! Even worse - LockRelease lied about releasing it");
                }

                Console.WriteLine("WORKED");
            });

            Console.ReadLine();
        }
    }
}

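I had been testing locally against a plain redis instance and never had any problems. I have now started trying this in another environment, using an Azure C0 Basic instance. I can reproduce the problem semi-reliably (I have now managed to point a local copy of my code base at the Azure instance), but I have no idea what could be going wrong or how to debug it further.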

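The behaviour I am observing is:

  1. LockTakeAsync works fine.
  2. My "do some stuff" bit executes fine.
  3. LockReleaseAsync appears to succeed (returns TRUE), but the lock key is not removed from redis (confirmed with the command-line redis-cli tool).

What I have tried:

  • Using the ConnectionMultiplexer's tracing: nothing untoward shows up.
  • Reducing my application to a simple test case (see above): this does not reproduce the problem, so the cause must be something outside it.
  • Switching to the non-async versions of the LockTake/LockRelease calls: the problem persists.
  • Adding logging around the LockReleaseAsync call (including SET "DEBUG" calls so that markers show up in the redis trace, see below) to confirm the exact order of events.
  • Adding the LockQueryAsync call shown in the snippet above to confirm that the lock is still there!
  • Specifying that all my commands should execute on the master.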

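The only thing I have been able to do is run MONITOR on the redis instances and capture traces of what SE.Redis is doing. When this runs locally and the application works correctly, I get something like this (key names changed):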

    1462360519.322029 [1 37.157.34.228:1995] "SET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock" "647672fd-ae06-4b6e-be67-341ac583a366" "EX" "86400" "NX"
    1462360519.332884 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb"
    1462360519.342668 [1 37.157.34.228:1995] "SET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb" "ChIJHcrmkwgwX0cRkssud4TmJcsQARoSCQAAAAAAAAAAEQADAAAAAAAHIgNHQlAyCQi+koDo9oL9GToJCL6SgOj2gv0ZQgkIvvqI77mD/Rk="
    1462360519.354847 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
    1462360519.364666 [1 37.157.34.228:1995] "SET" "DEBUG" "1"
    1462360519.387834 [1 37.157.34.228:1995] "WATCH" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
    1462360519.387866 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
    1462360519.401686 [1 37.157.34.228:1995] "MULTI"
    1462360519.401708 [1 37.157.34.228:1995] "DEL" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
    1462360519.401726 [1 37.157.34.228:1995] "EXEC"
    1462360519.414845 [1 37.157.34.228:1995] "SELECT" "1"
    1462360519.414862 [1 37.157.34.228:1995] "SET" "DEBUG" "2"
    1462360519.424950 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb.lock"
    1462360519.452993 [1 37.157.34.228:1995] "GET" "thisisourtest.93e6ca1d-3008-475f-92cb-2e7784e625cb"
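
For comparison when reading the trace above: a conditional transaction written against the public StackExchange.Redis API sends a WATCH, a GET of the watched key, and then MULTI/DEL/EXEC, which is a very similar command sequence. The following is only a sketch for reproducing that pattern in isolation. The class and method names (`LockDiagnostics`, `ReleaseIfOwnerAsync`) are made up for illustration, and this is not claimed to be what `LockReleaseAsync` does internally.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

static class LockDiagnostics
{
    // Hypothetical helper: delete the lock key only if it still holds our token.
    public static async Task<bool> ReleaseIfOwnerAsync(IDatabase db, RedisKey lockKey, RedisValue token)
    {
        ITransaction tran = db.CreateTransaction();

        // The condition is checked server-side before the queued commands run.
        tran.AddCondition(Condition.StringEqual(lockKey, token));

        // Queued inside the transaction; the task completes only after ExecuteAsync.
        Task<bool> deleted = tran.KeyDeleteAsync(lockKey);

        // False means the condition failed or the transaction was aborted.
        bool committed = await tran.ExecuteAsync(CommandFlags.DemandMaster);

        // Only observe the DEL result when the transaction actually committed.
        return committed && await deleted;
    }
}
```

Logging `committed` and the DEL result separately, instead of combining them, may show which half of the release is going wrong.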
    

When I run against Azure and reproduce the problem, I get:

    1462359810.253275 [1 23.97.166.137:1277] "SET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock" "35be6e88-7240-4772-ac2d-220a57ed1a79" "EX" "86400" "NX"
    1462359810.256639 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69"
    1462359810.258605 [1 23.97.166.137:1277] "SET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69" "ChIJsxlhwj7M8UgRiDRsex06+2kQARoSCQAAAAAAAAAAEQADAAAAAAADIgNHQlAyCQi1nJ6N3IL9GToJCLWcno3cgv0ZQgkItYSnlJ+D/Rk="
    1462359810.260233 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
    1462359810.262790 [1 23.97.166.137:1277] "SET" "DEBUG" "1"
    1462359810.283693 [1 23.97.166.137:1277] "WATCH" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
    1462359810.283724 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
    1462359812.329321 [0 23.97.166.137:1257] "UNSUBSCRIBE" "U\xc7\n\xae\xa7\x1c\x84K\x8f\x1ft\x00\\j\xc2j"
    1462359812.329374 [3 23.97.166.137:1256] "INFO" "replication"
    1462359812.357770 [1 23.97.166.137:1259] "INFO" "replication"
    1462359814.186895 [0 23.97.166.137:1312] "INFO" "replication"
    1462359815.285593 [1 23.97.166.137:1277] "UNWATCH"
    1462359815.285621 [1 23.97.166.137:1277] "SET" "DEBUG" "2"
    1462359815.292618 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69.lock"
    1462359815.302945 [1 23.97.166.137:1277] "GET" "thisisourtest.c26119b3-cc3e-48f1-8834-6c7b1d3afb69"
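
The timing in the two traces can be compared directly from the MONITOR timestamps. The small helper below is a sketch (the names `MonitorTrace`, `ParseTimestamp` and `GapSeconds` are invented here) that reads the leading UNIX timestamp from two MONITOR lines and returns the difference in seconds.

```csharp
using System;
using System.Globalization;

static class MonitorTrace
{
    // Extract the leading UNIX timestamp (in seconds) from a MONITOR line.
    public static decimal ParseTimestamp(string line)
    {
        string first = line.TrimStart().Split(' ')[0];
        return decimal.Parse(first, CultureInfo.InvariantCulture);
    }

    // Seconds elapsed between two MONITOR lines.
    public static decimal GapSeconds(string earlier, string later)
        => ParseTimestamp(later) - ParseTimestamp(earlier);
}
```

Applied to the Azure trace, the gap between the WATCH line (1462359810.283693) and the UNWATCH line (1462359815.285593) is 5.001900 seconds. In the local trace, WATCH (1462360519.387834) to MULTI (1462360519.401686) takes 0.013852 seconds. The sample connection string above sets syncTimeout=5000, so the five-second gap may be worth checking against that value, although nothing here establishes that the two are related.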
    

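To me it looks like the WATCH fails (is replication involved in Azure?) and so the release fails. SE.Redis never seems to issue the DEL command (never mind the MULTI/EXEC). However, LockReleaseAsync does not report this, and I cannot see any other call in the MONITOR log that affects the key in question.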

Stumped.

Any ideas on how I can isolate this further? Building a small test case is not something I can do quickly.
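
One experiment that might help narrow this down is to release the lock without WATCH/MULTI/EXEC at all, using a server-side Lua script that compares the token and deletes the key in a single step. If this release works reliably against the Azure instance while the transactional one does not, that would point at the WATCH-based path. This is a sketch under assumptions: the `ScriptedLock` class and its members are hypothetical names, and it assumes the instance allows EVAL.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

static class ScriptedLock
{
    // Compare-and-delete in one server-side step: DEL only if the stored token matches.
    public const string ReleaseScript =
        "if redis.call('get', KEYS[1]) == ARGV[1] then " +
        "return redis.call('del', KEYS[1]) else return 0 end";

    // Hypothetical helper: returns true only if the key held our token and was deleted.
    public static async Task<bool> ReleaseAsync(IDatabase db, RedisKey lockKey, RedisValue token)
    {
        RedisResult result = await db.ScriptEvaluateAsync(
            ReleaseScript,
            new RedisKey[] { lockKey },
            new RedisValue[] { token },
            CommandFlags.DemandMaster);

        // The script returns the number of keys removed: 1 on success, 0 otherwise.
        return (long)result == 1;
    }
}
```

In the MONITOR output this should appear as a single EVAL on the lock key, which is easy to tell apart from the WATCH/GET/MULTI/DEL/EXEC sequence.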

Cheers!

0 answers:

No answers