Question

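I have a question about improving the efficiency of my program. I have a Dictionary<string, Thingey> defined to hold named Thingeys. This is a web application that will create multiple named Thingeys over time. Thingeys are somewhat expensive to create (not prohibitively so), but I'd like to avoid creating them whenever possible. My logic for getting the right Thingey for the request looks a lot like this: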

    private Dictionary<string, Thingey> Thingeys;
    public Thingey GetThingey(Request request)
    {
        string thingeyName = request.ThingeyName;
        if (!this.Thingeys.ContainsKey(thingeyName))
        {
            // create a new thingey on 1st reference
            Thingey newThingey = new Thingey(request);
            lock (this.Thingeys)
            {
                if (!this.Thingeys.ContainsKey(thingeyName))
                {
                    this.Thingeys.Add(thingeyName, newThingey);
                }
                // else - oops someone else beat us to it
                // newThingey will eventually get GCed
            }
        }

        return this.Thingeys[thingeyName];
    }

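In this application, Thingeys live forever once created. We don't know how to create them, or which ones will be needed, until the app starts and requests begin coming in. The problem I have with the above code is that there are occasional instances where newThingey is created twice, because we get multiple requests for it simultaneously before it has been created. We end up creating two of them but only adding one to the collection. Is there a better way to get Thingeys created and added that doesn't involve check/create/lock/check/add, with the rare extraneous Thingey that we created but end up never using? (This code works and has been running for some time. This is just the nagging bit that has always bothered me.)

I'm trying to avoid locking the dictionary for the duration of creating a Thingey.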

Answer 1

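This is the standard double-check locking problem. The way it is implemented here is unsafe and can cause various problems, potentially up to a crash in the first check if the internal state of the dictionary is corrupted badly enough.

It is unsafe because you are checking the dictionary without synchronization, and if your luck is bad enough you can hit it while some other thread is in the middle of updating the dictionary's internal state.

A simple solution is to place the first check under the lock as well. The problem with this is that it becomes a global lock, and in a web environment under heavy load it can become a serious bottleneck.

If we are talking about a .NET environment, there are ways to work around this issue by piggybacking on the ASP.NET synchronization mechanism.

Here is how I did it in the NDjango rendering engine: I keep one global dictionary and one dictionary per rendering thread. When a request comes in, I check the local dictionary first. This check does not have to be synchronized, and if the thingy is there I just take it.

If it is not, I synchronize on the global dictionary and check whether it is there; if it is, I add it to my thread dictionary and release the lock. If it is not in the global dictionary, I add it there first while still under the lock.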
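The two-level lookup described above can be sketched roughly as follows. This is only an illustration, not NDjango's actual code: the ThingeyCache class, the stub Request and Thingey types, and the use of ThreadLocal<T> (available from .NET 4.0) to hold the per-thread dictionary are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-ins for the types in the question.
public class Request
{
    public string ThingeyName { get; set; }
}

public class Thingey
{
    public string Name { get; private set; }
    public Thingey(Request request) { Name = request.ThingeyName; }
}

// Hypothetical cache class: one shared dictionary plus one dictionary per thread.
public class ThingeyCache
{
    private readonly object globalLock = new object();
    private readonly Dictionary<string, Thingey> globalThingeys =
        new Dictionary<string, Thingey>();

    // Each thread lazily gets its own private dictionary.
    private readonly ThreadLocal<Dictionary<string, Thingey>> localThingeys =
        new ThreadLocal<Dictionary<string, Thingey>>(
            () => new Dictionary<string, Thingey>());

    public Thingey GetThingey(Request request)
    {
        string key = request.ThingeyName;
        Dictionary<string, Thingey> local = localThingeys.Value;

        Thingey thingey;
        // Fast path: no synchronization, because only this thread uses 'local'.
        if (local.TryGetValue(key, out thingey))
        {
            return thingey;
        }

        // Slow path: consult (and if necessary populate) the shared dictionary.
        lock (globalLock)
        {
            if (!globalThingeys.TryGetValue(key, out thingey))
            {
                thingey = new Thingey(request);
                globalThingeys.Add(key, thingey);
            }
        }

        // Remember the result so this thread skips the lock next time.
        local[key] = thingey;
        return thingey;
    }
}
```

Each thread takes the global lock at most once per key, at the cost of one extra dictionary per thread. Note that in this sketch the Thingey is still created while the global lock is held.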

Answer 2

Well, from my point of view simpler code is better, so I'd only use the one lock:

private readonly object thingeysLock = new object();
private readonly Dictionary<string, Thingey> thingeys = new Dictionary<string, Thingey>();

public Thingey GetThingey(Request request)
{
    string key = request.ThingeyName;
    lock (thingeysLock)
    {
        Thingey ret;
        if (!thingeys.TryGetValue(key, out ret))
        {
            ret = new Thingey(request);
            thingeys[key] = ret;
        }
        return ret;
    }
}

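Locks are really cheap when they're not contended. The downside is that occasionally you will block everyone for the whole duration of creating a new Thingey. Clearly, to avoid creating redundant Thingeys you have to at least block when multiple threads try to create the Thingey for the same key. Reducing it so that they only block in that situation is somewhat harder.

I would suggest you use the above code but profile it to see whether it's fast enough. If you really need "only block when another thread is already creating the same Thingey", then let us know and we'll see what we can do...

EDIT: You've commented on Adam's answer that you "don't want to lock while a new Thingey is being created". You do realise that there's no getting away from that if there's contention for the same key, right? If thread 1 starts creating a Thingey and then thread 2 asks for the same key, the alternatives for thread 2 are either waiting or creating another instance.

EDIT: Okay, this is generally interesting, so here's a first pass at "only block other threads asking for the same item".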

// Requires: using System.Collections.Generic; using System.Threading;
private readonly object dictionaryLock = new object();
private readonly object creationLocksLock = new object();
private readonly Dictionary<string, Thingey> thingeys = new Dictionary<string, Thingey>();
private readonly Dictionary<string, object> creationLocks = new Dictionary<string, object>();

public Thingey GetThingey(Request request)
{
    string key = request.ThingeyName;
    Thingey ret;
    bool entryExists;
    lock (dictionaryLock)
    {
       entryExists = thingeys.TryGetValue(key, out ret);
       // Atomically mark the dictionary to say we're creating this item,
       // and also set an entry for others to lock on
       if (!entryExists)
       {
           thingeys[key] = null;
           lock (creationLocksLock)
           {
               creationLocks[key] = new object();
           }
       }
    }
    // If we found something, great!
    if (ret != null)
    {
        return ret;
    }
    // Otherwise, see if we're going to create it or whether we need to wait.
    if (entryExists)
    {
        object creationLock;
        lock (creationLocksLock)
        {
            creationLocks.TryGetValue(key, out creationLock);
        }
        // If creationLock is null, it means the creating thread has finished
        // creating it and removed the creation lock, so we don't need to wait.
        if (creationLock != null)
        {
            lock (creationLock)
            {
                // Re-check under the creation lock and wait in a loop, so that
                // a pulse sent before we started waiting cannot be missed.
                while (!IsCreated(key))
                {
                    Monitor.Wait(creationLock);
                }
            }
        }
        // We *know* it's in the dictionary now - so just return it.
        lock (dictionaryLock)
        {
           return thingeys[key];
        }
    }
    else // We said we'd create it
    {
        Thingey thingey = new Thingey(request);
        // Put it in the dictionary
        lock (dictionaryLock)
        {
           thingeys[key] = thingey;
        }
        // Take the creation lock out of circulation
        object creationLock;
        lock (creationLocksLock)
        {
            creationLock = creationLocks[key];
            creationLocks.Remove(key);
        }
        // Tell anyone waiting that they can look now. Monitor.PulseAll must
        // be called while holding the lock on the object being pulsed.
        lock (creationLock)
        {
            Monitor.PulseAll(creationLock);
        }
        return thingey;
    }
}

// True once the creating thread has replaced the null placeholder.
private bool IsCreated(string key)
{
    lock (dictionaryLock)
    {
        return thingeys[key] != null;
    }
}

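Phew!

Completely untested, and in particular it isn't in any way, shape or form robust in the face of exceptions in the creating thread... but I think it's generally the right idea :)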
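For comparison, the same "only block callers asking for the same key" behaviour can be sketched with ConcurrentDictionary and Lazy<T>, both of which ship with .NET 4.0 and later. This is a different technique from the hand-rolled Monitor code above, not a rewrite of it. The ThingeyRegistry class, the stub Request and Thingey types, and the Created counter are hypothetical and exist only to make the example self-contained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-ins for the types in the question.
public class Request
{
    public string ThingeyName { get; set; }
}

public class Thingey
{
    // Counts constructions; for demonstration only.
    public static int Created;

    public Thingey(Request request)
    {
        Interlocked.Increment(ref Created);
    }
}

// Hypothetical registry class illustrating per-key lazy creation.
public class ThingeyRegistry
{
    private readonly ConcurrentDictionary<string, Lazy<Thingey>> thingeys =
        new ConcurrentDictionary<string, Lazy<Thingey>>();

    public Thingey GetThingey(Request request)
    {
        // Under contention GetOrAdd may build more than one Lazy wrapper,
        // but only one is stored, and an unused wrapper is cheap because
        // it never runs its factory.
        Lazy<Thingey> lazy = thingeys.GetOrAdd(
            request.ThingeyName,
            key => new Lazy<Thingey>(
                () => new Thingey(request),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // With ExecutionAndPublication only one thread runs the factory;
        // other callers for the same key block here until it finishes.
        return lazy.Value;
    }
}
```

Callers asking for different keys do not wait on each other's Thingey construction. One thing to be aware of: in ExecutionAndPublication mode, an exception thrown by the factory is cached by the Lazy<T> and rethrown to later callers for that key.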

Answer 3

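If you're looking to avoid blocking unrelated threads, then additional work is needed (and it should only be done if you've profiled and found that performance is unacceptable with the simpler code). I would recommend using a lightweight wrapper class that creates the Thingey lazily, and storing that wrapper in your dictionary.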

private Dictionary<string, ThingeyWrapper> Thingeys = new Dictionary<string, ThingeyWrapper>();

private class ThingeyWrapper
{
    public Thingey Thing { get; private set; }

    // Non-null until the Thingey has been created; also serves as the lock object.
    private volatile object creationFlag;
    private Request request;

    public ThingeyWrapper(Request request)
    {
        creationFlag = new object();
        this.request = request;
    }

    public void WaitForCreation()
    {
        object flag = creationFlag;

        if(flag != null)
        {
            lock(flag)
            {
                if(request != null) Thing = new Thingey(request);

                creationFlag = null;

                request = null;
            }
        }
    }
}

public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;

    ThingeyWrapper output;

    lock (this.Thingeys)
    {
        if(!this.Thingeys.TryGetValue(thingeyName, out output))
        {
            output = new ThingeyWrapper(request);

            this.Thingeys.Add(thingeyName, output);
        }
    }

    output.WaitForCreation();

    return output.Thing;
}

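While you are still locking on all calls, the work done under the dictionary lock is much more lightweight, because the Thingey itself is created outside it.

Edit

This issue has stuck with me more than I expected, so I whipped together a somewhat more robust solution that follows this general pattern. You can find it here.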

Answer 4

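IMHO, if this piece of code is called from many threads simultaneously, checking twice is the recommended approach.

(But: I'm not sure that you can safely call ContainsKey while some other thread is calling Add. So it might not be possible to avoid the lock at all.)

If you just want to avoid a Thingey being created but not used, just create it within the locked block: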

private Dictionary<string, Thingey> Thingeys;
public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;
    if (!this.Thingeys.ContainsKey(thingeyName))
    {
        lock (this.Thingeys)
        {
            // only one thread at a time gets here, so check again and
            // create the Thingey only if it is still missing
            if (!this.Thingeys.ContainsKey(thingeyName))
            {
                Thingey newThingey = new Thingey(request);
                this.Thingeys.Add(thingeyName, newThingey);
            }
        }
    }

    return this.Thingeys[thingeyName];
}

Answer 5

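You have to ask yourself whether the specific ContainsKey operation and the getter are themselves thread-safe (and will stay that way in newer versions), because those may and will be invoked while another thread has the dictionary locked and is performing the Add.

Typically, .NET locks are fairly efficient if used correctly, and I believe that in this situation you're better off doing this: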

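Thingey thingey;
bool exists;
lock (thingeys) {
    exists = thingeys.TryGetValue(thingeyName, out thingey);
}
if (!exists) {
    // Create outside the lock so other threads are not blocked meanwhile
    Thingey created = new Thingey(request);
    lock (thingeys) {
        // Another thread may have added it in the meantime; if so, use theirs
        if (!thingeys.TryGetValue(thingeyName, out thingey)) {
            thingeys.Add(thingeyName, created);
            thingey = created;
        }
    }
}
return thingey;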

Answer 6

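You might be able to buy a little bit of speed at the expense of memory. If you create an immutable array that lists all of the created Thingeys and reference the array with a static variable, then you can check for the existence of a Thingey outside of any lock, since an array that is never modified after publication is safe to read from multiple threads. Then, when adding a new Thingey, you create a new array with the additional Thingey and replace the old one (in the static variable) in a single (atomic) set operation. Some new Thingeys may be missed because of race conditions, but the program shouldn't fail. It just means that on rare occasions extra duplicate Thingeys will be made.

This does not replace the need for duplicate checking when creating a new Thingey, and it will use a lot of memory, but it does not require the lock to be taken or held while creating a Thingey.

I'm thinking of something along these lines, sort of: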

// Requires: using System.Linq; (for Contains on the array)
private Dictionary<string, Thingey> Thingeys;
// An immutable list of (most of) the thingeys that have been created.
private string[] existingThingeys = new string[0];

public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;
    // Reference the same list throughout the method, just in case another
    // thread replaces the global reference between operations.
    string[] localThingyList = existingThingeys;
    // Check to see if we already made this Thingey. (This might miss some,
    // but it doesn't matter.)
    // This operation on an immutable array is thread-safe.
    if (localThingyList.Contains(thingeyName))
    {
        // But referencing the dictionary is not thread-safe.
        lock (this.Thingeys)
        {
            if (this.Thingeys.ContainsKey(thingeyName))
                return this.Thingeys[thingeyName];
        }
    }
    Thingey newThingey = new Thingey(request);
    Thingey ret;
    // We haven't locked anything at this point, but we have created a new 
    // Thingey that we probably needed.
    lock (this.Thingeys)
    {
        // If it turns out that the Thingey was already there, then 
        // return the old one.
        if (!Thingeys.TryGetValue(thingeyName, out ret))
        {
            // Otherwise, add the new one.
            Thingeys.Add(thingeyName, newThingey);
            ret = newThingey;
        }
    }
    // Update our existingThingeys array atomically.
    string[] newThingyList = new string[localThingyList.Length + 1];
    Array.Copy(localThingyList, newThingyList, localThingyList.Length);
    newThingyList[localThingyList.Length] = thingeyName;
    existingThingeys = newThingyList; // Voila!
    return ret;
}

Answer 7

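Well, I hope I'm not being too naive in giving this answer. But what I would do, since Thingeys are expensive to create, is add the key with a null value first. Something like this: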

private Dictionary<string, Thingey> Thingeys;
public Thingey GetThingey(Request request)
{
    string thingeyName = request.ThingeyName;
    if (!this.Thingeys.ContainsKey(thingeyName))
    {
        lock (this.Thingeys)
        {
            if (!this.Thingeys.ContainsKey(thingeyName))
            {
                // reserve the key with a null value first
                // (a reader that skips the lock can still see this null)
                this.Thingeys.Add(thingeyName, null);
                // create a new thingey on 1st reference
                Thingey newThingey = new Thingey(request);
                this.Thingeys[thingeyName] = newThingey;
            }
            // else - oops someone else beat us to it
            // but it doesn't matter anymore since we only created one Thingey
        }
    }

    return this.Thingeys[thingeyName];
}

I modified your code in a rush, so no testing was done. Anyway, I hope my idea is not too naive. :D

How to avoid double-check locking when adding items to a Dictionary<> object in .NET?

7 answers: