List Dictionary values from HashSet keys

Time: 2012-03-16 16:04:20

Tags: .net linq dictionary hashset

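Hopefully the comments explain the problem. I have a large static dictionary (built from fresh data when the program loads) that is used heavily, and I want to reference it as efficiently as possible rather than create copies of the data and eat memory. I have a HashSet that represents a subset of the dictionary, and I need to present that subset as a sorted list of the values. Whenever there is a new HashSet, or a change to the HashSet, I build a new class that wraps the code below. Is there a better way to approach this? I could not figure out how to use an external HashSet in LINQ.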

// FTSwordIDs is a hashset and is also used elsewhere - a lot - hashset for lookup speed
// dlFTSword is Dictionary<Int32, string> and is static and sorted by value (on load) 
// dlFTSword can contain over a million entries and is used a lot of places 
// need to reference it rather than build a new list and eat memory
words = new List<string>();
foreach(Int32 id in dlFTSword.Keys)
{
    if (FTSwordIDs.Contains(id)) words.Add(dlFTSword[id]);
}
return words;

The timing is 12 milliseconds.

try
{
    sqlConRO1.Open();
    sqlCMDRO1.CommandText = "SELECT [ID], [word] FROM [FTSwordDef] WITH (NOLOCK); "; // ORDER BY [word]; ";
    // using disposes the reader even if Read or Add throws
    using (SqlDataReader rdr = sqlCMDRO1.ExecuteReader())
    {
        while (rdr.Read())
        {
            dlFTSword.Add(rdr.GetInt32(0), rdr.GetString(1));
        }
    }
    Debug.WriteLine("dlFTSword.Count = " + dlFTSword.Count.ToString());
}
// Pass Ex as the inner exception so the original stack trace is preserved
catch (Exception Ex) { throw new Exception("InitializeData Failed " + Ex.Message, Ex); }
finally { sqlConRO1.Close(); }

HashSet<Int32> wordIDs = new HashSet<int>() { 1, 100000, 200000, 300000, 400000, 500000, 600000 };
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();

//List<string> words = dlFTSword.Where(pair => wordIDs.Contains(pair.Key))
//         .Select(pair => pair.Value)
//         .OrderBy(x => x)
//         .ToList();
//List<string> words = wordIDs.Where(key => dlFTSword.ContainsKey(key))
//          .Select(key => dlFTSword[key])
//          .OrderBy(value => value)
//          .ToList();
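// Note: this query is deferred; it does not run until words is enumerated,
// so the time measured below excludes the lookups and the sort.
IEnumerable<string> words = wordIDs.Where(key => dlFTSword.ContainsKey(key))
          .Select(key => dlFTSword[key])
          .OrderBy(value => value);
stopWatch.Stop();
// ElapsedMilliseconds is the total; TimeSpan.Milliseconds is only the 0-999 component
Debug.WriteLine(stopWatch.ElapsedMilliseconds.ToString());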

2 Answers:

Answer 0 (score: 1)

var words = dlFTSword.Keys.Where(FTSwordIDs.Contains).Select(x => dlFTSword[x]);

Answer 1 (score: 1)

It sounds like you want:

var words = dlFTSword.Where(pair => FTSwordIDs.Contains(pair.Key))
                     .Select(pair => pair.Value)
                     .ToList();

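Note that this avoids performing a lookup for each key once it has been found. We're not actually looking anything up in the dictionary. It also doesn't involve creating any additional sets, which Intersect would.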

This is worth checking, though:

"dlFTSword is Dictionary<Int32, string> and is static and sorted by value (on load)"

If it's a Dictionary<TKey, TValue>, then it isn't sorted. There is no such concept as a "sorted" Dictionary<TKey, TValue>. You could use SortedDictionary<,> or SortedList<,>, but both of those are different from Dictionary<,>.
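
To illustrate the ordering point: SortedDictionary<,> and SortedList<,> order by key, not by value, so neither gives a by-value ordering directly. One possible approach, sketched here on the assumption that the data is loaded once and never modified afterwards, is to keep a separate array of the pairs sorted by value and filter that array. The names WordIndex, BuildSortedByValue and WordsFor are hypothetical and not from the question, and the default string comparer is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: names are illustrative, not from the question.
public static class WordIndex
{
    // Builds an array of the dictionary's pairs ordered by value.
    // Only the keys and string references are copied, not the strings themselves.
    public static KeyValuePair<int, string>[] BuildSortedByValue(Dictionary<int, string> source)
    {
        return source.OrderBy(pair => pair.Value).ToArray();
    }

    // Filters the pre-sorted array, so the results come out ordered by value
    // without a per-query sort.
    public static List<string> WordsFor(KeyValuePair<int, string>[] sortedByValue, HashSet<int> ids)
    {
        var words = new List<string>(ids.Count);
        foreach (var pair in sortedByValue)
        {
            if (ids.Contains(pair.Key)) words.Add(pair.Value);
        }
        return words;
    }
}
```

This still scans every pair on each query, so it trades a one-time sort at load for a full pass per lookup; it is most likely to pay off when the hash set is a large fraction of the dictionary.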

EDIT: If the hash set is very small, then it makes more sense to iterate over that than over every pair in the dictionary:

var words = FTSwordIDs.Where(key => dlFTSword.ContainsKey(key))
                      .Select(key => dlFTSword[key])
                      .OrderBy(value => value)
                      .ToList();

It's a bit ugly to have to do the lookup, but it will be much faster.
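
The double lookup (ContainsKey followed by the indexer) can be collapsed into a single one with Dictionary<TKey, TValue>.TryGetValue. A minimal sketch, assuming the same dlFTSword and FTSwordIDs as in the question; the wrapper class WordLookup and the method name SortedWords are hypothetical:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: the class and method names are not from the question.
public static class WordLookup
{
    public static List<string> SortedWords(Dictionary<int, string> dlFTSword, HashSet<int> FTSwordIDs)
    {
        var words = new List<string>(FTSwordIDs.Count);
        foreach (int id in FTSwordIDs)
        {
            string word;
            // One hash lookup per id; ids missing from the dictionary are skipped.
            if (dlFTSword.TryGetValue(id, out word)) words.Add(word);
        }
        // Sorts only the selected subset, using the default string comparer.
        words.Sort();
        return words;
    }
}
```

The cost here depends on the size of the hash set rather than the size of the dictionary, which is why it should suit a small set of IDs against a million-entry dictionary.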