Question

I generate a random string of 500 characters and want to check it for words.

bliduuwfhbgphwhsyzjnlfyizbjfeeepsbpgplpbhaegyepqcjhhotovnzdtlracxrwggbcmjiglasjvmscvxwazmutqiwppzcjhijjbguxfnduuphhsoffaqwtmhmensqmyicnciaoczumjzyaaowbtwjqlpxuuqknxqvmnueknqcbvkkmildyvosczlbnlgumohosemnfkmndtiubfkminlriytmbtrzhwqmovrivxxojbpirqahatmydqgulammsnfgcvgfncqkpxhgikulsjynjrjypxwvlkvwvigvjvuydbjfizmbfbtjprxkmiqpfuyebllzezbxozkiidpplvqkqlgdlvjbfeticedwomxgawuphocisaejeonqehoipzsjgbfdatbzykkurrwwtajeajeornrhyoqadljfjyizzfluetynlrpoqojxxqmmbuaktjqghqmusjfvxkkyoewgyckpbmismwyfebaucsfueuwgio

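I import a dictionary words text file and check the string to see whether it contains each word. If a match is found, the word is added to a list.

I read that using a Dictionary<> for the word list is faster than an Array.

When I use that approach, I can see the CPU running the foreach loop in the debugger, and my loop counter goes up by about 10,000+ in 10 seconds, but the loop keeps going and never returns any results.

When I use an Array for the dictionary, the program works, but it is slower: about 500 iterations in 10 seconds.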

Doesn't work

Using Dictionary<>

// Random Message
public string message = Random(500);

// Dictionary Words Reference
public Dictionary<string, string> dictionary = new Dictionary<string, string>();

// Matches Found
public static List<string> matches = new List<string>();


public MainWindow()
{
    InitializeComponent();

    // Import Dictionary File
    dictionary = File
                    .ReadLines(@"C:\dictionary.txt")
                    .Select((v, i) => new { Index = i, Value = v })
                    .GroupBy(p => p.Index / 2)
                    .ToDictionary(g => g.First().Value, g => g.Last().Value);


    // If Message Contains word, add to Matches List
    foreach (KeyValuePair<string, string> entry in dictionary)
    {
        if (message.Contains(entry.Value))
        {
            matches.Add(entry.Value);
        }
    }
}

Works

Using Array

// Random Message
public string message = Random(500);

// Dictionary Words Reference
public string[] dictionary = File.ReadAllLines(@"C:\dictionary.txt");

// Matches Found
public List<string> matches = new List<string>();


public MainWindow()
{
    InitializeComponent();

    // If Message Contains word, add to Matches List
    foreach (var entry in dictionary)
    {
        if (message.Contains(entry))
        {
            matches.Add(entry);
        }
    }
}

Answer 1

I doubt you want a Dictionary<string, string> as the dictionary ;) A HashSet<string> is enough:

  using System;
  using System.Collections.Generic;
  using System.IO;
  using System.Linq;

  ...

  string source = "bliduuwfhbgphwhsyzjnlfyizbj";

  HashSet<string> allWords = new HashSet<string>(File
    .ReadLines(@"C:\dictionary.txt")
    .Select(line => line.Trim())
    .Where(line => !string.IsNullOrEmpty(line)), StringComparer.OrdinalIgnoreCase);

  int shortestWord = allWords.Min(word => word.Length);
  int longestWord = allWords.Max(word => word.Length);

  // If you want duplicates, change HashSet<string> to List<string>
  HashSet<string> wordsFound = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

  for (int length = shortestWord; length <= longestWord; ++length) {
    for (int position = 0; position <= source.Length - length; ++position) {
      string extract = source.Substring(position, length);

      if (allWords.Contains(extract))
        wordsFound.Add(extract); 
    }
  }

Test: with the

https://raw.githubusercontent.com/dolph/dictionary/master/popular.txt

dictionary downloaded as the file C:\dictionary.txt, printing the result with

  Console.WriteLine(string.Join(", ", wordsFound.OrderBy(x => x)));

we get the output

  id, li, lid

Answer 2

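Using a Dictionary doesn't make much sense in this case. A Dictionary is essentially a list of variables that stores each variable's name and its value.

I could have the following: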

int age = 21;
int money = 21343;
int distance = 10;
int year = 2017;

and turn it into a Dictionary with the following:

Dictionary<string, int> numbers = new Dictionary<string, int>()
{
    { "age", 21 },
    { "money", 21343},
    { "distance", 10 },
    { "year", 2017 }
};

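Then I can access a value in the dictionary using its key (the first value). So, for example, if I wanted to know what "age" is, I would use:

Console.WriteLine(numbers["age"]);

This is just one example of the power of dictionaries; they can do a lot more, and they can make your life much easier. In this case, however, they won't do what you expect them to do. I'd suggest just using an array or a List.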
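To make the distinction concrete, here is a minimal sketch. The class name `LookupDemo` and its two helper methods are hypothetical, made up for illustration. A Dictionary pays off when a value is looked up by its key; enumerating every entry, as the loop in the question does, uses none of that and behaves much like walking a List.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    // Keyed lookup: the case a Dictionary is built for.
    // Returns null when the key is not present.
    public static int? FindByKey(Dictionary<string, int> numbers, string key)
    {
        return numbers.TryGetValue(key, out int value) ? value : (int?)null;
    }

    // Enumeration: visits every entry one by one, just like iterating a List.
    public static int CountEntriesVisited(Dictionary<string, int> numbers)
    {
        int visited = 0;
        foreach (KeyValuePair<string, int> entry in numbers)
        {
            visited++;
        }
        return visited;
    }

    public static void Main()
    {
        var numbers = new Dictionary<string, int>
        {
            { "age", 21 },
            { "money", 21343 },
            { "distance", 10 },
            { "year", 2017 }
        };

        Console.WriteLine(FindByKey(numbers, "age"));             // 21
        Console.WriteLine(FindByKey(numbers, "height").HasValue); // False
        Console.WriteLine(CountEntriesVisited(numbers));          // 4
    }
}
```

`TryGetValue` is used here instead of the indexer so that a missing key yields null rather than a `KeyNotFoundException`.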

Answer 3

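You are misusing the Dictionary. You are basically using it as a list, so it only adds some overhead to the program and doesn't help in any way.

It would be useful if you wanted to query the dictionary, rather than the other way around.

Also, in any case, what you want is a HashSet, not a Dictionary, since the key in your Dictionary is not the word you are querying for but an unrelated int.

You can read more about Dictionary and HashSet here:

Dictionary: https://www.dotnetperls.com/dictionary
HashSet: https://www.dotnetperls.com/hashset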
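As a rough sketch of what "querying the dictionary" could look like: take candidate substrings from the message and ask the set whether each one is a word. The class `WordSetDemo`, the method `FindWordsOfLength`, and the three sample words are assumptions for illustration; a real word list would be loaded from the dictionary file instead.

```csharp
using System;
using System.Collections.Generic;

public static class WordSetDemo
{
    // Returns each distinct dictionary word of exactly `length` characters
    // that occurs in `message`, in order of first appearance.
    public static List<string> FindWordsOfLength(string message, HashSet<string> words, int length)
    {
        var found = new List<string>();
        for (int start = 0; start + length <= message.Length; start++)
        {
            string candidate = message.Substring(start, length);

            // The set is queried with the candidate; the word itself is the key.
            if (words.Contains(candidate) && !found.Contains(candidate))
            {
                found.Add(candidate);
            }
        }
        return found;
    }

    public static void Main()
    {
        // Hypothetical sample words standing in for the dictionary file.
        var words = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "lid", "bid", "cat" };
        string message = "bliduuwfhbg";

        Console.WriteLine(string.Join(", ", FindWordsOfLength(message, words, 3))); // lid
    }
}
```

Each `Contains` call on the set is a hash lookup, so the cost depends on the number of candidate substrings rather than on the number of words in the dictionary.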

How do I import a dictionary text file and check for word matches?
