case 15: {
    for (int i = 0; i < words.Count; i++) {
        if (words[i].Length == 8) {
            var tupled = words[i].ConcatCheck();
            for (int n = 0; n < word.Count; n++)
                if (word[n] == tupled.Item1 || word[n] == tupled.Item2)
                    temp++;
        }
        if (temp >= 2)
            matches.Add(words[i]);
        temp = 0;
    }
    break;
}
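What it does:
The first for loop iterates over a List of about 248,000 words and checks for words of length 8.
When it finds one, I create a Tuple of the word's first and second halves (4 letters each) by calling ConcatCheck(), an extension method I wrote for String. That part is fast and works fine.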
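The question does not show ConcatCheck() itself. As a rough sketch only, an extension method that splits a word into two halves might look like the following; the class name `StringHalves` and the method body are assumptions, not the asker's actual code.

```csharp
using System;

public static class StringHalves
{
    // Hypothetical stand-in for the asker's ConcatCheck():
    // splits a word at its midpoint and returns both halves.
    public static Tuple<string, string> ConcatCheck(this string word)
    {
        int mid = word.Length / 2;
        return Tuple.Create(word.Substring(0, mid), word.Substring(mid));
    }
}
```

For an 8-letter word, Item1 holds the first four letters and Item2 the last four.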
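What really takes the time is the second for loop. Every single 8-letter word triggers this loop, which iterates over a larger List of about 267,000 elements and checks whether both items of the Tuple exist. If both are found, I add the original word to the list "matches".
This part takes almost 3 minutes to find all matches in the 248k-word dictionary I have. Is there any way to optimize or speed it up?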
Answer 0 (score: 2)
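If you just want to check whether a word exists in a collection, use a HashSet instead of a List or Array. The HashSet class is optimized for Contains checks: a lookup is a hash probe that takes constant time on average, whereas List.Contains scans the elements one by one.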
Example
With the following code, I found all 8-letter words made up of two 4-letter words in the english dictionary (github version) in under 50 milliseconds.
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Net;

WebClient client = new WebClient();
string dictionary = client.DownloadString(
    @"https://raw.githubusercontent.com/dwyl/english-words/master/words.txt");
Stopwatch watch = new Stopwatch();
watch.Start();
// First pass: collect every 4-letter word into a set.
HashSet<string> fourLetterWords = new HashSet<string>();
using (StringReader reader = new StringReader(dictionary))
{
    while (true)
    {
        string line = reader.ReadLine();
        if (line == null) break;
        if (line.Length != 4) continue;
        fourLetterWords.Add(line);
    }
}
// Second pass: keep each 8-letter word whose two halves are both in the set.
List<string> matches = new List<string>();
using (StringReader reader = new StringReader(dictionary))
{
    while (true)
    {
        string line = reader.ReadLine();
        if (line == null) break;
        if (line.Length != 8) continue;
        if (fourLetterWords.Contains(line.Substring(0, 4)) &&
            fourLetterWords.Contains(line.Substring(4, 4)))
            matches.Add(line);
    }
}
watch.Stop();
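The same idea can be applied to the two in-memory lists from the question. The following is a minimal sketch, assuming both lists are already loaded; the class name `CompoundFinder`, the method name `FindMatches`, and the parameter names are illustrative and do not come from the original code.

```csharp
using System;
using System.Collections.Generic;

public static class CompoundFinder
{
    // Returns the 8-letter entries of 'words' whose two 4-letter halves
    // both occur in 'lookup'.
    public static List<string> FindMatches(IEnumerable<string> words, IEnumerable<string> lookup)
    {
        // Build the set once. Ordinal comparison mirrors the == used in the question.
        var lookupSet = new HashSet<string>(lookup, StringComparer.Ordinal);
        var matches = new List<string>();
        foreach (string w in words)
        {
            if (w.Length != 8) continue;
            if (lookupSet.Contains(w.Substring(0, 4)) &&
                lookupSet.Contains(w.Substring(4, 4)))
                matches.Add(w);
        }
        return matches;
    }
}
```

Building the set costs one pass over the lookup list; after that, each 8-letter word needs only two hash lookups instead of a scan over roughly 267,000 elements.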
Why is your code so slow?
for (int n = 0; n < word.Count; n++)
    if (word[n] == tupled.Item1 || word[n] == tupled.Item2)
        temp++;
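This part is one of the culprits. Instead of asking "Are both parts contained in my array?", it asks "Are 2 or more of my 2 words contained in an array?". It counts matching elements and always scans the whole list, even after both halves have been found. The count can also be wrong: if both halves are the same word and the list holds it only once, temp never reaches 2.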
You can optimize this part by breaking out of the loop as soon as two words have been found.
if (word[n] == tupled.Item1 || word[n] == tupled.Item2)
    if (++temp >= 2) break;
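If the linear scan is kept, one possible variant tracks each half separately rather than counting hits, so it answers "are both parts present?" directly. This is only a sketch; the names `LinearScan` and `ContainsBoth` are made up for illustration.

```csharp
using System.Collections.Generic;

public static class LinearScan
{
    // Returns true only if both 'first' and 'second' occur somewhere in 'word'.
    public static bool ContainsBoth(IList<string> word, string first, string second)
    {
        bool foundFirst = false, foundSecond = false;
        for (int n = 0; n < word.Count; n++)
        {
            if (word[n] == first) foundFirst = true;
            if (word[n] == second) foundSecond = true;
            // Stop as soon as both halves have been seen.
            if (foundFirst && foundSecond) return true;
        }
        return false;
    }
}
```

This still walks the list once per 8-letter word in the worst case, so it fixes the check and trims the scan but does not change the overall cost the way a hash-based lookup does.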
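This can be optimized further by pre-sorting the words by length or alphabetically (whether that pays off depends on how often you run this search).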
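As a sketch of the alphabetical pre-sorting idea, the lookup list could be sorted once and then queried with a binary search. The helper names `SortedLookup`, `Prepare`, and `Contains` below are hypothetical; `List<T>.Sort` and `List<T>.BinarySearch` are the standard .NET methods.

```csharp
using System;
using System.Collections.Generic;

public static class SortedLookup
{
    // Returns a sorted copy of the list. Do this once, not per query.
    public static List<string> Prepare(IEnumerable<string> word)
    {
        var sorted = new List<string>(word);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    // Binary search over the sorted list. The comparer must be the same
    // one that was used for sorting.
    public static bool Contains(List<string> sorted, string value)
    {
        return sorted.BinarySearch(value, StringComparer.Ordinal) >= 0;
    }
}
```

Each lookup then takes a logarithmic number of comparisons instead of a full scan. The one-time sort is only worth it if the list is searched many times, which is the case here.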
Answer 1 (score: -1)
O(n) using a dictionary:
IList<string> words1 = new List<string>{...};
var wordsWithLengthOf8 = words1.Where(w => w.Length == 8).ToList();
IList<string> words2 = new List<string>{...};
// Index words2, the list being searched: the 4-letter halves can never be keys
// of a dictionary built from the 8-letter words. Distinct() keeps ToDictionary
// from throwing on duplicate entries.
IDictionary<string,string> words2Dic = words2.Distinct().ToDictionary(w => w);
IList<string> matches = new List<string>();
for (int i = 0; i < wordsWithLengthOf8.Count; i++)
{
    var tupled = wordsWithLengthOf8[i].ConcatCheck();
    var isMatch = words2Dic.ContainsKey(tupled.Item1) && words2Dic.ContainsKey(tupled.Item2);
    if (isMatch)
    {
        matches.Add(wordsWithLengthOf8[i]);
    }
}