查找包含所有不同输入字符串的连续子系列的最小长度

时间:2015-11-20 08:52:08

标签: algorithm hash

我有以下问题,我不确定如何处理。我想为设计一个有效的算法以满足以下要求提供一些帮助/提示

输入 输入的第一行包含整数N,它是系列的长度。 接下来是N行,每行包含一个仅包含小写字符的字符串。 1·; = N< = 100000。 每个字符串的长度介于1和10之间(包括)。

输出 输出包含所有不同字符串的连续子系列的最小长度。

示例输入

6 随它去 mihon mihon OMI OMI letitbe

示例输出

18

说明:最后4个连续字符串包含所有具有最小长度(最小字符数)的唯一字符串

1 个答案:

答案 0 :(得分:1)

如果我理解正确,你需要子系列:

  1. 至少包含1个“letitbe”,“mihon”和“omi”
  2. 的实例
  3. 具有尽可能低的字符串长度总和
  4. 以下是如何有效地执行此操作,使用C#编写代码,在注释中解释算法:

        static void Main(string[] args)
        {
            // Input
            var elements = new List<string> { "letitbe", "mihon", "mihon", "omi", "omi", "letitbe" };
    
            // Find distinct elements
            var distinctElements = elements.Distinct().ToList();
    
            // Create a dictionary that tells us how many copies of each element we have in the current subseries, initialize all values to 0
            var copiesOfElementInCurrentSubseries = distinctElements.ToDictionary(key => key, value => 0);
    
            // The sum of lengths of strings in the current subseries
            // Our goal is to minimize this
            var lengthOfCurrentSubseries = 0;
    
            // How many distinct elements are covered by the current subseries
            // The condition under which we minimize lengthOfCurrentSubseries is that numberOfElementsCoveredByCurrentSubseries equals distinctElements
            var numberOfElementsCoveredByCurrentSubseries = 0;
    
            // We remember the solution in these
            var bestStartIndex = 0;
            var bestLength = elements.Sum(e => e.Length);
            var bestNum = elements.Count;
    
            // Start with startIndex and endIndex at 0, increase endIndex until we cover all distinct elements
            // The subseries from startIndex to endIndex (inclusive) is our current subseries
            for (int startIndex = 0, endIndex = 0; endIndex < elements.Count; endIndex++)
            {
                // We add the element at endIndex to our current subseries:
    
                // If we found an element that previously wasn't covered, increase the count of covered elements
                // Note that we never decrease this, because once we find a solution that covers all elements, we never make a change which "loses" some element
                if (copiesOfElementInCurrentSubseries[elements[endIndex]] == 0)
                {
                    numberOfElementsCoveredByCurrentSubseries++;
                }
                // Increase the number of copies of the element we added
                copiesOfElementInCurrentSubseries[elements[endIndex]]++;
                // Increase the total length of subseries by this element's length
                lengthOfCurrentSubseries += elements[endIndex].Length;
    
                // Initially, we will just loop increasing endIndex until all elements are covered
                // Once we are covering all elements, try to improve the solution
                if (numberOfElementsCoveredByCurrentSubseries == distinctElements.Count)
                {
                    // Move startIndex to the right as far as possible while still covering all elements
                    while (copiesOfElementInCurrentSubseries[elements[startIndex]] > 1)
                    {
                        lengthOfCurrentSubseries -= elements[startIndex].Length;
                        copiesOfElementInCurrentSubseries[elements[startIndex]]--;
                        startIndex++;
                    }
    
                    // If the new solution is better, remember it
                    if (lengthOfCurrentSubseries < bestLength)
                    {
                        bestLength = lengthOfCurrentSubseries;
                        bestStartIndex = startIndex;
                        bestNum = endIndex - startIndex + 1;
                    }
                }
    
                // Now we add another element by moving endIndex one place to the right, then try improving the solution by moving startIndex to the right, and we repeat this process...
            }
    
            Console.WriteLine(string.Join(" ", elements.Skip(bestStartIndex).Take(bestNum)));
        }
    

    请注意,即使这有嵌套循环,内部while循环在内部循环的所有传递中总共最多可以有length of input个步骤,因为startIndex保持其值和总是向右移动。

    如果你不熟悉C# - Dictionary基本上是一个哈希表 - 它可以有效地查找基于键的值(只要键具有良好的哈希函数,哪些字符串可以)。