Question

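I have the following problem and I'm not sure how to approach it. I'd appreciate some help or hints on designing an efficient algorithm that meets the requirements below.

Input: The first line of input contains an integer N, the length of the series. N lines follow, each containing a string made up only of lowercase characters. 1 <= N <= 100000. Each string is between 1 and 10 characters long (inclusive).

Output: Print the minimum length of a contiguous subseries that contains all of the distinct strings.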

Sample input

    6
    letitbe
    mihon
    mihon
    omi
    omi
    letitbe

Sample output

18

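Explanation: the last 4 consecutive strings contain all of the unique strings with the minimum length (the smallest number of characters).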

Answer 1

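If I understand correctly, you need the subseries that:

- contains at least 1 each of "letitbe", "mihon" and "omi"
- has the lowest possible sum of string lengths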
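
To pin down what is being minimized, here is a rough brute-force sketch that tries every start index and extends the window until it covers all distinct strings. It is only a reference for small inputs (it is quadratic in the worst case), and the names `BruteForceSketch` and `MinTotalLength` are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BruteForceSketch
{
    // Returns the smallest total character count over all contiguous
    // subseries that contain every distinct string at least once.
    public static int MinTotalLength(IReadOnlyList<string> elements)
    {
        int distinctCount = elements.Distinct().Count();
        // The whole series always qualifies, so it is a safe starting value.
        int best = elements.Sum(e => e.Length);

        for (int start = 0; start < elements.Count; start++)
        {
            var seen = new HashSet<string>();
            int length = 0;
            for (int end = start; end < elements.Count; end++)
            {
                seen.Add(elements[end]);
                length += elements[end].Length;
                if (seen.Count == distinctCount)
                {
                    best = Math.Min(best, length);
                    break; // extending further can only add characters
                }
            }
        }
        return best;
    }

    static void Main()
    {
        var elements = new[] { "letitbe", "mihon", "mihon", "omi", "omi", "letitbe" };
        Console.WriteLine(MinTotalLength(elements)); // prints 18 for the sample
    }
}
```

With N up to 100000 this would likely be too slow, which is why a linear scan is worth the extra bookkeeping.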

Here is how to do this efficiently, with the code written in C# and the algorithm explained in the comments:

    static void Main(string[] args)
    {
        // Input
        var elements = new List<string> { "letitbe", "mihon", "mihon", "omi", "omi", "letitbe" };

        // Find distinct elements
        var distinctElements = elements.Distinct().ToList();

        // Create a dictionary that tells us how many copies of each element we have in the current subseries, initialize all values to 0
        var copiesOfElementInCurrentSubseries = distinctElements.ToDictionary(key => key, value => 0);

        // The sum of lengths of strings in the current subseries
        // Our goal is to minimize this
        var lengthOfCurrentSubseries = 0;

        // How many distinct elements are covered by the current subseries
        // The condition under which we minimize lengthOfCurrentSubseries is that numberOfElementsCoveredByCurrentSubseries equals distinctElements.Count
        var numberOfElementsCoveredByCurrentSubseries = 0;

        // We remember the solution in these
        var bestStartIndex = 0;
        var bestLength = elements.Sum(e => e.Length);
        var bestNum = elements.Count;

        // Start with startIndex and endIndex at 0, increase endIndex until we cover all distinct elements
        // The subseries from startIndex to endIndex (inclusive) is our current subseries
        for (int startIndex = 0, endIndex = 0; endIndex < elements.Count; endIndex++)
        {
            // We add the element at endIndex to our current subseries:

            // If we found an element that previously wasn't covered, increase the count of covered elements
            // Note that we never decrease this, because once we find a solution that covers all elements, we never make a change which "loses" some element
            if (copiesOfElementInCurrentSubseries[elements[endIndex]] == 0)
            {
                numberOfElementsCoveredByCurrentSubseries++;
            }
            // Increase the number of copies of the element we added
            copiesOfElementInCurrentSubseries[elements[endIndex]]++;
            // Increase the total length of subseries by this element's length
            lengthOfCurrentSubseries += elements[endIndex].Length;

            // Initially, we will just loop increasing endIndex until all elements are covered
            // Once we are covering all elements, try to improve the solution
            if (numberOfElementsCoveredByCurrentSubseries == distinctElements.Count)
            {
                // Move startIndex to the right as far as possible while still covering all elements
                while (copiesOfElementInCurrentSubseries[elements[startIndex]] > 1)
                {
                    lengthOfCurrentSubseries -= elements[startIndex].Length;
                    copiesOfElementInCurrentSubseries[elements[startIndex]]--;
                    startIndex++;
                }

                // If the new solution is better, remember it
                if (lengthOfCurrentSubseries < bestLength)
                {
                    bestLength = lengthOfCurrentSubseries;
                    bestStartIndex = startIndex;
                    bestNum = endIndex - startIndex + 1;
                }
            }

            // Now we add another element by moving endIndex one place to the right, then try improving the solution by moving startIndex to the right, and we repeat this process...
        }

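        // The answer the problem asks for: the minimal total number of characters
        Console.WriteLine(bestLength);
        // The subseries that achieves it, for inspection
        Console.WriteLine(string.Join(" ", elements.Skip(bestStartIndex).Take(bestNum)));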
    }
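
The code above hard-codes the sample. As a sketch of how the same idea might be wired to the input format from the question, the version below reads N and then N strings, and returns only the minimal total length. It assumes one string per line, as the problem statement describes; `MinWindowFromInput` and `Solve` are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class MinWindowFromInput
{
    // Reads the series from the given reader and returns the minimal sum of
    // string lengths over all contiguous subseries covering every distinct string.
    public static int Solve(TextReader input)
    {
        int n = int.Parse(input.ReadLine().Trim());
        var elements = new string[n];
        var copies = new Dictionary<string, int>(); // copies of each string in the window
        int total = 0;
        for (int i = 0; i < n; i++)
        {
            elements[i] = input.ReadLine().Trim();
            copies[elements[i]] = 0;
            total += elements[i].Length;
        }

        int covered = 0, length = 0, best = total;
        for (int start = 0, end = 0; end < n; end++)
        {
            // Extend the window to the right by one element.
            if (copies[elements[end]] == 0) covered++;
            copies[elements[end]]++;
            length += elements[end].Length;

            if (covered == copies.Count)
            {
                // Drop elements on the left while they are duplicated in the window.
                while (copies[elements[start]] > 1)
                {
                    length -= elements[start].Length;
                    copies[elements[start]]--;
                    start++;
                }
                best = Math.Min(best, length);
            }
        }
        return best;
    }

    static void Main() => Console.WriteLine(Solve(Console.In));
}
```

Since the total is at most 100000 * 10 characters, an `int` is enough for the sums.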

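Note that even though this has nested loops, the inner while loop can take at most as many steps as the length of the input in total, across all passes of the outer loop, because startIndex keeps its value between passes and only ever moves to the right.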
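
One way to convince yourself of this bound is to count the inner-loop iterations directly. The sketch below repeats the same two-pointer scan with a counter added; `InnerLoopStepCount` and `CountInnerSteps` are hypothetical names, and the 1000-element input is just an arbitrary test case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class InnerLoopStepCount
{
    // Runs the two-pointer scan and returns how many times
    // the body of the inner while loop executed in total.
    public static int CountInnerSteps(IReadOnlyList<string> elements)
    {
        var copies = elements.Distinct().ToDictionary(e => e, e => 0);
        int covered = 0, innerSteps = 0;
        for (int start = 0, end = 0; end < elements.Count; end++)
        {
            if (copies[elements[end]] == 0) covered++;
            copies[elements[end]]++;
            if (covered == copies.Count)
            {
                // start is never reset, so each element is skipped at most once.
                while (copies[elements[start]] > 1)
                {
                    copies[elements[start]]--;
                    start++;
                    innerSteps++;
                }
            }
        }
        return innerSteps;
    }

    static void Main()
    {
        // 1000 copies of "a" followed by a single "b".
        var elements = Enumerable.Repeat("a", 1000).ToList();
        elements.Add("b");
        Console.WriteLine($"{CountInnerSteps(elements)} inner steps for {elements.Count} elements");
    }
}
```

Here all of the inner-loop work happens on the final pass of the outer loop, yet the total still stays below the number of elements.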

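If you're not familiar with C#: Dictionary is basically a hash table. It can efficiently look up a value based on a key (as long as the keys have a good hash function, which strings do).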
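
For readers who have not used it before, here is a small sketch of the kind of string-keyed counting the algorithm relies on. `DictionaryCountSketch` and `CountCopies` are names made up for this example; `TryGetValue` and the indexer are the standard `Dictionary<TKey, TValue>` members.

```csharp
using System;
using System.Collections.Generic;

static class DictionaryCountSketch
{
    // Counts how many times each string occurs, using the string itself as the key.
    public static Dictionary<string, int> CountCopies(IEnumerable<string> elements)
    {
        var counts = new Dictionary<string, int>();
        foreach (var element in elements)
        {
            // TryGetValue leaves current at 0 when the key is not present yet.
            counts.TryGetValue(element, out int current);
            counts[element] = current + 1;
        }
        return counts;
    }

    static void Main()
    {
        var counts = CountCopies(new[] { "letitbe", "mihon", "mihon", "omi", "omi", "letitbe" });
        foreach (var pair in counts)
            Console.WriteLine($"{pair.Key}: {pair.Value}");
    }
}
```

Each lookup and update takes constant time on average, so keeping the counts does not change the overall linear running time.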

Find the minimum length of a contiguous subseries that contains all distinct input strings
