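Shortest substring of a string in C#

Time: 2019-06-22 17:52:13

Tags: c# string

I am trying to write a program that, given a string of lowercase ASCII letters in the range [a-z], determines the length of the shortest substring that contains every letter occurring in the string.

But my solution was terminated because it timed out.

How can I improve its performance?

What I have tried: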

    public static int shortestSubstring(string s)
    {
        int n = s.Length;
        int max_distinct = max_distinct_char(s, n);
        int minl = n;
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                String subs = null;
                if (i < j)
                    subs = s.Substring(i, s.Length - j);
                else
                    subs = s.Substring(j, s.Length - i);
                int subs_lenght = subs.Length;
                int sub_distinct_char = max_distinct_char(subs, subs_lenght);
                if (subs_lenght < minl && max_distinct == sub_distinct_char)
                {
                    minl = subs_lenght;
                }
            }
        }
        return minl;
    }

    private static int max_distinct_char(String s, int n)
    {
        int[] count = new int[NO_OF_CHARS];
        for (int i = 0; i < n; i++)
            count[s[i]]++;

        int max_distinct = 0;
        for (int i = 0; i < NO_OF_CHARS; i++)
        {
            if (count[i] != 0)
                max_distinct++;
        }
        return max_distinct;
    }

}


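3 answers:

Answer 0: (score: 1)

I think there is an O(n) solution to this problem, as follows:

We first traverse the string to find out how many distinct characters it contains. After that, we initialize two pointers, marking the left and right indices of the substring, to 0. We also keep an array that counts how many times each character currently occurs in the substring. If not all characters are included, we advance the right pointer to take in another character. If all characters are included, we advance the left pointer to try for a shorter substring. Since either the left or the right pointer advances at every step, and each pointer moves at most n positions, the algorithm runs in O(n) time.

For the inspiration behind this algorithm, see Kadane's algorithm for the maximum subarray problem.

Unfortunately, I don't know C#. However, I have written a Java solution (hopefully the syntax is similar). I have not stress-tested it rigorously, so I may have missed an edge case.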

import java.io.*;
public class allChars {
    public static void main (String[] args) throws IOException {
        BufferedReader br = new BufferedReader (new InputStreamReader(System.in));
        String s = br.readLine();
        System.out.println(shortestSubstring(s));
    }
    public static int shortestSubstring(String s) {
        //If length of string is 0, answer is 0
        if (s.length() == 0) {
            return 0;
        }
        int[] charCounts = new int[26];
        //Find number of distinct characters in string
        int count = 0;
        for (int i = 0; i < s.length(); i ++) {
            char c = s.charAt(i);
            //If new character (current count of it is 0)
            if (charCounts[c - 97] == 0) {
                //Increase count of distinct characters
                count ++;
                //Increase count of this character to 1
                //Can put inside if statement because don't care if count is greater than 1 here
                //Only care if character is present
                charCounts[c - 97]++;
            }
        }
        int shortestLen = Integer.MAX_VALUE;
        charCounts = new int[26];
        //Initialize left and right pointers to 0
        int left = 0;
        int right = 0;
        //Substring already contains first character of string
        int curCount = 1;
        charCounts[s.charAt(0)-97] ++;
        while (Math.max(left,right) < s.length()) {
            //If all distinct characters present
            if (curCount == count) {
                //Update shortest length
                shortestLen = Math.min(right - left + 1, shortestLen);
                //Decrease character count of left character
                charCounts[s.charAt(left) - 97] --;
                //If new count of left character is 0
                if (charCounts[s.charAt(left) - 97] == 0) {
                    //Decrease count of distinct characters
                    curCount --;
                }
                //Increment left pointer to create smaller substring
                left ++;
            }
            //If not all characters present
            else {
                //Increment right pointer to get another character
                right ++;
                //If character is new (old count was 0)
                if (right < s.length() && charCounts[s.charAt(right) - 97]++ == 0) {
                    //Increment distinct character count
                    curCount ++;
                }
            }
        }
        return shortestLen;
    }
}
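
Since the solution above has not been stress-tested, one way to gain some confidence is to compare a two-pointer implementation against a slow but obviously correct brute force on many small random strings. The following is only a sketch under that assumption: the class name ShortestSubstringCheck and the methods slidingWindow and bruteForce are hypothetical names introduced here, and slidingWindow is a compact restatement of the same two-pointer idea rather than the code above. To check the original method instead, call it in place of slidingWindow.

```java
import java.util.Random;

public class ShortestSubstringCheck {
    // Compact two-pointer version of the idea described above.
    // Assumes the input contains only the letters 'a'..'z'.
    public static int slidingWindow(String s) {
        if (s.isEmpty()) {
            return 0;
        }
        // Count the distinct letters in the whole string.
        boolean[] seen = new boolean[26];
        int distinct = 0;
        for (char c : s.toCharArray()) {
            if (!seen[c - 'a']) {
                seen[c - 'a'] = true;
                distinct++;
            }
        }
        int[] counts = new int[26];
        int inWindow = 0; // distinct letters currently inside [left, right]
        int best = s.length();
        int left = 0;
        for (int right = 0; right < s.length(); right++) {
            if (counts[s.charAt(right) - 'a']++ == 0) {
                inWindow++;
            }
            // Shrink from the left while the window still covers every letter.
            while (inWindow == distinct) {
                best = Math.min(best, right - left + 1);
                if (--counts[s.charAt(left) - 'a'] == 0) {
                    inWindow--;
                }
                left++;
            }
        }
        return best;
    }

    // Reference answer: for every start, extend until all letters are covered.
    public static int bruteForce(String s) {
        if (s.isEmpty()) {
            return 0;
        }
        int distinct = (int) s.chars().distinct().count();
        int best = s.length();
        for (int start = 0; start < s.length(); start++) {
            boolean[] seen = new boolean[26];
            int found = 0;
            for (int end = start; end < s.length(); end++) {
                if (!seen[s.charAt(end) - 'a']) {
                    seen[s.charAt(end) - 'a'] = true;
                    found++;
                }
                if (found == distinct) {
                    best = Math.min(best, end - start + 1);
                    break;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed so runs are repeatable
        for (int trial = 0; trial < 10_000; trial++) {
            int length = random.nextInt(12);
            int alphabet = 1 + random.nextInt(4); // small alphabet forces repeats
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < length; i++) {
                sb.append((char) ('a' + random.nextInt(alphabet)));
            }
            String s = sb.toString();
            if (slidingWindow(s) != bruteForce(s)) {
                throw new AssertionError("Mismatch on \"" + s + "\"");
            }
        }
        System.out.println("All random cases agree");
    }
}
```

Short strings over a small alphabet are used deliberately: they produce many repeated letters, which is where off-by-one mistakes in the pointer handling tend to show up.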

Answer 1: (score: 0)

I hope I understood the question correctly; here is code that gets the shortest string.

        string str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec dictum elementum condimentum. Aliquam commodo ipsum enim. Vivamus tincidunt feugiat urna.";
        char[] operators = { ' ', ',', '.', ':', '!', '?', ';' };
        string[] vs = str.Split(operators);
        string shortestWord = vs[0];
        for (int i = 0; i < vs.Length; i++)
        {
            if (vs[i].Length < shortestWord.Length && vs[i] != "" && vs[i] != " ")
            {
                shortestWord = vs[i];
            }
        }
        Console.WriteLine(shortestWord);

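Answer 2: (score: 0)

This seems to be an O(n^2) problem. That is not ideal; however, we can do a few things to avoid testing substrings that cannot be valid candidates.

I suggest returning the substring itself rather than its length. This makes it easier to verify the result.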

public static string ShortestSubstring(string input)

We first count the occurrences of each character in the range ['a'..'z']. Subtracting 'a' from a character gives its zero-based index.

var charCount = new int[26];
foreach (char c in input) {
    charCount[c - 'a']++;
}

The shortest possible substring has a length equal to the number of distinct characters in the input.

int totalDistinctCharCount = charCount.Where(c => c > 0).Count();

To count the distinct characters in a substring, we need the following Boolean array:

var hasCharOccurred = new bool[26];

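Now let's test substrings starting at different positions. The largest start position must still leave room for a substring at least totalDistinctCharCount long (the shortest possible substring).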

string shortest = input;
for (int start = 0; start <= input.Length - totalDistinctCharCount; start++) {
    ...
}
return shortest;

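Inside this loop, a second loop counts the distinct characters of the substring. Note that we work directly on the input string to avoid creating many new strings. We only need to test substrings that are shorter than the shortest one found so far, so the inner loop uses Math.Min(input.Length, start + shortest.Length - 1) as its limit. The body of the loop (replacing the ... in the previous snippet):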

    int distinctCharCount = 0;
    // No need to go past the length the previously found shortest.
    for (int i = start; i < Math.Min(input.Length, start + shortest.Length - 1); i++) {
        int chIndex = input[i] - 'a';
        if (!hasCharOccurred[chIndex]) {
            hasCharOccurred[chIndex] = true;
            distinctCharCount++;
            if (distinctCharCount == totalDistinctCharCount) {
                shortest = input.Substring(start, i - start + 1);
                break; // Found a shorter one, exit this inner loop.
            }
        }
    }

    // We cannot omit characters occurring only once
    if (charCount[input[start] - 'a'] == 1) {
        break; // Start cannot go beyond this point.
    }

    // Clear hasCharOccurred, to avoid creating a new array every time.
    for (int i = 0; i < 26; i++) {
        hasCharOccurred[i] = false;
    }

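A further optimization is that we stop immediately when the character at the start position occurs only once in the input string (charCount[input[start] - 'a'] == 1). Since every distinct character of the input must appear in the substring, this character must be part of it, so no valid substring can start after this position.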


We can print the result to the console with:

string shortest = ShortestSubstring(TestString);
Console.WriteLine($"Shortest, Length = {shortest.Length}, \"{shortest}\"");