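I am trying to write a program that, given a string of lowercase ASCII letters [a-z], determines the length of the shortest substring that contains every letter present in the string.
However, my submission was terminated due to a timeout.
How can I improve my solution?
Here is what I tried: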
public static int shortestSubstring(string s){
    int n = s.Length;
    int max_distinct = max_distinct_char(s, n);
    int minl = n;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++)
        {
            String subs = null;
            if (i < j)
                subs = s.Substring(i, s.Length - j);
            else
                subs = s.Substring(j, s.Length - i);
            int subs_lenght = subs.Length;
            int sub_distinct_char = max_distinct_char(subs, subs_lenght);
            if (subs_lenght < minl && max_distinct == sub_distinct_char)
            {
                minl = subs_lenght;
            }
        }
    }
    return minl;
}
private static int max_distinct_char(String s, int n)
{
    int[] count = new int[NO_OF_CHARS];
    for (int i = 0; i < n; i++)
        count[s[i]]++;
    int max_distinct = 0;
    for (int i = 0; i < NO_OF_CHARS; i++)
    {
        if (count[i] != 0)
            max_distinct++;
    }
    return max_distinct;
}
}
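Answer 0 (score: 1)
I think there is an O(n) solution to this problem, as follows:
We first traverse the string to find out how many distinct characters it contains. Then we initialize two pointers, the left and right indices of the substring, to 0. We also keep an array that counts how many times each character currently occurs in the substring. If not all characters are included, we advance the right pointer to take in another character. If all characters are included, we advance the left pointer to possibly get a shorter substring. Since either the left or the right pointer advances at every step, and each can advance at most n times, the algorithm should run in O(n) time.
For the inspiration behind this algorithm, see Kadane's algorithm for the maximum subarray problem.
Unfortunately, I don't know C#. However, I have written a Java solution (hopefully the syntax is similar). I haven't stress-tested it rigorously, so I may have missed an edge case.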
import java.io.*;
public class allChars {
    public static void main (String[] args) throws IOException {
        BufferedReader br = new BufferedReader (new InputStreamReader(System.in));
        String s = br.readLine();
        System.out.println(shortestSubstring(s));
    }
    public static int shortestSubstring(String s) {
        //If length of string is 0, answer is 0
        if (s.length() == 0) {
            return 0;
        }
        int[] charCounts = new int[26];
        //Find number of distinct characters in string
        int count = 0;
        for (int i = 0; i < s.length(); i ++) {
            char c = s.charAt(i);
            //If new character (current count of it is 0)
            if (charCounts[c - 97] == 0) {
                //Increase count of distinct characters
                count ++;
                //Increase count of this character to 1
                //Can put inside if statement because don't care if count is greater than 1 here
                //Only care if character is present
                charCounts[c - 97]++;
            }
        }
        int shortestLen = Integer.MAX_VALUE;
        charCounts = new int[26];
        //Initialize left and right pointers to 0
        int left = 0;
        int right = 0;
        //Substring already contains first character of string
        int curCount = 1;
        charCounts[s.charAt(0)-97] ++;
        while (Math.max(left,right) < s.length()) {
            //If all distinct characters present
            if (curCount == count) {
                //Update shortest length
                shortestLen = Math.min(right - left + 1, shortestLen);
                //Decrease character count of left character
                charCounts[s.charAt(left) - 97] --;
                //If new count of left character is 0
                if (charCounts[s.charAt(left) - 97] == 0) {
                    //Decrease count of distinct characters
                    curCount --;
                }
                //Increment left pointer to create smaller substring
                left ++;
            }
            //If not all characters present
            else {
                //Increment right pointer to get another character
                right ++;
                //If character is new (old count was 0)
                if (right < s.length() && charCounts[s.charAt(right) - 97]++ == 0) {
                    //Increment distinct character count
                    curCount ++;
                }
            }
        }
        return shortestLen;
    }
}
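Since the answer mentions that the code was not stress-tested, here is a minimal sketch of how one might cross-check the two-pointer idea against a brute-force reference on small random strings. The class and method names (WindowCheck, slidingWindow, bruteForce) are hypothetical, and slidingWindow is a compact restatement of the approach described above, not the answer's exact code. It assumes the input contains only 'a'..'z'.

```java
import java.util.Random;

class WindowCheck {
    // Two-pointer scan (assumed restatement): grow on the right, shrink on the left.
    static int slidingWindow(String s) {
        if (s.isEmpty()) return 0;
        boolean[] seen = new boolean[26];
        int distinct = 0;
        for (char c : s.toCharArray()) {
            if (!seen[c - 'a']) { seen[c - 'a'] = true; distinct++; }
        }
        int[] counts = new int[26];
        int inWindow = 0, best = s.length(), left = 0;
        for (int right = 0; right < s.length(); right++) {
            // A letter entering the window with a previous count of 0 is new.
            if (counts[s.charAt(right) - 'a']++ == 0) inWindow++;
            // While every letter is covered, record the length and drop the left letter.
            while (inWindow == distinct) {
                best = Math.min(best, right - left + 1);
                if (--counts[s.charAt(left) - 'a'] == 0) inWindow--;
                left++;
            }
        }
        return best;
    }

    // Reference answer: for every start, extend until all letters are covered.
    static int bruteForce(String s) {
        if (s.isEmpty()) return 0;
        int distinct = (int) s.chars().distinct().count();
        int best = s.length();
        for (int i = 0; i < s.length(); i++) {
            boolean[] seen = new boolean[26];
            int found = 0;
            for (int j = i; j < s.length(); j++) {
                if (!seen[s.charAt(j) - 'a']) { seen[s.charAt(j) - 'a'] = true; found++; }
                if (found == distinct) { best = Math.min(best, j - i + 1); break; }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so runs are repeatable
        for (int t = 0; t < 10_000; t++) {
            int len = rng.nextInt(12);          // short strings, including the empty one
            int alphabet = 1 + rng.nextInt(4);  // small alphabets force repeated letters
            StringBuilder sb = new StringBuilder();
            for (int k = 0; k < len; k++) sb.append((char) ('a' + rng.nextInt(alphabet)));
            String s = sb.toString();
            if (slidingWindow(s) != bruteForce(s)) {
                throw new AssertionError("Mismatch on \"" + s + "\"");
            }
        }
        System.out.println("All random cases agree");
    }
}
```

Short strings over a small alphabet are used on purpose: they hit the edge cases (empty string, a single repeated letter, all letters unique) far more often than long random inputs would.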
Answer 1 (score: 0)
I hope I understood you correctly; this is code to get the shortest string.
string str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec dictum elementum condimentum. Aliquam commodo ipsum enim. Vivamus tincidunt feugiat urna.";
char[] operators = { ' ', ',', '.', ':', '!', '?', ';' };
string[] vs = str.Split(operators);
string shortestWord = vs[0];
for (int i = 0; i < vs.Length; i++)
{
    if (vs[i].Length < shortestWord.Length && vs[i] != "" && vs[i] != " ")
    {
        shortestWord = vs[i];
    }
}
Console.WriteLine(shortestWord);
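Answer 2 (score: 0)
This seems to be an O(n^2) problem. That is not ideal; however, we can do a few things to avoid testing substrings that cannot be valid candidates.
I suggest returning the substring itself rather than its length. This helps in validating the result.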
public static string ShortestSubstring(string input)
We first count the occurrences of each character in the range ['a'..'z']. We can subtract 'a' from a character to get its zero-based index.
var charCount = new int[26];
foreach (char c in input) {
    charCount[c - 'a']++;
}
The shortest possible substring is as long as the number of distinct characters in the input.
int totalDistinctCharCount = charCount.Where(c => c > 0).Count();
To count the distinct characters in a substring, we need the following boolean array:
var hasCharOccurred = new bool[26];
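Now let's test substrings starting at different positions. The maximum start position must leave room for a substring at least as long as totalDistinctCharCount (the shortest possible substring).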
string shortest = input;
for (int start = 0; start <= input.Length - totalDistinctCharCount; start++) {
    ...
}
return shortest;
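Inside this loop we have another loop that counts the distinct characters of the substring. Note that we work directly on the input string to avoid creating many new strings. We only need to test substrings that are shorter than the shortest one found so far. Therefore the inner loop uses Math.Min(input.Length, start + shortest.Length - 1) as its limit. The body of the loop (in place of the ... in the previous code fragment):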
int distinctCharCount = 0;
// No need to go past the length of the previously found shortest.
for (int i = start; i < Math.Min(input.Length, start + shortest.Length - 1); i++) {
    int chIndex = input[i] - 'a';
    if (!hasCharOccurred[chIndex]) {
        hasCharOccurred[chIndex] = true;
        distinctCharCount++;
        if (distinctCharCount == totalDistinctCharCount) {
            shortest = input.Substring(start, i - start + 1);
            break; // Found a shorter one, exit this inner loop.
        }
    }
}
// We cannot omit characters occurring only once
if (charCount[input[start] - 'a'] == 1) {
    break; // Start cannot go beyond this point.
}
// Clear hasCharOccurred, to avoid creating a new array every time.
for (int i = 0; i < 26; i++) {
    hasCharOccurred[i] = false;
}
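A further optimization is that we stop as soon as the character at the start position occurs only once in the input string (charCount[input[start] - 'a'] == 1). Since every distinct character of the input must appear in the substring, this character must be part of the substring, so the start cannot move past it.
We can print the result to the console with: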
string shortest = ShortestSubstring(TestString);
Console.WriteLine($"Shortest, Length = {shortest.Length}, \"{shortest}\"");
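Because this answer presents its code in fragments, it may help to see them pieced together. The following is a rough Java port (Java is used to match the earlier answer), not the answer's own code: the class name ShortestSubstringDemo is hypothetical, the empty-input guard is an addition that the fragments do not contain, and the hasCharOccurred array is allocated fresh for each start instead of being cleared. It assumes the input contains only 'a'..'z'.

```java
class ShortestSubstringDemo {
    static String shortestSubstring(String input) {
        // Added guard (assumption): the fragments above would index past the end here.
        if (input.isEmpty()) return "";

        // Count occurrences of each letter, then the number of distinct letters.
        int[] charCount = new int[26];
        for (char c : input.toCharArray()) charCount[c - 'a']++;
        int totalDistinct = 0;
        for (int n : charCount) if (n > 0) totalDistinct++;

        String shortest = input;
        for (int start = 0; start <= input.length() - totalDistinct; start++) {
            boolean[] hasCharOccurred = new boolean[26]; // fresh per start, for simplicity
            int distinct = 0;
            // Only substrings strictly shorter than the current best are worth testing.
            int limit = Math.min(input.length(), start + shortest.length() - 1);
            for (int i = start; i < limit; i++) {
                int idx = input.charAt(i) - 'a';
                if (!hasCharOccurred[idx]) {
                    hasCharOccurred[idx] = true;
                    if (++distinct == totalDistinct) {
                        shortest = input.substring(start, i + 1);
                        break; // found a shorter one for this start
                    }
                }
            }
            // A letter that occurs only once must be inside the result,
            // so the start cannot move past it.
            if (charCount[input.charAt(start) - 'a'] == 1) break;
        }
        return shortest;
    }

    public static void main(String[] args) {
        String result = shortestSubstring("aabcbcdbca");
        System.out.println(result.length() + " \"" + result + "\"");
    }
}
```

For the sample input "aabcbcdbca", the single 'd' forces the early exit at start position 6, and the method returns "dbca".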