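Smallest window in string 1 that contains all characters of string 2 and no characters of string 3

Date: 2014-04-22 19:55:16

Tags: java string algorithm language-agnostic

OK, this is an interview question. And no, it is not a duplicate of this question.

Given 3 strings: str1, str2, str3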

str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

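We have to find the minimum window in str1 that contains all the characters of str2 in any order, but no character from str3. In this case the answer is: "strup"

I've come up with this code: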

static String minimumWindow(String str1, String str2, String str3) {

        class Window implements Comparable<Window> {
            int start;
            int end;

            public Window(int start, int end) {
                this.start = start;
                this.end = end;
            }

            public int getEnd() {
                return end;
            }

            public int getStart() {
                return start;
            }

            public int compareTo(Window o) {
                int thisDiff = end - start;
                int thatDiff = o.end - o.start;

                return Integer.compare(thisDiff, thatDiff);
            }

            @Override
            public String toString() {
                return "[" + start + " : " + end + "]";
            }
        }

        // Create Sets of characters for "contains()" check

        Set<Character> str2Chars = new HashSet<>();
        for (char ch: str2.toCharArray()) {
            str2Chars.add(ch);
        }

        Set<Character> str3Chars = new HashSet<>();
        for (char ch: str3.toCharArray()) {
            str3Chars.add(ch);
        }

        // This will store all valid window which doesn't contain characters
        // from str3.
        Set<Window> set = new TreeSet<>();

        int begin = -1;

        // This loops gets each pair of index, such that substring from 
        // [start, end) in each window doesn't contain any characters from str3
        for (int i = 0; i < str1.length(); i++) {
            if (str3Chars.contains(str1.charAt(i))) {
                 set.add(new Window(begin, i));
                 begin = i + 1;
            }
        }

        int minLength = Integer.MAX_VALUE;
        String minString = "";

        // Iterate over the windows to find minimum length string containing all
        // characters from str2
        for (Window window: set) {
            if ((window.getEnd() - 1 - window.getStart()) < str2.length()) {
                continue;
            }

            for (int i = window.getStart(); i < window.getEnd(); i++) {
                if (str2Chars.contains(str1.charAt(i))) {
                      // Got first character in this window that is in str2
                      // Start iterating from end to get last character
                      // [start, end) substring will be the minimum length
                      // string in this window
                     for (int j = window.getEnd() - 1; j > i; j--) {
                        if (str2Chars.contains(str1.charAt(j))) {
                            String s = str1.substring(i, j + 1);

                            Set<Character> sChars = new HashSet<>();
                            for (char ch: s.toCharArray()) {
                                sChars.add(ch);
                            }

                            // If this substring contains all characters from str2, 
                            // then only it is valid window.
                            if (sChars.containsAll(str2Chars)) {
                                int len = sChars.size();
                                if (len < minLength) {
                                    minLength = len;
                                    minString = s;
                                }
                            }
                        }
                    }
                }
            }
        }

    // There are cases when some trailing and leading characters are
    // repeated somewhere in the middle. We don't need to include them in the
    // minLength. 
    // In the given example, the actual string would come as - "rstrup", but we
    // remove the first "r" safely.
        StringBuilder strBuilder = new StringBuilder(minString);

        while (strBuilder.length() > 1 && strBuilder.substring(1).contains("" + strBuilder.charAt(0))) {
            strBuilder.deleteCharAt(0);
        }

        while (strBuilder.length() > 1 && strBuilder.substring(0, strBuilder.length() - 1).contains("" + strBuilder.charAt(strBuilder.length() - 1))) {
            strBuilder.deleteCharAt(strBuilder.length() - 1);
        }

        return strBuilder.toString();
    }

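But it doesn't work for all test cases. It does work for the example given in this question, but when I submitted the code it failed 2 test cases. No, I don't know which test cases it failed.

Even after trying various sample inputs, I couldn't find a failing test case. Can someone take a look at what is wrong with the code? I would really appreciate it if someone could suggest a better algorithm (pseudo-code is enough). I know this is not an optimized solution.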
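One way to hunt for failing inputs is to compare the submitted method against a slow but obviously correct reference. The sketch below is not from the original post: the class name BruteForceWindow and the method bruteForce are made up for illustration, and it assumes str2 is non-empty and that "all characters of str2" means every distinct character (the same set semantics the question's code uses).

```java
import java.util.HashSet;
import java.util.Set;

class BruteForceWindow {
    // Hypothetical reference implementation: tries every start index and
    // extends to the right until the window is complete or hits a str3 char.
    // Returns "" when no valid window exists; ties go to the leftmost window.
    static String bruteForce(String str1, String str2, String str3) {
        Set<Character> need = new HashSet<>();
        for (char ch : str2.toCharArray()) need.add(ch);
        Set<Character> avoid = new HashSet<>();
        for (char ch : str3.toCharArray()) avoid.add(ch);

        String best = "";
        for (int i = 0; i < str1.length(); i++) {
            Set<Character> seen = new HashSet<>();
            for (int j = i; j < str1.length(); j++) {
                char ch = str1.charAt(j);
                if (avoid.contains(ch)) break;       // no window starting at i may cross this
                if (need.contains(ch)) seen.add(ch);
                if (seen.size() == need.size()) {    // first complete j is the shortest for this i
                    if (best.isEmpty() || j - i + 1 < best.length()) {
                        best = str1.substring(i, j + 1);
                    }
                    break;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(bruteForce("spqrstrupvqw", "sprt", "q")); // strup
        System.out.println(bruteForce("abc", "ab", "c"));            // ab
    }
}
```

Running both methods over many short random strings and printing the first input on which they disagree is usually enough to surface a counterexample.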

3 Answers:

Answer 0 (score: 2)

str1 = "spqrstrupvqw"
str2 = "sprt"
str3 = "q"

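We are looking for the smallest substring of str1 that contains all the characters of str2 (assume they must appear in order) and no character of str3.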

i = 1 .. str1.length
cursor = 1 .. str2.length

The solution must have the form:

str2.first X X .. X X str2.last

So, to check for that substring we use a cursor over str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    cursor = 1
else
    if str1[i] == str2[cursor]
        cursor++

The goal check is:

if cursor > str2.length
    return solution
else
    if i >= str1.length
        return not-found

As an optimization, you can skip ahead to the next look-ahead character:

look-ahead = { str2[cursor] or { X | X in str3 }}
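The ordered case above can be sketched in Java roughly as follows. This is an illustration, not the answerer's code: the names OrderedScan and findOrdered are hypothetical, indices are 0-based (the pseudocode is 1-based), str2 is assumed non-empty, and the look-ahead optimization is left out. Like the pseudocode, it reports the first match it completes, which is not necessarily the shortest one.

```java
class OrderedScan {
    // Returns {start, end} (both inclusive) of the first stretch of str1 in
    // which the characters of str2 appear in order with no str3 character
    // inside the stretch, or null if there is none.
    static int[] findOrdered(String str1, String str2, String str3) {
        int cursor = 0;   // next position in str2 that still has to be matched
        int start = -1;   // index in str1 where the current attempt began
        for (int i = 0; i < str1.length(); i++) {
            char ch = str1.charAt(i);
            if (str3.indexOf(ch) >= 0) {
                cursor = 0;               // forbidden character: start over
                start = -1;
            } else if (ch == str2.charAt(cursor)) {
                if (cursor == 0) start = i;
                cursor++;
                if (cursor == str2.length()) {
                    return new int[] {start, i};
                }
            }
        }
        return null;      // reached the end of str1 without a full match
    }

    public static void main(String[] args) {
        int[] r = findOrdered("spqrstrupvqw", "stp", "q");
        System.out.println(r == null ? "not found" : r[0] + ":" + r[1]);
    }
}
```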

If str2 is not ordered:

i = 1 .. str1.length
lookup = { X | X in str2 }

The solution must have the form:

str2[x] X X .. X X str2[x]

So, to check for that substring we use a lookup list built from str2, but we also have the constraint of avoiding str3 characters, so we have:

if str3.contain(str1[i])
    lookup = { X | X in str2 }
else
    if lookup.contain(str1[i])
        lookup.remove(str1[i])

The goal check is:

if lookup is empty
    return solution
else
    if i >= str1.length
        return not-found

As an optimization, you can skip ahead to the next look-ahead character:

look-ahead = {{ X | X in lookup } or { X | X in str3 }}

Code

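import java.util.ArrayList;
import java.util.List;

class Solution
{
    // Copies the characters of str into a mutable list.
    private static List<Character> getCharList (String str)
    {
        List<Character> list = new ArrayList<>();
        for (char ch : str.toCharArray())
        {
            list.add(ch);
        }
        return list;
    }

    // Reports the first window that satisfies the constraints;
    // it does not go on to search for a shorter one.
    private static void findFirst (String a, String b, String c)
    {
        int cursor = 0;
        int start = -1;
        int end = -1;

        List<Character> stream = getCharList(a);
        List<Character> lookup = getCharList(b);
        List<Character> avoid = getCharList(c);

        for(Character ch : stream)
        {
            if (avoid.contains(ch))
            {
                // forbidden character: reset the lookup list and the window
                lookup = getCharList(b);
                start = -1;
                end = -1;
            }
            else
            {
                if (lookup.contains(ch))
                {
                    // ch is a Character, so this calls remove(Object)
                    lookup.remove(ch);

                    if (start == -1) start = cursor;

                    end = cursor;
                }
            }

            if (lookup.isEmpty())
                break;

            cursor++;
        }

        if (lookup.isEmpty())
        {
            System.out.println(" found at ("+start+":"+end+") ");
        }
        else
        {
            System.out.println(" not found ");
        }
    }

    public static void main (String[] args)
    {
        findFirst("spqrstrupvqw", "sprt", "q");
    }
}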

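Answer 1 (score: 1)

Here is working Java code, tested on various test cases.

The algorithm basically uses a sliding window to examine the different windows in which the answer may lie. Each character of str1 is analyzed at most twice, so the running time is linear, i.e. O(N) in the lengths of the three strings. That is asymptotically the best you can do for this problem.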

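import java.util.HashMap;
import java.util.HashSet;

// The declarations and methods below are members of one enclosing class.
String str1 = "spqrstrupvqw";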
String str2 = "sprt";
String str3 = "q";
char[] arr = str1.toCharArray();
HashSet<Character> take = new HashSet<Character>();
HashSet<Character> notTake = new HashSet<Character>();
HashMap<Character, Integer> map = new HashMap<Character, Integer>();

void run()throws java.lang.Exception{
    System.out.println(str1 + " " + str2 + " " + str3);

    //Add chars of str2 to a set to check if a char has to be taken in O(1)time.
    for(int i=0; i<str2.length(); i++){
        take.add(str2.charAt(i));
    }

    //Add chars of str3 to a set to check if a char shouldn't be taken in O(1) time.
    for(int i=0; i<str3.length(); i++){
        notTake.add(str3.charAt(i));
    }

    int last = -1;
    int bestStart = -1;
    int bestLength = arr.length+1;

    // The window will be from [last....next]

    for(int next=last+1; next<arr.length; next++){
        if(notTake.contains(arr[next])){ 
            last = initLast(next+1); //reinitialize the window's start. 
            next = last;
        }else if(take.contains(arr[next])){
            // take this character in the window and update count in map.
            if(last == -1){
                last = next;
                map.put(arr[last], 1);
            }else{
                if(!map.containsKey(arr[next])) map.put(arr[next], 1);
                else          map.put(arr[next], map.get(arr[next])+1);
            }
        }

        if(last >= arr.length){ // If window is invalid
            break;
        }

       if(last==-1){
            continue;
        }

        //shorten window by removing chars from start that are already present.
        while(last <= next){
            char begin = arr[last];

            // character is not needed in the window, ie not in set "take"
            if(!map.containsKey(begin)){
                last++;
                continue;
            }

            // if this character already occurs in a later part of the window
            if(map.get(begin) > 1){
                last++;
                map.put(begin, map.get(begin)-1);
            }else{
                break;
            }
        }

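        // if all chars of str2 are in window and no char of str3 in window,
        // then update bestAnswer. Compare against take.size() (distinct chars)
        // so that repeated characters in str2 do not prevent a match.
        if(map.size() == take.size()){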
            int curLength = next - last + 1;
            if(curLength < bestLength){
                bestLength = curLength;
                bestStart  = last;
            }
        }
    }

    if(bestStart==-1){
        System.out.println("there is no such window");
    }else{
        System.out.println("the window is from " + bestStart + " to " + (bestStart + bestLength-1));
        System.out.println("window " + str1.substring(bestStart, bestStart+bestLength));
    }

}

// Returns the first position in arr starting from index 'fromIndex'
// such that the character at that position is in str2.
int initLast(int fromIndex){

    // clear previous mappings as we are starting a new window
    map.clear();
    for(int last=fromIndex; last<arr.length; last++){
        if(take.contains(arr[last])){
            map.put(arr[last], 1);
            return last;
        }
    }
    return arr.length;
}
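The same sliding-window idea can also be packaged as a method that returns the window instead of printing it, which is easier to test. The version below is a sketch under assumptions rather than a rewrite of the code above: SlidingWindow and minWindow are hypothetical names, "all characters of str2" is read as every distinct character, and the empty string is returned when no window exists.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SlidingWindow {
    static String minWindow(String str1, String str2, String str3) {
        Set<Character> need = new HashSet<>();
        for (char ch : str2.toCharArray()) need.add(ch);
        Set<Character> avoid = new HashSet<>();
        for (char ch : str3.toCharArray()) avoid.add(ch);

        // count of each needed character inside the window [left, right]
        Map<Character, Integer> count = new HashMap<>();
        int left = 0, bestStart = -1, bestLen = Integer.MAX_VALUE;

        for (int right = 0; right < str1.length(); right++) {
            char ch = str1.charAt(right);
            if (avoid.contains(ch)) {        // forbidden: restart just after it
                count.clear();
                left = right + 1;
                continue;
            }
            if (need.contains(ch)) count.merge(ch, 1, Integer::sum);

            // drop leading chars that are unneeded or occur again later in the window
            while (left < right) {
                char first = str1.charAt(left);
                Integer c = count.get(first);
                if (c == null) {
                    left++;
                } else if (c > 1) {
                    count.put(first, c - 1);
                    left++;
                } else {
                    break;
                }
            }

            if (count.size() == need.size() && right - left + 1 < bestLen) {
                bestLen = right - left + 1;
                bestStart = left;
            }
        }
        return bestStart < 0 ? "" : str1.substring(bestStart, bestStart + bestLen);
    }

    public static void main(String[] args) {
        System.out.println(minWindow("spqrstrupvqw", "sprt", "q")); // strup
    }
}
```

Both indices only ever move forward, so each character of str1 is visited at most once by right and once by left.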

Also, your code fails on many trivial test cases. One of them is str1 = "abc", str2 = "ab", str3 = "c".
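From reading the question's code (not from the hidden test cases), the splitting step looks like a likely culprit: begin starts at -1, so the first window begins at index -1; the stretch after the last str3 character is never added; and because Window compares by length only, the TreeSet silently drops any window whose length equals one already stored. A minimal sketch of a splitting step that avoids these three issues, with the hypothetical names Segments and cleanSegments:

```java
import java.util.ArrayList;
import java.util.List;

class Segments {
    // Splits str1 into maximal half-open ranges [start, end) that contain
    // no character of str3. Empty ranges are skipped.
    static List<int[]> cleanSegments(String str1, String str3) {
        List<int[]> segments = new ArrayList<>();   // a list keeps equal-length ranges
        int begin = 0;                              // start at 0, not -1
        for (int i = 0; i < str1.length(); i++) {
            if (str3.indexOf(str1.charAt(i)) >= 0) {
                if (i > begin) segments.add(new int[] {begin, i});
                begin = i + 1;
            }
        }
        if (begin < str1.length()) {                // trailing range after the last str3 char
            segments.add(new int[] {begin, str1.length()});
        }
        return segments;
    }

    public static void main(String[] args) {
        for (int[] s : cleanSegments("spqrstrupvqw", "q")) {
            System.out.println("[" + s[0] + ", " + s[1] + ")");
        }
    }
}
```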

PS. If you find this code hard to follow, first try reading this easier post, which deals with a problem very similar to the one asked here.

Answer 2 (score: 1)

How about using a regular expression?

String regex = ".*((?=[^q]*s)(?=[^q]*p)(?=[^q]*r)(?=[^q]*t)[sprt][^q]+([sprt])(?<!ss|pp|rr|tt))";

Matcher m = Pattern.compile(regex).matcher("spqrstrupvqw");

while (m.find()) {
    System.out.println(m.group(1));
}

This prints:

strup

This can also be wrapped in a method that builds the regular expression dynamically for variable input:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchString {
    public static void main(String[] args) {
        System.out.println(getMinimalSubstrings("spqrstrupvqw", "sprt", "q"));
        System.out.println(getMinimalSubstrings("A question should go inside quotations.", "qtu", "op"));
        System.out.println(getMinimalSubstrings("agfbciuybfac", "abc", "xy"));
    }

    private static List<String> getMinimalSubstrings(String input, String mandatoryChars, String exceptChars) {
        List<String> list = new ArrayList<String>();
        String regex = buildRegEx(mandatoryChars, exceptChars);

        Matcher m = Pattern.compile(regex).matcher(input);

        while (m.find()) {
            list.add(m.group(1));
        }

        return list;
    }

    private static String buildRegEx(String mandatoryChars, String exceptChars) {
        char[] mandChars = mandatoryChars.toCharArray();
        StringBuilder regex = new StringBuilder("[^").append(exceptChars).append("]*(");

        for (char c : mandChars) {
            regex.append("(?=[^").append(exceptChars).append("]*").append(c).append(")");
        }

        regex.append("[").append(mandatoryChars).append("][^").append(exceptChars).append("]+([").append(mandatoryChars).append("])(?<!");

        for (int i = 0; i < mandChars.length; i++) {
            if (i > 0) {
                regex.append("|");
            }

            regex.append(mandChars[i]).append(mandChars[i]);
        }

        regex.append("))");

        return regex.toString();
    }
}

This prints:

[strup]
[quest]
[agfbc, bfac]