What is the most efficient way to check for unique characters?

Time: 2015-11-02 00:24:24

Tags: java arrays string

Can someone help me check for unique characters with only minor changes to my existing code? Any reply would be greatly appreciated.

public class UniqueStringCheck {
    public boolean checkUniqueString(String inputString){
        String parseString = inputString.replaceAll("[^A-Za-z0-9]","");
        int StringLength = parseString.length();

        char[] sequenceOfString= parseString.toCharArray();
        if(StringLength>0){
            if(sequenceOfString.equals("[^A-Za-z0-9]")){
                System.out.println("No unique char!");
                return false;
            }else{
                System.out.println("Contains Unique char");
                return true;
            }
        }
        return true;
    }
}

1 answer:

Answer 0 (score: 1)

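If I am interpreting the question correctly (a string made up of unique characters), there are many ways to solve this. Here are four possible solutions of varying efficiency. Modify as needed.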

import java.util.HashSet;
import java.util.Set;

public class IsUnique {

    /*With use of additional data structures.
      Time complexity can be argued to be O(1), because the for-loop never runs
      longer than the size of the character set (128 for ASCII, 256 for extended
      ASCII). Otherwise, time complexity is O(n), where n is the length of the
      input string.
      Space complexity is O(1) under the same bounded-character-set argument.
     */
    public boolean isUnique(String s) {
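        //Assumes extended ASCII (256 values); standard ASCII would be 128.
        //A Java char is a UTF-16 code unit with 65,536 possible values, so
        //raise this bound if the input may contain arbitrary Unicode text.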
        if (s.length() > 256) {
            return false;
        }
        Set<Character> letterSet = new HashSet<Character>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!letterSet.contains(c)) {
                letterSet.add(c);
            } else {
                return false;
            }
        }
        return true;
    }

    /*Without the use of additional data structures
      Uses int as a bit mask.
      Assumption: letters a-z, lowercase only
      Time complexity, again, can be argued to be O(1) for the same reasons.
      If that argument is not accepted, O(n), where n is the size of the input
      string.
      Space Complexity: O(1), but it uses a smaller amount of space than the
      previous solution.
     */
    public boolean isUniqueWithoutSet(String s) {
        int check = 0;
        if (s.length() > 26) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            int val = s.charAt(i) - 'a';
            val = 1 << val;
            if ((check & val) != 0) {
                return false;
            }
            check |= val;
        }
        return true;
    }

    /*Without the use of additional data structures, apart from a copy of the
      string's characters.
      Sorts that copy of the character array, iterates through it and
      compares adjacent characters.
      Assumption: letters a-z, lowercase only (hence the length check of 26).
      Time complexity: O(n log n). Arguably, O(n^2) if the Java quick-sort hits
      its worst-case time (highly unlikely).
      Space complexity: Arguably, O(1), because the largest the array will ever be
      is the size of the character set. Otherwise, O(n), where n is the size of
      the input string.
     */
    public boolean badSolution(String s) {
        if (s.length() > 26) {
            return false;
        }
        char[] charArray = s.toCharArray();
        java.util.Arrays.sort(charArray);

        for (int i = 0, j = i + 1; i < charArray.length - 1; i++, j++) {
            if (charArray[i] == charArray[j]) {
                return false;
            }
        }
        return true;
    }

    /*This solution is terri-bad, but it works. Does not require additional data
      structures.
      Time complexity: O(n^2)
      Space complexity: O(1)/O(n).
     */
    public boolean worseSolution(String s) {
        if (s.length() > 256) {
            return false;
        }

        for (int i = 0; i < s.length() - 1; i++) {
            for (int j = i + 1; j < s.length(); j++) {
                if (s.charAt(i) == s.charAt(j)) {
                    return false;
                }
            }
        }
        return true;
    }

}

So, let's break these methods down.

The first method uses a Set to check whether all the characters in the string are unique.
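As a minimal sketch of the same idea, the membership check and the insert can be folded into one call, since `Set.add` returns `false` when the element is already present. The names `UniqueWithSet` and `allUnique` are hypothetical, and this variant drops the 256-character shortcut, so it makes no assumption about the character set.

```java
import java.util.HashSet;
import java.util.Set;

class UniqueWithSet {
    // Returns true when no char (UTF-16 code unit) occurs more than once.
    static boolean allUnique(String s) {
        Set<Character> seen = new HashSet<>();
        for (int i = 0; i < s.length(); i++) {
            // add() returns false if the set already contained this char.
            if (!seen.add(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Here `UniqueWithSet.allUnique("abc")` returns true and `UniqueWithSet.allUnique("hello")` returns false.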

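The second method is my personal favorite. It uses a bit mask to ensure uniqueness. I believe its runtime is roughly 8 times better than the first method's, but both still have the same complexity, and for most applications the difference is negligible.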
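The bit-mask idea can be stretched to cover the character class the question filters on (`[A-Za-z0-9]`). That class has 62 members, which fit in a 64-bit `long`. The following is only a sketch under that assumption. The names `UniqueWithMask`, `bitIndex`, and `allUnique` are hypothetical, and characters outside the class are skipped, not rejected.

```java
class UniqueWithMask {
    // Maps 'a'-'z' to 0-25, 'A'-'Z' to 26-51, '0'-'9' to 52-61; -1 otherwise.
    static int bitIndex(char c) {
        if (c >= 'a' && c <= 'z') return c - 'a';
        if (c >= 'A' && c <= 'Z') return 26 + (c - 'A');
        if (c >= '0' && c <= '9') return 52 + (c - '0');
        return -1;
    }

    // Returns true when no letter or digit repeats; other characters are skipped.
    static boolean allUnique(String s) {
        long seen = 0L;
        for (int i = 0; i < s.length(); i++) {
            int index = bitIndex(s.charAt(i));
            if (index < 0) {
                continue; // not alphanumeric, ignored like the question's replaceAll
            }
            long bit = 1L << index;
            if ((seen & bit) != 0) {
                return false; // this bit was already set, so the character repeats
            }
            seen |= bit;
        }
        return true;
    }
}
```

For example, `UniqueWithMask.allUnique("a-b-c")` returns true because the hyphens are skipped, while `UniqueWithMask.allUnique("hello")` returns false on the second `l`. The test uses `!= 0` instead of `> 0` so that it stays correct if the highest bit of the mask is ever used.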

The third method sorts the characters and compares each character with its neighbor.

The last method is a nested-loop search over the characters. Horrible, but it does work.
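To illustrate why the nested loop scales poorly, this sketch counts the comparisons it makes. `PairwiseCheck` and `comparisons` are hypothetical names used only for illustration, and the loop structure mirrors `worseSolution` without its length shortcut.

```java
class PairwiseCheck {
    // Counts the character comparisons the nested loops make before returning.
    static long comparisons(String s) {
        long count = 0;
        for (int i = 0; i < s.length() - 1; i++) {
            for (int j = i + 1; j < s.length(); j++) {
                count++;
                if (s.charAt(i) == s.charAt(j)) {
                    return count; // duplicate found, the search stops early
                }
            }
        }
        return count; // all unique: n * (n - 1) / 2 comparisons
    }
}
```

A string of n distinct characters needs n(n-1)/2 comparisons, so the 26 lowercase letters take 325 comparisons, against 26 loop iterations for the Set or bit-mask versions.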