Checking for the presence of a character set in a string - improvement

Time: 2019-06-24 11:40:24

Tags: java string algorithm

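Two English words are similar if they contain only the same letters. For example, food and good are not similar, but dog and good are similar. (If A is similar to B, then all the letters in A are contained in B, and all the letters in B are contained in A.)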

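Given a word W and a list of words L, find all the words in L that are similar to W. Print the number of such words to standard output.

Example:

Input (stdin):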

love
velo low vole lovee volvell lowly lower lover levo loved love lovee lowe lowes lovey lowan lowa evolve loves volvelle lowed love

Output (stdout):

14

Explanation:

The words in L that are similar to love are: velo vole lovee volvell lover levo loved love lovee lovey evolve loves volvelle love

That makes 14.

So my current solution is as follows:

 public static void main(String[] args) {
    String[] arr = new String[]{"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
            "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
    String s = "love";
    int result = 0;

    Pattern p = Pattern.compile(buildPattern(s));

    for (String val : arr) {
        if (p.matcher(val).find()) result++;
    }

    System.out.println(result);
}

private static String buildPattern(String s) {
    String pattern = "^";
    for (int i = 0; i < s.length(); i++) {
        pattern += "(?=.*" + s.charAt(i) + ")";
    }
    return pattern;
}

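I would like to know whether my simple code can be improved in any way.

Would Aho-Corasick be an applicable solution?

7 answers:

Answer 0: (score: 4)

Since there are only 26 letters and an int has 32 bits, an int is large enough to hold all the information about which letters occur in a word: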

static int getFingerprint(String s)
{
    int result=0;
    for (int i = s.length()-1; i>=0; --i) {
        char c = s.charAt(i);
        if (c>='a' && c<='z')
            result |= 1<<(int)(c-'a');
        else if (c>='A' && c<='Z')
            result |= 1<<(int)(c-'A');
    }
    return result;
}

public static void main(String[] args) {
    String[] arr = new String[]{"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
        "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
    String s = "love";

    int fingerprint = getFingerprint(s);

    int matches = 0;
    for (String item : arr) {
        if (getFingerprint(item)==fingerprint)
            ++matches;
    }
    System.out.println(matches);
}
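
The task statement reads W and L from stdin, while the code above hard-codes them. A minimal sketch of the same fingerprint idea wired to standard input follows; the class name SimilarWordsStdin and the helper countSimilar are hypothetical names chosen for illustration, and the sketch assumes W is on the first line and the words of L are whitespace-separated on the second. With the sample input it prints 10, since only words with exactly the letters l, o, v, e are counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class SimilarWordsStdin {
    // Bit i is set when the letter 'a' + i occurs in s (either case);
    // all other characters are ignored.
    static int fingerprint(String s) {
        int bits = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'a' && c <= 'z') {
                bits |= 1 << (c - 'a');
            } else if (c >= 'A' && c <= 'Z') {
                bits |= 1 << (c - 'A');
            }
        }
        return bits;
    }

    // Counts the whitespace-separated words in line whose letter set equals that of w.
    static int countSimilar(String w, String line) {
        int target = fingerprint(w);
        int count = 0;
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty() && fingerprint(word) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String w = in.readLine();     // assumed: first line holds W
        String line = in.readLine();  // assumed: second line holds the list L
        if (w == null || line == null) {
            System.out.println(0);
            return;
        }
        System.out.println(countSimilar(w.trim(), line));
    }
}
```

The fingerprint of W is computed once, so each candidate word costs one pass over its characters plus a single int comparison.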

Answer 1: (score: 0)

I would suggest simplifying the regular expression: no lookahead is needed, a simple "^[love]*$" should do the trick.
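
A rough sketch of how that pattern might be applied is shown here; the class name CharClassDemo and the method countMatches are hypothetical, and the code assumes W consists only of lowercase ASCII letters so that it can be placed inside a character class without escaping. Note that this pattern only checks that a word uses no letters outside W: it would also accept a word such as "lol" that lacks some of W's letters, so it is not a full test of the two-way definition.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Counts the words made up only of letters that occur in w.
    // Assumption: w contains only lowercase ASCII letters, so it is safe inside [...].
    static int countMatches(String w, String[] words) {
        Pattern p = Pattern.compile("^[" + w + "]*$");
        int count = 0;
        for (String word : words) {
            if (p.matcher(word).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] arr = {"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
                "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
        System.out.println(countMatches("love", arr)); // prints 10
    }
}
```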

Answer 2: (score: 0)

I would try to avoid regular expressions and check the letters myself instead.

public static void main(String[] args)
{
  String[] arr = new String[]{"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
          "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
  String s = "love";
  int result = 0;

  for (String word : arr)
  {
    if (isSimilar(s, word))
    {
      result++;
    }
  }

  System.out.println(result);
}

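// Two words are similar when each one's letters all occur in the other.
private static boolean isSimilar(String word, String test)
{
  return containsAll(word, test) && containsAll(test, word);
}

// True if every character of letters occurs somewhere in container.
private static boolean containsAll(String container, String letters)
{
  for (char c : letters.toCharArray())
  {
    if (container.indexOf(c) == -1)
    {
      return false;
    }
  }
  return true;
}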

Note, though, that my example above currently returns only 10.
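
One way to see where 10 and 14 come from is to count the three possible relations side by side. The sketch below is only an illustration (the class LetterSetCounts and its methods are hypothetical names, and it assumes every word is lowercase a-z): the lookahead pattern in the question only requires that each letter of W occurs in the candidate, which is the "superset" case, while the definition in the task asks for both directions.

```java
public class LetterSetCounts {
    // Bit i is set when the letter 'a' + i occurs in s.
    // Assumption: s contains only lowercase a-z.
    static int mask(String s) {
        int bits = 0;
        for (char c : s.toCharArray()) {
            bits |= 1 << (c - 'a');
        }
        return bits;
    }

    // Returns {subset, superset, equal} counts of the words relative to w.
    static int[] counts(String w, String[] words) {
        int target = mask(w);
        int subset = 0, superset = 0, equal = 0;
        for (String word : words) {
            int m = mask(word);
            if ((m & ~target) == 0) subset++;    // every letter of word occurs in w
            if ((target & ~m) == 0) superset++;  // every letter of w occurs in word
            if (m == target) equal++;            // both directions hold
        }
        return new int[]{subset, superset, equal};
    }

    public static void main(String[] args) {
        String[] arr = {"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
                "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
        int[] c = counts("love", arr);
        // prints: subset=10 superset=14 equal=10
        System.out.println("subset=" + c[0] + " superset=" + c[1] + " equal=" + c[2]);
    }
}
```

The four extra words in the superset count are lover, loved, lovey and loves, each of which contains l, o, v, e plus one more letter.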

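Answer 3: (score: 0)

Both when implementing this and when checking by hand, I only count 10 words that should match.

It is as simple as comparing whether the sets of letters in the two words are equal: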

public static void main(String... args)
{
    String word = "love";
    List<String> strs = Arrays.asList(
        "velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
        "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"
    );

    System.out.println(
        strs.stream()
           .filter(str -> chars(word).equals(chars(str)))
           .count()
    );
}

private static Set<Character> chars(String word)
{
    return word.chars()
        .mapToObj(ch -> (char) ch)
        .collect(Collectors.toSet());
}

Answer 4: (score: 0)

public static void main(String[] args) {
    String[] arr = new String[]{"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
            "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
    String s = "love";

    Set<Character> searchWordCharacters = getDistinctCharacters(s);
    long result = Stream.of(arr)
            .map(Scratch::getDistinctCharacters)
            .filter(wordCharacters -> wordCharacters.size() == searchWordCharacters.size())
            .filter(wordCharacters -> wordCharacters.containsAll(searchWordCharacters))
            .peek(System.out::println)
            .count();
    System.out.println(result);
}

private static Set<Character> getDistinctCharacters(String word) {
    return word.chars()
            .mapToObj(i -> (char) i)
            .collect(Collectors.toSet());
}

Result: 10

Answer 5: (score: 0)

The count should come out to 10!

Answer 6: (score: 0)

import java.util.Arrays;

class SomeClass {
    public static void main(String[] args) {
        String[] arr = new String[]{"velo", "low", "vole", "lovee", "volvell", "lowly", "lower", "lover", "levo", "loved", "love",
                "lovee", "lowe", "lowes", "lovey", "lowan", "lowa", "evolve", "loves", "volvelle", "lowed", "love"};
        String s = "love";
        int count = 0;

        boolean[] characters_state = new boolean[26];
        Arrays.fill(characters_state, false);
        for(int i = 0; i < s.length(); i++) {
            characters_state[s.charAt(i) - 'a'] = true;
        }

        for (String word : arr) {
            // check() only reads the table, so no defensive copy is needed
            if (check(word, characters_state)) {
                count++;
            }
        }
        System.out.println(count);
    }

    static boolean check(String s, boolean[] characters_state) {
        for(int i = 0; i < s.length(); i++) {
            if(!characters_state[s.charAt(i) - 'a']) {
                return false;
            }
        }
        return true;
    }
}

Output

10

real    0m0,210s
user    0m0,206s
sys 0m0,025s