Question

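I am trying to find out whether it is possible to get the index of the first character in one string that also occurs in another string. For example: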

String test = "test";
String other = "123er";
int value = get(test, other);
// method would return 1, as the first matching character in
// 123er, e, is at index 1 of test

I am trying to accomplish this with parallel streams. I know I can check whether there is a matching character fairly simply, like this:

test.chars().parallel().anyMatch(c -> other.indexOf(c) >= 0);

How can I use this to find the exact index?

Answer 1

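If you really care about performance, you should try to avoid the O(n × m) time complexity of iterating over one string for every character of the other. So, first iterate over one string to build a data structure that supports efficient (O(1)) lookup, then iterate over the other string using that structure.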

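// Record every character of test for O(1) lookup
BitSet encountered = new BitSet();
test.chars().forEach(encountered::set);
// Find the first position in other whose character occurs in test
int index = IntStream.range(0, other.length())
    .filter(ix -> encountered.get(other.charAt(ix)))
    .findFirst().orElse(-1);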

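If the strings are large enough, the O(n + m) time complexity of this solution will translate into much shorter execution times. For smaller strings, it is irrelevant anyway.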
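
As a minimal, self-contained sketch of the same lookup-set idea, assuming the goal is the index within `test` as the question describes: here the set is built from `other` and `test` is scanned. The class name `FirstMatchSketch` and the helper `firstIndexIn` are hypothetical names chosen for illustration, not part of the answer.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

class FirstMatchSketch {
    // Hypothetical helper: index in `test` of the first character
    // that also occurs in `other`, or -1 if there is none.
    static int firstIndexIn(String test, String other) {
        BitSet seen = new BitSet();
        other.chars().forEach(seen::set);               // one pass over other: O(m)
        return IntStream.range(0, test.length())        // one pass over test: O(n)
                .filter(i -> seen.get(test.charAt(i)))  // O(1) lookup per character
                .findFirst()
                .orElse(-1);
    }

    public static void main(String[] args) {
        System.out.println(firstIndexIn("test", "123er")); // prints 1 ('e' is at index 1 of "test")
        System.out.println(firstIndexIn("abc", "xyz"));    // prints -1 (no common character)
    }
}
```

Which string is turned into the set decides which string the returned index refers to, so it is worth checking that against the result you actually need.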

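If you really think the strings are large enough to benefit from parallel processing (which is very unlikely), you can perform both operations in parallel with only small adaptations: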

BitSet encountered = CharBuffer.wrap(test).chars().parallel()
    .collect(BitSet::new, BitSet::set, BitSet::or);
int index = IntStream.range(0, other.length()).parallel()
    .filter(ix -> encountered.get(other.charAt(ix)))
    .findFirst().orElse(-1);

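The first operation now uses the slightly more complicated, parallel-compatible collect, and it contains a not-so-obvious change to how the Stream is created.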

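The problem is described in bug report JDK-8071477. Simply put, the stream returned by String.chars() has poor splitting capabilities and hence poor parallel performance. The code above wraps the string in a CharBuffer, whose chars() method returns a different implementation with the same semantics but good parallel performance. This workaround should become obsolete with Java 9.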

Alternatively, you could use IntStream.range(0, test.length()).map(test::charAt) to create a stream with good parallel performance. The second operation already works that way.
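
A minimal sketch of that alternative, assuming the same orientation as the code above (the set is built from `test` and the returned index refers to `other`). The class `ParallelLookupSketch` and its method names are hypothetical, used only to make the example self-contained.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

class ParallelLookupSketch {
    // Collects the characters of s into a BitSet using an index-based
    // stream, which splits evenly for parallel processing.
    static BitSet charsOf(String s) {
        return IntStream.range(0, s.length()).parallel()
                .map(s::charAt)
                .collect(BitSet::new, BitSet::set, BitSet::or);
    }

    // First index in `other` whose character occurs in `test`, or -1.
    static int firstIndex(String test, String other) {
        BitSet encountered = charsOf(test);
        return IntStream.range(0, other.length()).parallel()
                .filter(ix -> encountered.get(other.charAt(ix)))
                .findFirst()   // respects encounter order even in parallel
                .orElse(-1);
    }

    public static void main(String[] args) {
        System.out.println(firstIndex("test", "123er")); // prints 3 ('e' is at index 3 of "123er")
    }
}
```

Each worker thread fills its own BitSet and the partial sets are merged with `or`, so the shared lookup set is only read, never written, during the second pass.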

But, as said, for this specific task you are unlikely to ever encounter strings large enough to make parallel processing beneficial.

Answer 2

You can do this by relying on String#indexOf(int ch), keeping only the values >= 0 to drop characters that do not occur, and then taking the first value.

// Get the index of each character of test in other
// Keep only the non-negative values
// Then return the first match
// Or -1 if we have no match
int result = test.chars()
    .parallel()
    .map(other::indexOf)
    .filter(i -> i >= 0)
    .findFirst()
    .orElse(-1);
System.out.println(result);

Output:

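NB 1: The result is 1 and not 2 because indexes start at 0, not 1.

NB 2: Unless you have a very long String, using a parallel Stream here will not help much in terms of performance. The tasks are not complex, and creating, starting and synchronizing threads is very costly, so you will probably get your result more slowly than with a normal stream.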

Answer 3

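Building on Nicolas' answer: the min() method forces consumption of the whole Stream. In such cases it is better to use findFirst(), which stops the whole execution after finding the first matching element rather than the minimum: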

test.chars().parallel()
  .map(other::indexOf)
  .filter(i -> i >= 0)
  .findFirst()
  .ifPresent(System.out::println);
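
One point worth checking before swapping one terminal operation for the other: `findFirst()` and `min()` can return different values on this pipeline, because the mapped values are positions in `other` while the encounter order follows `test`. The sketch below illustrates this with a hypothetical class `FindFirstVersusMin` and the hypothetical input "re", neither of which comes from the answers.

```java
import java.util.stream.IntStream;

class FindFirstVersusMin {
    // For each character of `test`, its position in `other`;
    // characters that do not occur in `other` are dropped.
    static IntStream matches(String test, String other) {
        return test.chars().parallel()
                .map(other::indexOf)
                .filter(i -> i >= 0);
    }

    public static void main(String[] args) {
        // 'r' is at index 4 of "123er" and 'e' is at index 3.
        System.out.println(matches("re", "123er").findFirst().orElse(-1)); // prints 4
        System.out.println(matches("re", "123er").min().orElse(-1));       // prints 3
    }
}
```

With the question's strings ("test" and "123er") only one character matches, so both operations give the same value there.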
