Joining strings with limit

Date: 2016-03-04 18:02:30

Tags: java java-8 java-stream

Using only the standard Java library, what is a simple mechanism to join strings up to a limit, and append an ellipsis when the limit results in a shorter string?

Efficiency is desirable. Joining all the strings and then using String.substring() may consume excessive memory and time. A mechanism that can be used within a Java 8 stream pipeline is preferable, so that the strings past the limit might never even be created.

For my purposes, I would be happy with a limit expressed in either:

  • Maximum number of strings to join
  • Maximum number of characters in result, including any separator characters.

For example, this is one way to enforce a maximum number of joined strings in Java 8 with the standard library. Is there a simpler approach?

final int LIMIT = 8;

Set<String> mySet = ...;
String s = mySet.stream().limit( LIMIT ).collect( Collectors.joining(", "));
if ( LIMIT < mySet.size()) {
    s += ", ...";
}

3 answers:

Answer 0 (score: 8)

You can write a custom collector for this. This one is based on another I wrote for a similar case:

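import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

private static Collector<String, List<String>, String> limitingJoin(String delimiter, int limit, String ellipsis) {
    return Collector.of(
                ArrayList::new,
                // Accumulator: keep the first `limit` elements, then add the ellipsis exactly once.
                (l, e) -> {
                    if (l.size() < limit) l.add(e);
                    else if (l.size() == limit) l.add(ellipsis);
                },
                // Combiner: top up l1 from the head of l2 without going past the limit.
                (l1, l2) -> {
                    int room = Math.max(0, limit - l1.size());
                    boolean dropped = l2.size() > room; // part of l2 does not fit
                    l1.addAll(l2.subList(0, Math.min(l2.size(), room)));
                    // Add the ellipsis only if something was cut and it is not already there.
                    if (dropped && l1.size() == limit) l1.add(ellipsis);
                    return l1;
                },
                // Finisher: join the retained strings with the delimiter.
                l -> String.join(delimiter, l)
           );
}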

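In this code, we keep an ArrayList<String> of all the strings accepted so far. When an element is accepted, the size of the current list is tested against the limit: if it is strictly less than the limit, the element is added; if it is equal to the limit, the ellipsis is added instead. The same is done in the combiner, which is a bit trickier because we need to handle the sizes of the sublists correctly so as not to go past the limit. Finally, the finisher just joins that list with the given delimiter.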

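This implementation works for parallel Streams. It keeps the head elements of the Stream in encounter order. Note that it consumes all the elements in the Stream, even though no elements are added after the limit has been reached.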

Working example:

List<String> list = Arrays.asList("foo", "bar", "baz");
System.out.println(list.stream().collect(limitingJoin(", ", 2, "..."))); // prints "foo, bar, ..."
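
Since the collector above still consumes every element, one possible way to stop early with only the standard library is to apply limit(limit + 1) before collecting, so that the one surplus element signals truncation. The following is a minimal sketch under that assumption; the class LimitedJoin and the method joinLimited are hypothetical names used for illustration, not part of the collector above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LimitedJoin {
    // Hypothetical helper: joins at most `limit` strings and appends `ellipsis` if more exist.
    static String joinLimited(Stream<String> strings, String delimiter, int limit, String ellipsis) {
        // Pull one element more than the limit; its presence means the input was truncated.
        List<String> head = strings.limit((long) limit + 1)
                                   .collect(Collectors.toCollection(ArrayList::new));
        if (head.size() > limit) {
            head.set(limit, ellipsis); // replace the surplus element with the ellipsis
        }
        return String.join(delimiter, head);
    }

    public static void main(String[] args) {
        // prints "foo, bar, ..."
        System.out.println(joinLimited(Stream.of("foo", "bar", "baz"), ", ", 2, "..."));
        // limit() short-circuits, so an infinite stream is fine here; prints "1, 2, 3, ..."
        System.out.println(joinLimited(
                Stream.iterate(1, i -> i + 1).map(String::valueOf), ", ", 3, "..."));
    }
}
```

The trade-off is that one element beyond the limit is still produced, and on an unordered source such as a HashSet the elements that survive the limit are arbitrary.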

Answer 1 (score: 6)

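While using third-party code is not an option for the asker, it may be acceptable to other readers. Even if you write a custom collector, you still have a problem: the whole input will be processed, because standard collectors cannot short-circuit (in particular, it is impossible to process an infinite stream). My StreamEx library enhances the collector concept, making it possible to create short-circuiting collectors. The Joining collector is also readily provided: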

StreamEx.of(mySet).collect( 
    Joining.with(", ").ellipsis("...").maxChars(100).cutAfterDelimiter() );

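The result is guaranteed not to exceed 100 characters. Different counting strategies can be used: you can limit by chars, by code points, or by graphemes (combining Unicode characters will not be counted). You can also cut the result at any position ("First entry, second en..."), after a word ("First entry, second ..."), after a delimiter ("First entry, ..."), or before a delimiter ("First entry, second entry..."). It also works for parallel streams, though it is probably not very efficient in the ordered case.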

Answer 2 (score: 1)

"Using only the standard Java library"

I don't believe there is anything in there that can do what you ask.

You need to write your own Collector. It won't be that complicated, so I don't see why writing your own would be an issue.
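
For the character-based limit from the question, a Collector is not the only option. As a rough sketch (sequential only, and not something the standard library provides), one could pull elements lazily through the stream's iterator and stop as soon as the budget is used up. The class CharLimitedJoin and its join method are hypothetical names, and the rule that the result, including a trailing delimiter and ellipsis, should stay within maxChars is an assumption made for this example.

```java
import java.util.Iterator;
import java.util.stream.Stream;

final class CharLimitedJoin {
    // Hypothetical helper: joins strings while the result, including separators and
    // a possible trailing delimiter + ellipsis, stays within maxChars.
    static String join(Iterator<String> strings, String delimiter, int maxChars, String ellipsis) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        while (strings.hasNext()) {
            String next = strings.next();
            int separator = first ? 0 : delimiter.length();
            // If more elements follow, keep room for a trailing delimiter and ellipsis.
            int reserve = strings.hasNext() ? delimiter.length() + ellipsis.length() : 0;
            boolean fits = sb.length() + separator + next.length() + reserve <= maxChars;
            if (!first) {
                sb.append(delimiter);
            }
            sb.append(fits ? next : ellipsis);
            if (!fits) {
                break; // stop pulling elements once the result has been cut
            }
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "foo, ..."
        System.out.println(join(Stream.of("foo", "bar", "baz").iterator(), ", ", 12, "..."));
        // Works on an infinite stream because elements are pulled lazily; prints "1, 2, ..."
        System.out.println(join(
                Stream.iterate(1, i -> i + 1).map(String::valueOf).iterator(), ", ", 10, "..."));
    }
}
```

If the very first element does not fit, this sketch returns the ellipsis alone, which can exceed maxChars when maxChars is shorter than the ellipsis itself.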