Question

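I am trying to write a one-liner that counts the unique words in a very long text file. Examples of the unique words are: márya, fédorovna, sraglet-liveried, ... so basically all the non-English words.

My problem is that my code does not filter out enough words. My code: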

    String text = readText("longlongtextfile"); // My own method for readText
    String[] words = text.split(" ");

    System.out.println("Initial word count: " + words.length);

    Stream<String> stream = Arrays.stream(words);
    long uniqueWords = stream.map(String::toLowerCase).distinct().count();

    System.out.println(uniqueWords);

I wanted to apply .filter(i -> i >= 'a' && i <= 'z').distinct().count(), but that does not work on a stream of Strings.

So my question is whether there is a similar a-z filter for a stream of Strings.

Answer 1

To count the words that contain characters other than a-z, you can filter with a regular expression:

    Arrays.stream(tokens).map(String::toLowerCase).filter(t -> !t.matches("[a-z]+")).distinct().count();

The same pipeline applies to the stream from the question, Stream<String> stream = Arrays.stream(words);

To find the number of unique tokens, you need to count how many times each one occurs.
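As a rough sketch of both ideas in one runnable piece: the class name UniqueWords, the two helper methods, and the sample sentence are made up for illustration and are not from the question. It also assumes that "unique" in the second case means "occurs exactly once after lower-casing", which is only one possible reading.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class UniqueWords {

    // Distinct lower-cased tokens that contain at least one character outside a-z.
    static long countDistinctNonAz(String[] tokens) {
        return Arrays.stream(tokens)
                .map(String::toLowerCase)
                // matches() tests the whole token, so this keeps anything that is not purely a-z
                .filter(t -> !t.matches("[a-z]+"))
                .distinct()
                .count();
    }

    // Lower-cased tokens that occur exactly once (assumed meaning of "unique").
    static long countTokensOccurringOnce(String[] tokens) {
        Map<String, Long> occurrences = Arrays.stream(tokens)
                .map(String::toLowerCase)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return occurrences.values().stream().filter(n -> n == 1).count();
    }

    public static void main(String[] args) {
        // Hypothetical sample text standing in for the long file from the question.
        String text = "Márya Fédorovna met the sraglet-liveried footman and the márya";
        String[] tokens = text.split("\\s+"); // split on any run of whitespace

        System.out.println(countDistinctNonAz(tokens));
        System.out.println(countTokensOccurringOnce(tokens));
    }
}
```

Note that the sketch splits on "\\s+" rather than " ", so line breaks and repeated spaces in the file do not produce glued-together or empty tokens. Punctuation attached to a word (for example "word,") still counts as a non a-z token with this filter, so you may want to strip it first.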
