Counting items by substring in a Multiset in Java

Time: 2018-05-02 01:28:29

Tags: java-stream

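I have the following data in a Guava Multiset. Each item is a composite string made of 3 parts separated by ':'. I know all the possible values for each slot. I use these values to generate a data file for an interactive graph (by populating an object with the split values and then printing the object with Gson).

What is the best way to get the cumulative counts of all items that match only the one, one:two, or one:two:three substring? I keep going around in circles with streams, forEach, maps, and filters, but I can't seem to write an elegant set of loops. Any suggestions or examples would be helpful.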

Executive:Healthcare:United States x 5

Executive:Healthcare:Malaysia x 2

Executive:Financials:United States x 1

FinancialHealth:Technology:Malaysia x 3

FinancialHealth:Technology:United States x 2

FinancialHealth:Energy:United States x 1

Executive = 8

FinancialHealth = 6

Executive:Healthcare = 7

Executive:Financials = 1

FinancialHealth:Technology = 5

FinancialHealth:Energy = 1

Executive:Healthcare:United States = 5

etc.

1 answer:

Answer 0 (score: 1)

Streams can help a lot here, and it is not even difficult. We need to take three steps in the stream:

// Requires: java.util.Map, java.util.function.Function,
//           java.util.stream.Collectors, java.util.stream.Stream
Map<String, Long> counts = allTheStrings.stream()
        // First, expand each string "A:B:C" with flatMap so that the
        // stream contains "A", "A:B", and "A:B:C":
        .flatMap(s -> Stream.of(s.substring(0, s.indexOf(":")),
                                s.substring(0, s.lastIndexOf(":")),
                                s))
        // Next, group identical strings with a groupingBy collector.
        // On its own, groupingBy would return a Map<String, List<String>>
        // mapping each unique string to its occurrences. Since only the
        // number of occurrences is needed, the downstream collector
        // counting() turns the result into a Map<String, Long>:
        .collect(Collectors.groupingBy(Function.identity(),
                                       Collectors.counting()));
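The flatMap step above relies on every string having exactly three ':'-separated segments. As a minimal sketch of a variant that accepts any number of segments, the following uses a hypothetical helper named `prefixes` (not part of the original answer) and collects into a TreeMap so that the keys come out sorted. The class and method names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PrefixCounts {

    // Hypothetical helper: returns every prefix of s that ends just before
    // a ':' plus s itself, e.g. "A:B:C" -> ["A", "A:B", "A:B:C"].
    static List<String> prefixes(String s) {
        List<String> result = new ArrayList<>();
        for (int i = s.indexOf(':'); i >= 0; i = s.indexOf(':', i + 1)) {
            result.add(s.substring(0, i));
        }
        result.add(s);
        return result;
    }

    // Counts how often each prefix occurs across all items.
    // The TreeMap factory keeps the resulting keys in sorted order.
    static Map<String, Long> countPrefixes(Stream<String> items) {
        return items
                .flatMap(s -> prefixes(s).stream())
                .collect(Collectors.groupingBy(Function.identity(),
                                               TreeMap::new,
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countPrefixes(Stream.of(
                "Executive:Healthcare:United States",
                "Executive:Healthcare:Malaysia",
                "Executive:Financials"));
        counts.forEach((key, count) -> System.out.println(key + "=" + count));
    }
}
```

With this shape, a two-segment string such as "Executive:Financials" contributes to "Executive" and to itself without being counted twice under the same key.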

So, as a complete example of the three-segment pipeline:

Stream.of("Executive:Healthcare:United States", "Executive:Healthcare:United States",
          "Executive:Healthcare:United States", "Executive:Healthcare:United States",
          "Executive:Healthcare:United States", "Executive:Healthcare:Malaysia",
          "Executive:Healthcare:Malaysia", "Executive:Financials:United States",
          "FinancialHealth:Technology:Malaysia", "FinancialHealth:Technology:Malaysia",
          "FinancialHealth:Technology:Malaysia", "FinancialHealth:Technology:United States",
          "FinancialHealth:Technology:United States", "FinancialHealth:Energy:United States")
      .flatMap(s -> Stream.of(s.substring(0, s.indexOf(":")), s.substring(0, s.lastIndexOf(":")), s))
      .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
      .entrySet()
      .forEach(System.out::println);

will output (the order of the entries is not guaranteed and may differ between runs or Java versions):

Executive=8
Executive:Healthcare=7
FinancialHealth:Technology=5
FinancialHealth=6
FinancialHealth:Energy=1
FinancialHealth:Technology:Malaysia=3
FinancialHealth:Energy:United States=1
Executive:Healthcare:United States=5
Executive:Financials:United States=1
FinancialHealth:Technology:United States=2
Executive:Healthcare:Malaysia=2
Executive:Financials=1
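Streaming a Multiset visits each element once per occurrence. Guava's Multiset also exposes `entrySet()`, whose entries provide `getElement()` and `getCount()`, so the counts can be summed instead of re-counted. The sketch below illustrates that idea under stated assumptions: to stay self-contained it uses a plain `Map<String, Integer>` as a stand-in for the Multiset entries, the class and method names are hypothetical, and `Map.entry` requires Java 9 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WeightedPrefixCounts {

    // Sums the per-element counts for every prefix "A", "A:B", "A:B:C".
    // Assumes each key has exactly three ':'-separated segments, as in the question.
    static Map<String, Long> countPrefixes(Map<String, Integer> elementCounts) {
        return elementCounts.entrySet().stream()
                .flatMap(e -> {
                    String s = e.getKey();
                    long n = e.getValue();
                    // Emit one (prefix, count) pair per prefix level.
                    return Stream.of(
                            Map.entry(s.substring(0, s.indexOf(':')), n),
                            Map.entry(s.substring(0, s.lastIndexOf(':')), n),
                            Map.entry(s, n));
                })
                // Group by prefix and add up the counts; TreeMap sorts the keys.
                .collect(Collectors.groupingBy(Map.Entry::getKey, TreeMap::new,
                        Collectors.summingLong(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        // Stand-in for the (element, count) pairs of the Multiset in the question.
        Map<String, Integer> elementCounts = new LinkedHashMap<>();
        elementCounts.put("Executive:Healthcare:United States", 5);
        elementCounts.put("Executive:Healthcare:Malaysia", 2);
        elementCounts.put("Executive:Financials:United States", 1);
        elementCounts.put("FinancialHealth:Technology:Malaysia", 3);
        elementCounts.put("FinancialHealth:Technology:United States", 2);
        elementCounts.put("FinancialHealth:Energy:United States", 1);

        countPrefixes(elementCounts)
                .forEach((key, total) -> System.out.println(key + "=" + total));
    }
}
```

With a real Multiset, the same pipeline could start from `multiset.entrySet().stream()` and read `getElement()` and `getCount()` in place of `getKey()` and `getValue()`.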