Question

I have an ArrayList of sentences, as follows:

List<String> allDocuments = new ArrayList<String>();
allDocuments.add("my name is john what is your name");
allDocuments.add("hello how are you");
allDocuments.add("no name entered");
allDocuments.add("who are you");

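As you can see, the words "name" and "you" each appear in two elements. How can I get the number of elements in which each word appears? The final result would be:

name = 2 elements

my = 1 element

you = 2 elements

So far I am stuck with counting how many times each word occurs within a single element, rather than how many elements each word appears in.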

List<String[]> list2 = new ArrayList<>();
for (String s : allDocuments) {
    list2.add(s.split(" "));
}

for (String[] s : list2) {
    Map<String, Integer> wordCounts = new LinkedHashMap<String, Integer>();

    for (String word : s) {
        Integer count = wordCounts.get(word);
        if (count == null) {
            count = 0;
        }
        wordCounts.put(word, count + 1);
    }

    for (String key : wordCounts.keySet()) {
        System.out.println(key + ": " + wordCounts.get(key));
    }
}

Any help is much appreciated, thanks!

Answer 1

Map<String, Integer> wordCounts = new HashMap<String, Integer>();

// collect every distinct word
for (String s : allDocuments)
  for (String s2 : s.split(" "))
    if (!wordCounts.containsKey(s2))
        wordCounts.put(s2, 0);

// count the sentences that contain each word
// (compare whole words; s.indexOf(k) would also match "you" inside "your")
for (String k : wordCounts.keySet())
  for (String s : allDocuments)
    if (Arrays.asList(s.split(" ")).contains(k))
      wordCounts.put(k, wordCounts.get(k) + 1);

Answer 2

I hope this helps. My code uses Java 8 syntax:

 ArrayList<String> allDocuments = new ArrayList<String>();
    allDocuments.add("my name is john");
    allDocuments.add("hello how are you");
    allDocuments.add("no name entered");
    allDocuments.add("who are you");

    HashMap<String, Integer> words = new HashMap<>();

    for (String sentence : allDocuments) {
        String[] sentenceSpli = sentence.split(" ");
        for (String word : sentenceSpli) {
            // If the map already contains the word, add 1; otherwise insert it with a count of 1
            if (words.containsKey(word)) {
                words.put(word, words.get(word) + 1);
            } else {
                words.put(word, 1);
            }
        }
    }

    //Print result
    for (String key : words.keySet()) {
        System.out.println(key + " : " + words.get(key) + " time(s)");
    }
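
If Java 8 is available, the same idea can also be written with streams. The following is only a minimal sketch and not part of the answer above: the class name DocumentFrequency and the method countDocuments are made up for illustration. It assumes words are separated by single spaces, and it applies distinct() to each sentence so that a word repeated inside one sentence is counted only once, which is what the question asks for.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DocumentFrequency {

    // For each word, counts the number of sentences that contain it at least once.
    // (Hypothetical helper; assumes words are separated by single spaces.)
    static Map<String, Long> countDocuments(List<String> docs) {
        return docs.stream()
                // distinct() drops repeats inside one sentence before flattening
                .flatMap(doc -> Arrays.stream(doc.split(" ")).distinct())
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> allDocuments = Arrays.asList(
                "my name is john what is your name",
                "hello how are you",
                "no name entered",
                "who are you");

        Map<String, Long> counts = countDocuments(allDocuments);
        System.out.println("name = " + counts.get("name"));
        System.out.println("my = " + counts.get("my"));
        System.out.println("you = " + counts.get("you"));
    }
}
```

Note that the counts come back as Long because Collectors.counting() produces a Long.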

Answer 3

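If you want to fix your code rather than rewrite it completely, proceed as follows.

First, store the words of each document in a Set instead of an array, to prevent duplicates: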

List<Set<String>> list2 = new ArrayList<>();
for (String s : allDocuments) {
    list2.add(new HashSet<>(Arrays.asList(s.split(" "))));
}

Then simply move the wordCounts declaration and the printing outside the loop, and change the loop to iterate over Set<String> instead of String[]:

Map<String, Integer> wordCounts = new LinkedHashMap<>();
for (Set<String> s : list2) {
    for (String word : s) {
        Integer count = wordCounts.get(word);
        if (count == null) {
            count = 0;
        }
        wordCounts.put(word, count + 1);
    }
}

for (String key : wordCounts.keySet()) {
    System.out.println(key + ": " + wordCounts.get(key));
}

The output is now correct:

what: 1
name: 2
is: 1
john: 1
your: 1
my: 1
how: 1
are: 2
hello: 1
you: 2
no: 1
entered: 1
who: 1

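In fact, you were not far from the solution ;-)

(Note that the iteration over wordCounts could be improved by iterating over entrySet(), but I did not want to modify the code too much.)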
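
The entrySet() iteration mentioned in the note could look roughly like the sketch below. This is an illustration only: the class name EntrySetExample and the method countDocuments are hypothetical, and Map.merge is used as a shorter replacement for the get / null check / put sequence, which assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EntrySetExample {

    // Counts, for each word, how many sentences contain it (hypothetical helper).
    static Map<String, Integer> countDocuments(List<String> allDocuments) {
        Map<String, Integer> wordCounts = new LinkedHashMap<>();
        for (String s : allDocuments) {
            // The HashSet removes words repeated within one sentence
            for (String word : new HashSet<>(Arrays.asList(s.split(" ")))) {
                // Inserts 1 for a new word, otherwise adds 1 to the stored count
                wordCounts.merge(word, 1, Integer::sum);
            }
        }
        return wordCounts;
    }

    public static void main(String[] args) {
        List<String> allDocuments = Arrays.asList(
                "my name is john what is your name",
                "hello how are you",
                "no name entered",
                "who are you");

        // entrySet() yields key and value together, avoiding a second lookup per key
        for (Map.Entry<String, Integer> entry : countDocuments(allDocuments).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```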

Answer 4

Iterate over the list and split each sentence on spaces. Then iterate over each word and check whether it matches what you are looking for.

List<String> allDocuments = new ArrayList<String>();
allDocuments.add("my name is john");
allDocuments.add("hello how are you");
allDocuments.add("no name entered");
allDocuments.add("who are you");

int name = 0, my = 0, you = 0;
for (String msg : allDocuments) {
    for (String word : msg.split(" ")) {
        // Compare string contents with equals(); == only compares references
        if ("name".equals(word)) {
            name++;
        }
        if ("my".equals(word)) {
            my++;
        }
        if ("you".equals(word)) {
            you++;
        }
    }
}
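
The same check can be wrapped in a small helper that takes the word to look for as a parameter. This is a sketch under assumptions, not the answer's own code: the class name TermCounter and the method countDocumentsContaining are invented here, and words are assumed to be separated by single spaces. Each sentence is counted at most once, even if the word occurs in it several times.

```java
import java.util.Arrays;
import java.util.List;

class TermCounter {

    // Returns how many sentences contain the given word as a whole word
    // (hypothetical helper; assumes single-space separators).
    static int countDocumentsContaining(List<String> docs, String term) {
        int count = 0;
        for (String doc : docs) {
            // List.contains uses equals(), so "you" does not match "your"
            if (Arrays.asList(doc.split(" ")).contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> allDocuments = Arrays.asList(
                "my name is john what is your name",
                "hello how are you",
                "no name entered",
                "who are you");

        System.out.println("name = " + countDocumentsContaining(allDocuments, "name"));
        System.out.println("my = " + countDocumentsContaining(allDocuments, "my"));
        System.out.println("you = " + countDocumentsContaining(allDocuments, "you"));
    }
}
```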

Answer 5

Create a map that associates each word with its number of occurrences (coincidences), something like Map<String, Integer>.

Example:

public static void main(String[] args) {
    List<String> list = new ArrayList<>();
    list.add("my name is john");
    list.add("hello how are you");
    list.add("no name entered");
    list.add("who are you");
    System.out.println();
    System.out.println(processList(list));
}

private static Map<String, Integer> processList(List<String> list) {
    Map<String, Integer> coincidences = new HashMap<>();
    for (String string : list) {
        String[] sp = string.split(" ");
        for (String string2 : sp) {
            if (coincidences.get(string2) == null) {
                coincidences.put(string2, 1);
            } else {
                coincidences.put(string2, coincidences.get(string2) + 1);
            }
        }
    }
    return coincidences;
}

This will give a map like:

{how=1, no=1, are=2, name=2, is=1, john=1, hello=1, entered=1, my=1, you=2, who=1}

This is the best representation of the information you need.

Answer 6

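With the split you are doing, the List contains every instance of every word. I would therefore suggest using a Set to store the individual words to be counted: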

Set<String> words = new HashSet<>();
for (String s : allDocuments) {
    words.addAll(Arrays.asList(s.split(" ")));
}

Then use this set and, for each entry, iterate over allDocuments:

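HashMap<String, Integer> wordcount = new HashMap<>();
for (String word : words) {
    int count = 0;
    for (String entry : allDocuments) {
        // compare whole words; entry.contains(word) would also match "you" inside "your"
        if (Arrays.asList(entry.split(" ")).contains(word)) {
            count++;
        }
    }
    wordcount.put(word, count);
}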

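I am not able to test this right now, but something like this should do the trick.

Greetings

How do I count the number of ArrayList elements that contain a specific term/word?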
