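I am trying to remove stop words from tweets. I first add the tokens to a set, then loop over them to check whether each one matches a word in the stop-word set, and remove it if it does. I am getting a Java ConcurrentModificationException. Here is a snippet.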
    while ((line = br.readLine()) != null) {
        // store tweet splits
        LinkedHashSet<String> tweets = new LinkedHashSet<String>();
        // we need to extract the tweet and its constituent words
        String[] tweet = line.split(",");
        String input = tweet[1];
        String[] constituent = input.split(" ");
        // add all tokens to the set
        for (String a : constituent) {
            tweets.add(a.trim());
        }
        System.out.println("Before: " + tweets);
        // remove stop words
        for (String word : tweets) {
            if (stopwords.contains(word)) {
                tweets.remove(word);
            }
        }
        System.out.println("After: " + tweets);
        //System.out.println("Tweet: "+sb.toString());
    }
Answer 0 (score: 1)
    for (String word : tweets) {
        if (stopwords.contains(word)) {
            tweets.remove(word);
        }
    }
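The code above throws a ConcurrentModificationException because the set is modified while it is being iterated. To avoid that, iterate over a copy and remove from the original, as shown below: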
    for (String word : new HashSet<String>(tweets)) {
        if (stopwords.contains(word)) {
            tweets.remove(word);
        }
    }
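If copying the set is undesirable, another option is to remove elements through the iterator itself, since `Iterator.remove()` is the one modification that is allowed during iteration. The following is only a minimal sketch: the class name `IteratorRemovalSketch`, the helper `removeStopwords`, and the sample words are made up for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

public class IteratorRemovalSketch {

    // Hypothetical helper: removes every stop word from tokens in place.
    static void removeStopwords(Set<String> tokens, Set<String> stopwords) {
        Iterator<String> it = tokens.iterator();
        while (it.hasNext()) {
            if (stopwords.contains(it.next())) {
                // Removing through the iterator keeps it consistent with the set,
                // so no ConcurrentModificationException is thrown.
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        // Sample data, assumed for illustration only.
        Set<String> stopwords = new HashSet<>(Arrays.asList("the", "is", "a"));
        Set<String> tweets = new LinkedHashSet<>(Arrays.asList("the", "weather", "is", "nice"));

        removeStopwords(tweets, stopwords);
        System.out.println("After: " + tweets); // prints "After: [weather, nice]"
    }
}
```

On Java 8 or later, `tweets.removeIf(stopwords::contains)` should have the same effect in a single call.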
Answer 1 (score: 0)
I solved it by iterating over a duplicate LinkedHashSet and removing from the original.
    LinkedHashSet<String> tweets_set = new LinkedHashSet<String>(tweets);
    System.out.println("Before: " + tweets);
    // remove stop words
    for (String word : tweets_set) {
        if (stopwords.contains(word)) {
            tweets.remove(word);
        }
    }
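Because the duplicate set exists only to make the removal safe, the explicit loop can probably be dropped in favour of the bulk operation `removeAll`, which does the iteration internally. This is a sketch under assumptions: the class `BulkRemovalSketch`, the helper `tokensWithoutStopwords`, and the sample tweet are hypothetical, and it splits on any whitespace rather than on a single space as the original does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class BulkRemovalSketch {

    // Hypothetical helper: tokenizes the text and drops stop words,
    // keeping the remaining tokens in first-seen order.
    static Set<String> tokensWithoutStopwords(String text, Set<String> stopwords) {
        Set<String> tokens = new LinkedHashSet<>(Arrays.asList(text.trim().split("\\s+")));
        // removeAll modifies the set in one call, with no user-level loop to invalidate.
        tokens.removeAll(stopwords);
        return tokens;
    }

    public static void main(String[] args) {
        // Sample data, assumed for illustration only.
        Set<String> stopwords = new HashSet<>(Arrays.asList("this", "is", "a"));
        Set<String> tweets = tokensWithoutStopwords("this is a nice day", stopwords);
        System.out.println("After: " + tweets); // prints "After: [nice, day]"
    }
}
```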