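I have two files, File1.txt and File2.txt. Both contain text. I want to find the total number of words that are common to both files. I used this code to get the total number of words in each file.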
public int get_Total_Number_Of_Words(File file) {
    // try-with-resources closes the Scanner (and the underlying stream) when done
    try (Scanner sc = new Scanner(new FileInputStream(file))) {
        int count = 0;
        while (sc.hasNext()) {
            sc.next();
            count++;
        }
        return count;
    } catch (Exception e) {
        e.printStackTrace();
    }
    return 0;
}
Please tell me how I can use this code to count the words common to both files.
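Answer 0 (score: 1)
Use a Map implementation. Use each word as the key and an Integer as the value, incrementing the value every time the key is found. Voilà!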
public static void main(String[] args) {
    String[] wordList = new String[]{"test1","test2","test1","test3","test1", "test2", "test4"};
    Map<String, Integer> countMap = new HashMap<String, Integer>();
    for (String word : wordList) {
        if (countMap.get(word)==null) {
            countMap.put(word, 1);
        }
        else {
            countMap.put(word, countMap.get(word)+1);
        }
    }
    System.out.println(countMap);
}
The result is:
{test4=1, test2=2, test3=1, test1=3}
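To apply this counting idea to the original question, one possible approach is to build one count map per file and then intersect the two key sets. The following is only a sketch under that assumption; the class name `CommonWordCounter` and the method names `countWords` and `commonWords` are made up for illustration and are not part of the answer above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeSet;

class CommonWordCounter {

    // Counts how often each whitespace-separated word occurs in the scanner's input.
    static Map<String, Integer> countWords(Scanner sc) {
        Map<String, Integer> counts = new HashMap<>();
        while (sc.hasNext()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count
            counts.merge(sc.next(), 1, Integer::sum);
        }
        return counts;
    }

    // Returns the words that are keys in both maps, sorted alphabetically.
    static Set<String> commonWords(Map<String, Integer> first, Map<String, Integer> second) {
        Set<String> common = new TreeSet<>(first.keySet()); // copy, so the input map is untouched
        common.retainAll(second.keySet());
        return common;
    }
}
```

To use it with the two files, pass `new Scanner(new File("File1.txt"))` and `new Scanner(new File("File2.txt"))` to `countWords`, and take `commonWords(...).size()` as the number of common words. The counts themselves are not needed for that total, but they remain available if the frequency of each common word matters.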
Answer 1 (score: 1)
Here is a solution using Java 8 and a project of mine:
// Field of the enclosing class: splits text on runs of whitespace
private static final Pattern WORDS = Pattern.compile("\\s+");
final LargeTextFactory factory = LargeTextFactory.defaultFactory();
final Path file1 = Paths.get("pathtofirstfile");
final Path file2 = Paths.get("pathtosecondfile");
final List<String> commonWords;
try (
    final LargeText t1 = factory.fromPath(file1);
    final LargeText t2 = factory.fromPath(file2);
) {
    // Collect the words of the first file, then keep each word of the second
    // file that also occurs in the first. Filtering on a single shared "seen"
    // set would also report words that merely repeat within one file.
    final Set<String> seen = WORDS.splitAsStream(t1).collect(Collectors.toSet());
    commonWords = WORDS.splitAsStream(t2)
        .filter(seen::contains)
        .distinct()
        .collect(Collectors.toList());
}
// commonWords contains what you want
This can also be parallelized if you choose a concurrent implementation of Set.
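The parallelization remark can be illustrated with the standard library alone. This is a rough sketch, not the answer author's code: it does not use the LargeText project, and the class name `ParallelCommonWords` and its `commonWords` method are hypothetical. It assumes the input arrives as streams of lines, for example from `Files.lines(path)`.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParallelCommonWords {
    private static final Pattern WORDS = Pattern.compile("\\s+");

    // Splits each line on whitespace and drops the empty tokens that
    // blank lines or leading whitespace produce.
    private static Stream<String> words(Stream<String> lines) {
        return lines.flatMap(WORDS::splitAsStream).filter(w -> !w.isEmpty());
    }

    static Set<String> commonWords(Stream<String> firstLines, Stream<String> secondLines) {
        // A concurrent set can be filled from several threads at once.
        Set<String> seenInFirst = ConcurrentHashMap.newKeySet();
        words(firstLines.parallel()).forEach(seenInFirst::add);
        // From here on the set is only read, so the second pass is also safe in parallel.
        return words(secondLines.parallel())
                .filter(seenInFirst::contains)
                .collect(Collectors.toSet());
    }
}
```

When the streams come from `Files.lines`, open them in a try-with-resources block so the files are closed afterwards. Whether the parallel version is actually faster depends on file size and hardware, so it is worth measuring before adopting it.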
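Answer 2 (score: 0)
I would create two lists, add the words from one text file to the first list and the words from the other text file to the second list, then compare the two lists and count the words they have in common.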
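The two-list idea might look roughly like the following. It is a sketch only: the names `ListComparison`, `readWords`, and `countCommon` are invented for illustration, and it assumes that "common words" means distinct words, so a word repeated in either file is counted once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Scanner;
import java.util.Set;

class ListComparison {

    // Reads every whitespace-separated word from the scanner into a list.
    static List<String> readWords(Scanner sc) {
        List<String> words = new ArrayList<>();
        while (sc.hasNext()) {
            words.add(sc.next());
        }
        return words;
    }

    // Counts the distinct words that occur in both lists.
    static int countCommon(List<String> first, List<String> second) {
        Set<String> common = new HashSet<>(first);   // duplicates collapse here
        common.retainAll(new HashSet<>(second));     // keep only words also in the second list
        return common.size();
    }
}
```

Converting the lists to sets for the comparison step avoids scanning the whole second list once per word of the first, which matters for larger files.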
Answer 3 (score: 0)
You have to do some kind of comparison, so you can do it with nested loops.
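int numCommon = 0;
// Words already handled, so a word repeated in the first file is counted once
Set<String> counted = new HashSet<>();
try (Scanner sc = new Scanner(new FileInputStream(file))) {
    while (sc.hasNext()) {
        String word1 = sc.next();
        if (!counted.add(word1)) {
            continue;
        }
        // Reopen the second file for every word: an exhausted Scanner
        // cannot be rewound, so reusing one would only match the first word
        try (Scanner sc2 = new Scanner(new FileInputStream(file2))) {
            while (sc2.hasNext()) {
                if (sc2.next().equals(word1)) {
                    numCommon++;
                    break; // count each common word once
                }
            }
        }
    }
    return numCommon;
} catch (Exception e) {
    e.printStackTrace();
}
return 0;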