I am trying to use a BufferedReader to count the occurrences of a string in a .txt file. I am using:
File file = new File(path);
int appearances = 0; // declared outside the try block so it is still in scope when printed
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
    String line;
    while ((line = br.readLine()) != null) {
        if (line.contains("Hello")) {
            appearances++; // counts matching lines, not individual occurrences
        }
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
System.out.println("Found " + appearances);
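The problem is that if my .txt file contains, for example, the string "Hello, world\nHello, Hello, world!" and I search for "Hello", the count comes out as two instead of three, because the code only checks whether each line contains the string at all. How can I fix this? Thanks a lot.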
Answer 0 (score: 3)
The simplest solution is
while ((line = br.readLine()) != null)
    appearances += line.split("Hello", -1).length - 1;
Note that if, instead of "Hello", you search for anything containing regex-reserved characters, you should escape the string before splitting:
String escaped = Pattern.quote("Hello."); // avoid '.' special meaning in regex
while ((line = br.readLine()) != null)
    appearances += line.split(escaped, -1).length - 1;
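To make the split-based count easier to try out, here is a minimal self-contained sketch. The class name `SplitCounter` and the method `countInLine` are made up for illustration, and the needle is assumed to be non-empty. It relies on two things: `Pattern.quote` makes the needle match literally, and the `-1` limit keeps trailing empty strings so a match at the very end of the line still counts.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SplitCounter {
    // Counts non-overlapping occurrences of a literal, non-empty needle in one line.
    static int countInLine(String line, String needle) {
        // Quote the needle so regex metacharacters such as '.' are matched literally.
        String escaped = Pattern.quote(needle);
        // With limit -1, leading and trailing empty pieces are kept,
        // so the number of pieces is always (number of matches + 1).
        return line.split(escaped, -1).length - 1;
    }

    public static void main(String[] args) {
        System.out.println(countInLine("Hello, Hello, world!", "Hello")); // prints 2
        System.out.println(countInLine("v1.1 and v1x1", ".1"));           // prints 1
    }
}
```

Without the quoting step, the second call would treat `.` as "any character" and report more hits than the text really contains.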
Answer 1 (score: 2)
Here is an efficient and correct solution:
String line;
int count = 0;
while ((line = br.readLine()) != null) {
    int index = -1;
    // resume the search one character after the previous hit
    while ((index = line.indexOf("Hello", index + 1)) != -1) {
        count++;
    }
}
return count;
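It walks along the line and looks for the next occurrence starting from the previous index + 1, so overlapping occurrences are counted as well.
The problem with Peter's solution is that it is wrong (see my comment). The problem with TheLostMind's solution is that it creates many new strings through replacement, which is an unnecessary performance penalty.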
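For anyone who wants to run the repeated-indexOf idea as-is, here is a rough self-contained sketch. The names `IndexOfCounter`, `countInLine` and `countInText` are invented for this example, and a `StringReader` stands in for the `FileReader` from the question so that no file is needed. The guard against an empty needle is an addition of this sketch: without it, `indexOf("", ...)` keeps returning the line length and the inner loop would never end.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of the original answer.
class IndexOfCounter {
    // Counts occurrences of needle in one line, including overlapping ones.
    static int countInLine(String line, String needle) {
        if (needle.isEmpty()) {
            throw new IllegalArgumentException("needle must not be empty");
        }
        int count = 0;
        int index = -1;
        // Each search resumes one character after the previous hit.
        while ((index = line.indexOf(needle, index + 1)) != -1) {
            count++;
        }
        return count;
    }

    // Reads the text line by line, as the question does with a file.
    static int countInText(String text, String needle) {
        int total = 0;
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                total += countInLine(line, needle);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // The example from the question: two lines, three occurrences.
        System.out.println(countInText("Hello, world\nHello, Hello, world!", "Hello")); // prints 3
    }
}
```

Because the search restarts at index + 1, `countInLine("aaa", "aa")` yields 2. If overlapping hits should not be counted, advancing by `needle.length()` instead of 1 would be the usual adjustment.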
Answer 2 (score: 1)
A regex-driven version:
String line;
Pattern p = Pattern.compile(Pattern.quote("Hello")); // quoted in case you need to search for 'Hello.'
int count = 0;
while ((line = br.readLine()) != null) {
    for (Matcher m = p.matcher(line); m.find(); count++) { }
}
return count;
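As a runnable variant of the snippet above, the following sketch wraps the compiled pattern in a small class. `RegexCounter` and `countInLine` are names chosen for illustration only. The pattern is compiled once in the constructor and reused for every line, which is the main reason to prefer a `Pattern` object over calling `String.split` with a regex string each time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class RegexCounter {
    private final Pattern pattern;

    RegexCounter(String needle) {
        // Quote the needle so it is matched as a literal, then compile it once.
        this.pattern = Pattern.compile(Pattern.quote(needle));
    }

    // Counts non-overlapping matches of the needle in one line.
    int countInLine(String line) {
        int count = 0;
        Matcher m = pattern.matcher(line);
        // Each find() resumes after the end of the previous match.
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(new RegexCounter("Hello").countInLine("Hello, Hello, world!")); // prints 2
        System.out.println(new RegexCounter("aa").countInLine("aaa"));                     // prints 1
    }
}
```

Note the difference from the indexOf loop that restarts at index + 1: `Matcher.find()` continues after the whole previous match, so overlapping occurrences are not counted.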
I'm now curious about the performance difference between this and gexicide's version; I will edit when I have results.
IndexFinder: 1579ms, 2407200hits. // gexicide's code
RegexFinder: 2907ms, 2407200hits. // this code
SplitFinder: 5198ms, 2407200hits. // Peter Lawrey's code, after quoting regexes
Conclusion: for non-regex strings, the repeated-indexOf approach is the fastest.
Basic benchmark code (run against a log file from a vanilla Ubuntu 12.04 install):
public static void main(String ... args) throws Exception {
    Finder[] fs = new Finder[] {
        new SplitFinder(), new IndexFinder(), new RegexFinder()};
    File log = new File("/var/log/dpkg.log.1"); // around 800k in size
    Find test = new Find();
    for (int i=0; i<100; i++) {
        for (Finder f : fs) {
            test.test(f, log, "2014");    // start
            test.test(f, log, "gnome");   // mid
            test.test(f, log, "ubuntu1"); // end
            test.test(f, log, ".1");      // multiple; not at start
        }
    }
    test.printResults();
}
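The benchmark above refers to `Finder`, `Find` and the three finder classes without showing them. The following is only a guess at what such pieces might look like, written as a sketch: the interface shape, the lambda-based finders, the `run` helper and the sample lines are all assumptions of this example, not the answer's real code. It also recompiles the regex on every call, which the original benchmark may well have avoided.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reconstruction; the real Finder/Find classes are not shown in the answer.
class FinderSketch {
    /** Assumed interface: counts occurrences of a non-empty needle in one line. */
    interface Finder {
        int count(String line, String needle);
    }

    static final Finder INDEX = (line, needle) -> {
        int hits = 0;
        int index = -1;
        while ((index = line.indexOf(needle, index + 1)) != -1) hits++;
        return hits;
    };

    static final Finder REGEX = (line, needle) -> {
        int hits = 0;
        Matcher m = Pattern.compile(Pattern.quote(needle)).matcher(line);
        while (m.find()) hits++;
        return hits;
    };

    static final Finder SPLIT = (line, needle) ->
            line.split(Pattern.quote(needle), -1).length - 1;

    /** Runs one finder over all lines and prints a rough wall-clock timing. */
    static int run(String name, Finder f, String[] lines, String needle, int rounds) {
        long start = System.nanoTime();
        int hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (String line : lines) hits += f.count(line, needle);
        }
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println(name + ": " + ms + "ms, " + hits + "hits.");
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the log file.
        String[] lines = { "2014 status installed gnome", "ubuntu1 .1 .1", "no match" };
        run("IndexFinder", INDEX, lines, ".1", 100_000);
        run("RegexFinder", REGEX, lines, ".1", 100_000);
        run("SplitFinder", SPLIT, lines, ".1", 100_000);
    }
}
```

A plain timing loop like this has no warm-up phase and is sensitive to JIT effects, so its numbers are indicative at best; a harness such as JMH would be the more trustworthy choice for a serious comparison.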
Answer 3 (score: 0)
while (line.contains("Hello")) { // loop while the line still contains "Hello"
    appearances++;
    // replaceFirst treats its first argument as a regex; remove the first occurrence of "Hello"
    line = line.replaceFirst("Hello", "");
}