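Suppose I have a text file containing:
"dont run if you cant hide, or you will be broken in two strings, your a evil man"
I want to count how many times the word "you" appears in the text file and put that value into an int variable.
How would I go about doing something like this?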
Answer 0 (score: 12)
Say it with a regular expression...
Console.WriteLine((new Regex(@"(?i)you")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);
Or, if you need the word as a standalone word:
Console.WriteLine((new Regex(@"(?i)\byou\b")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);
Edit: for the sake of correctness, replaced \s+you\s+ with (?i)\byou\b.
Answer 1 (score: 10)
string s = "dont run if you cant hide, or you will be broken in two strings, your a evil man";
var wordCounts = from w in s.Split(' ')
                 group w by w into g
                 select new { Word = g.Key, Count = g.Count() };
int youCount = wordCounts.Single(w => w.Word == "you").Count;
Console.WriteLine(youCount);
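Ideally, punctuation should be ignored. I'll let you handle a messy detail like that.
Answer 2 (score: 5)
Assuming there are regular line breaks, this will be less memory-intensive than the other approaches if the file is large. It uses Jason's counting method: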
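One way to handle the punctuation, as a minimal sketch: split on every run of non-letter characters instead of on spaces only. The names `PunctuationAwareCounter` and `CountWord` are made up for illustration, and treating apostrophes and digits as separators is an assumption that may not suit every text.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PunctuationAwareCounter
{
    // Hypothetical helper: counts whole-word occurrences of `word`,
    // treating any run of non-letter characters as a separator.
    public static int CountWord(string text, string word)
    {
        return Regex.Split(text, @"[^\p{L}]+")
            .Count(w => string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
    }
}
```

For example, `int youCount = PunctuationAwareCounter.CountWord("You, you! your", "you");` sets `youCount` to 2, because "your" is a different word and the comparison ignores case.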
var total = 0;
using (StreamReader sr = new StreamReader("log.log"))
{
    while (!sr.EndOfStream)
    {
        var counts = sr
            .ReadLine()
            .Split(' ')
            .GroupBy(s => s)
            .Select(g => new { Word = g.Key, Count = g.Count() });
        var wc = counts.SingleOrDefault(c => c.Word == "you");
        total += (wc == null) ? 0 : wc.Count;
    }
}
Alternatively, combine Scoregraphic's answer with an IEnumerable method:
static IEnumerable<string> Lines(string filename)
{
    using (var sr = new StreamReader(filename))
    {
        while (!sr.EndOfStream)
        {
            yield return sr.ReadLine();
        }
    }
}
You can then get a nice one-liner:
Lines("log.log")
    .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
    .Sum();
Or, using the framework method File.ReadLines(), you can reduce this to:
File.ReadLines("log.log")
    .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
    .Sum();
Answer 3 (score: 3)
Reading from a file:
int count;
using (StreamReader reader = File.OpenText("fileName"))
{
    string contents = reader.ReadToEnd();
    // Verbatim string so that \b is a regex word boundary, not a backspace character.
    MatchCollection matches = Regex.Matches(contents, @"\byou\b");
    count = matches.Count;
}
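Note that if you use the pattern \byou\b, only the whole word "you" is matched. If you want to match "you" inside other words (for example, the "you" in "your"), use "you" as the pattern instead of \byou\b.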
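To make the difference between the two patterns concrete, here is a small sketch that can run both against the sample sentence from the question. The class name `PatternComparison` and the helper `CountMatches` are made up for illustration; `Regex.Matches` is the only framework call involved. The patterns are passed as verbatim strings (@"...") so that \b reaches the regex engine as a word boundary.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Sample sentence taken from the question.
    public const string Sample =
        "dont run if you cant hide, or you will be broken in two strings, your a evil man";

    // Hypothetical helper: counts the matches of a regex pattern in the text.
    public static int CountMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern).Count;
    }
}
```

`PatternComparison.CountMatches(PatternComparison.Sample, @"\byou\b")` gives 2, while the plain pattern "you" gives 3 because it also matches inside "your". Both patterns are case-sensitive as written; prefix them with (?i) if case should be ignored.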
Answer 4 (score: 2)
Try a regular expression:
Regex r = new Regex("test");
MatchCollection matches = r.Matches("this is a test of using regular expressions to count how many times test is said in a string");
int iCount = matches.Count;
Answer 5 (score: 1)
The following method will do the job.
public Int32 GetWordCountInFile(String fileName, String word, Boolean ignoreCase)
{
    return File
        .ReadAllText(fileName)
        .Split(new [] { ' ', '.', ',' })
        // String.Compare returns 0 when the two strings are equal.
        .Count(w => String.Compare(w, word, ignoreCase) == 0);
}
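You may have to add some other possible separators to the String.Split() call.
Answer 6 (score: 1)
Try using IndexOf to find each occurrence, count it, and then move on to the next one. E.g.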
using System;
using System.IO;

namespace CountOcc
{
    class Program
    {
        public static void Main(string[] args)
        {
            string text;
            // Dispose the reader as soon as the file has been read.
            using (StreamReader sr = new StreamReader("c:\\file.txt"))
            {
                text = sr.ReadToEnd();
            }

            int count = 0;
            int startPos = 0; // Current position in the text.
            do
            {
                startPos = text.IndexOf("Services", startPos);
                if (startPos >= 0)
                {
                    startPos++; // Step past the start of this match before searching again.
                    count++;
                }
            } while (startPos >= 0);

            Console.Write("File contained " + count + " occurrences");
            Console.ReadKey(true);
        }
    }
}
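The same IndexOf loop can be wrapped in a reusable method. This is only a sketch: `OccurrenceCounter` and `CountOccurrences` are hypothetical names, and unlike the loop above it skips past the whole match after each hit, so overlapping occurrences are not counted. Like the original, it counts substrings rather than whole words, and the ordinal (case-sensitive) comparison is an assumption.

```csharp
using System;

public static class OccurrenceCounter
{
    // Hypothetical helper: counts non-overlapping occurrences of `term` in `text`.
    public static int CountOccurrences(string text, string term)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(term))
        {
            return 0;
        }

        int count = 0;
        int pos = text.IndexOf(term, StringComparison.Ordinal);
        while (pos >= 0)
        {
            count++;
            // Resume the search right after the end of the current match.
            pos = text.IndexOf(term, pos + term.Length, StringComparison.Ordinal);
        }
        return count;
    }
}
```

With the sentence from the question, `OccurrenceCounter.CountOccurrences(sentence, "you")` gives 3, since the "you" in "your" is counted as well.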