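Scanning a text file for multiple words

Time: 2014-01-29 18:57:27

Tags: c# text-search

I have a list of words. I want my program to scan a text file for several of those words.

This is what I have so far: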

        int counter = 0;
        string line;
        StringBuilder sb = new StringBuilder();

        string[] words = { "var", "bob", "for", "example"};

        try
        {
            using (StreamReader file = new StreamReader("test.txt"))
            {
                while ((line = file.ReadLine()) != null)
                {
                    if (line.Contains(Convert.ToChar(words)))
                    {
                        sb.AppendLine(line.ToString());
                    }
                }
            }

            listResults.Text += sb.ToString();
        }
        catch (Exception ex)
        {
            listResults.ForeColor = Color.Red;
            listResults.Text = "---ERROR---";
        }

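So I want to scan the file for one word and, if it is not found, scan for the next word, and so on.

3 answers:

Answer 0: (score: 2)

String.Contains() takes a single argument: a string. Your call to Contains(Convert.ToChar(words)) probably does not do what you expect: Convert.ToChar cannot convert a string array to a char, so the call fails at run time and execution ends up in your catch block.

As described in "Using C# to check if string contains a string in string array", you probably want to do something like this: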

using (StreamReader file = new StreamReader("test.txt"))
{
    while ((line = file.ReadLine()) != null)
    {
        foreach (string word in words)
        {
            if (line.Contains(word))
            {
                sb.AppendLine(line);
                break; // append each matching line once, even if several words match
            }
        }
    }
}

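Or, if you want to follow the exact problem statement ("scan the file for one word and, if it is not found, scan for the next word"), you may want to look at "Return StreamReader to Beginning":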

using (StreamReader file = new StreamReader("test.txt"))
{
    foreach (string word in words)
    {
        while ((line = file.ReadLine()) != null)
        {
            if (line.Contains(word))
            {
                sb.AppendLine(line);
            }
        }

        if (sb.Length == 0)
        {
            // Word not found: rewind the file to prepare for the next word.
            // StreamReader has no Position property, so seek the underlying stream.
            file.BaseStream.Position = 0;
            file.DiscardBufferedData();
        }
        else
        {
            return sb.ToString();
        }
    }
}

However, this treats "bob" as a match inside "bobcat". If that is not what you want, see "String compare C# - whole word match", and replace:

line.Contains(word)

with:

string wordWithBoundaries = "\\b" + Regex.Escape(word) + "\\b";
Regex.IsMatch(line, wordWithBoundaries);
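
For the "try one word, then fall back to the next" behaviour, rewinding the stream can be avoided if the file is small enough to hold in memory. The following is only a minimal sketch under that assumption; the class name FirstWordScan, the helper LinesForFirstFoundWord, and the sample lines are hypothetical and stand in for File.ReadAllLines("test.txt").

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstWordScan
{
    // Hypothetical helper: returns the lines containing the first word in
    // `words` that occurs anywhere in `lines`; empty if no word occurs.
    public static List<string> LinesForFirstFoundWord(
        IReadOnlyList<string> lines, IEnumerable<string> words)
    {
        foreach (string word in words)
        {
            List<string> hits = lines.Where(l => l.Contains(word)).ToList();
            if (hits.Count > 0)
            {
                return hits; // stop at the first word that is present
            }
        }
        return new List<string>();
    }

    static void Main()
    {
        // Sample data standing in for File.ReadAllLines("test.txt").
        string[] lines = { "nothing here", "a bobcat ran", "for example" };
        string[] words = { "var", "bob", "for", "example" };

        foreach (string hit in LinesForFirstFoundWord(lines, words))
        {
            Console.WriteLine(hit); // prints "a bobcat ran"
        }
    }
}
```

"var" does not occur in the sample, so the scan falls through to "bob" and stops there; the later words are never tried. Note that this still uses substring matching, so "bob" matches "bobcat".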

Answer 1: (score: 0)

StringBuilder sb = new StringBuilder();
string[] words = { "var", "bob", "for", "example" };
string[] file_lines = File.ReadAllLines("filepath");
for (int i = 0; i < file_lines.Length; i++)
{
    string[] split_words = file_lines[i].Split(' ');
    foreach (string str in split_words)
    {
        foreach (string word in words)
        {
            if (str == word)
            {
                sb.AppendLine(file_lines[i]);
            }
        }
    }
}

Answer 2: (score: 0)

This works a treat:

var query =
    from line in System.IO.File.ReadLines("test.txt")
    where words.Any(word => line.Contains(word))
    select line;

To output these as a single string, do this:

var results = String.Join(Environment.NewLine, query);

It couldn't be much simpler.

If you only want to match whole words, it gets a bit more complicated. You can do this:

Regex[] regexs =
    words
        .Select(word => new Regex(String.Format(@"\b{0}\b", Regex.Escape(word))))
        .ToArray();

var query =
    from line in System.IO.File.ReadLines(fileName)
    where regexs.Any(regex => regex.IsMatch(line))
    select line;
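
The difference between substring matching and whole-word matching can be checked with a small sketch like the one below. The class name WholeWordDemo, the helper ContainsWholeWord, and the sample strings are hypothetical and only illustrate the \b word-boundary pattern used above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: true if `word` occurs in `line` as a whole word.
    public static bool ContainsWholeWord(string line, string word)
    {
        // Regex.Escape keeps any regex metacharacters in `word` literal.
        return Regex.IsMatch(line, @"\b" + Regex.Escape(word) + @"\b");
    }

    static void Main()
    {
        Console.WriteLine("a bobcat ran".Contains("bob"));            // True
        Console.WriteLine(ContainsWholeWord("a bobcat ran", "bob"));  // False
        Console.WriteLine(ContainsWholeWord("hi bob, hello", "bob")); // True
    }
}
```

One caveat: \b marks a boundary between a word character and a non-word character, so this pattern suits words that begin and end with letters, digits, or underscores; a search term such as "c#" would likely need a different boundary test.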