How can I compare a List with a string and find the matching words in the string using C#?

Asked: 2017-08-23 08:50:44

Tags: c# string list

The list and the string are:

string Text;
List<string> Names = new List<string>();

Now load the data from the database into the list:

string connectionString = "Data Source=SANGEEN-PC;Initial Catalog=IS_Project;Integrated Security=True;Connection Timeout=0";

using (SqlConnection cnn = new SqlConnection(connectionString))
{
     try
     {
         SqlDataAdapter da = new SqlDataAdapter("select NamesValues from Names", cnn);
         DataSet ds = new DataSet();
         da.Fill(ds, "Names");

         foreach (DataRow row in ds.Tables["Names"].Rows)
         {
             Names.Add(row["NamesValues"].ToString());
         }
     }
     catch (Exception ex)
     {
         MessageBox.Show("Can not open connection ! ");
     }
 }

Now load the data into the string:

Text = System.IO.File.ReadAllText(@"D:\Data-Sanitization-Project\Files\Test.txt");

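Now I want to compare Names and Text in order to:

  1. Find all list items that also occur in the string (the matched words) and store them in a list or array.
  2. Replace every matched word found with "Name".
  3. Count the matched words.
  4. Example: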

    Names:                  Text:

    Sangeen Khan            I am Sangeen Khan and i am friend
    Jhon                    Jhon. Jhon is friend of Wasim.
    Wasim
    Alexander
    Afridi


    Desired result:

    Matched List/Array:     Matches:     Updated Text:

    Sangeen Khan            4            I am "Name" and i am friend
    Jhon                                 "Name". "Name" is friend of "Name".
    Wasim


    For the three points above I wrote the following code, but it does not work:

    var TextRead = File.ReadAllLines(text);
    HashSet<string> hashSet = new HashSet<string>(TextRead);
    
    foreach (string i in Names)
    {
       if (hashSet.Contains(i))
       {
           MessageBox.Show("found");
       }
    }
    

    I have tried my best to explain my problem. If you understand what I need, please feel free to edit my question. Thanks in advance.

4 answers:

Answer 0 (score: 2)

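  1. Find all list items that also occur in the string (the matched words) and store them in a list or array.
  2. Replace every matched word found with "Name".
  3. Count the matched words.
    1. List<string> matchedWords = Names.Where(Text.Contains).ToList();
    2. matchedWords.ForEach(w => Text = Text.Replace(w, "\"Name\""));
    3. int numMatchedWords = matchedWords.Count;
    4. It seems that numMatchedWords should count every match in the text, including repeated ones. In that case you can use the following approach (before the Replace):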

      This extension method counts how many times each word occurs in the text:

      public static Dictionary<string, int> OccurencesInText(this IEnumerable<string> words, string text, StringComparison comparison = StringComparison.OrdinalIgnoreCase)
      {
          if (text == null) throw new ArgumentNullException(nameof(text));
      
          Dictionary<string, int> resultDict = new Dictionary<string, int>();
          foreach (string word in words.Distinct())
          {
              int wordOccurrences = 0;
              // "<=" so that a match ending exactly at the end of the text is also counted
              for (int i = 0; i <= text.Length - word.Length; i++)
              {
                  string substring = text.Substring(i, word.Length);
                  if (substring.Equals(word, comparison)) wordOccurrences++;
              }
              resultDict.Add(word, wordOccurrences);
          }
          return resultDict;
      }
      

      Usage:

      int numMatchedWords = matchedWords.OccurencesInText(Text).Sum(kv => kv.Value);
      
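
As a possible alternative to the substring loop above, the same count can be sketched with `String.IndexOf`, which avoids allocating a substring at every position. `OccurrenceCounter` and `CountOccurrences` are hypothetical names introduced here for illustration, not part of the answer. Note one assumed difference: this version counts non-overlapping occurrences, which gives the same result for ordinary names.

```csharp
using System;

static class OccurrenceCounter
{
    // Hypothetical helper (an assumption, not from the answer above):
    // counts non-overlapping occurrences of word in text.
    public static int CountOccurrences(string text, string word,
        StringComparison comparison = StringComparison.OrdinalIgnoreCase)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (string.IsNullOrEmpty(word)) return 0; // nothing sensible to count

        int count = 0;
        int index = 0;
        // IndexOf returns -1 once no further occurrence exists.
        while ((index = text.IndexOf(word, index, comparison)) >= 0)
        {
            count++;
            index += word.Length; // resume the search after this match
        }
        return count;
    }
}

class Program
{
    static void Main()
    {
        string text = "I am Sangeen Khan and i am friend Jhon. Jhon is friend of Wasim.";
        Console.WriteLine(OccurrenceCounter.CountOccurrences(text, "Jhon"));  // prints 2
        Console.WriteLine(OccurrenceCounter.CountOccurrences(text, "Wasim")); // prints 1
    }
}
```

Summing `CountOccurrences` over the matched names would give the total that the question asks for.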

Answer 1 (score: 0)

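You can loop over your names and search for each name with a Regex. If Match.Count > 0, you can replace the text and add to your global match count.

If Match.Count == 0, you can add the name to a second list, and then remove those names from the original list after the foreach loop.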

    public static List<string> Names = new List<string>();
    public static string Text = "I am Sangeen Khan and i am friend Jhon. Jhon is friend of Wasim.";
    static void Main(string[] args)
    {
        Names.Add("Sangeen Khan");
        Names.Add("Jhon");
        Names.Add("Wasim");
        Names.Add("Alexander");
        Names.Add("Afridi");

        var matchCount = 0;
        var nameToRemove = new List<string>();
        foreach (var name in Names)
        {
            // Escape the name so that regex metacharacters in it are matched literally
            var regex = new Regex(Regex.Escape(name));
            var match = regex.Matches(Text);

            //Count of matches
            matchCount += match.Count;

            if (match.Count > 0)
            {
                Text = regex.Replace(Text, "\"Name\"");
            }
            else
            {
                nameToRemove.Add(name);
            }
        }
        nameToRemove.ForEach(name=> Names.Remove(name));
        Console.WriteLine($"Names: {string.Join(" ", Names)}");
        Console.WriteLine($"Count: {matchCount}");
        Console.WriteLine($"ReplaceText: {Text}");
        Console.ReadLine();
    }

Output:

Names: Sangeen Khan Jhon Wasim

Count: 4

ReplaceText: I am "Name" and i am friend "Name". "Name" is friend of "Name".
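
The loop above scans the text once per name. A single-pass variant can be sketched by joining all names into one alternation pattern and letting a `MatchEvaluator` count and replace at the same time. `NameReplacer` and `ReplaceNames` are hypothetical names introduced for this illustration. The `\b` word boundaries are an added assumption (they stop "Jhon" from matching inside a longer word) and only behave as intended if every name starts and ends with a letter, digit or underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class NameReplacer
{
    // Hypothetical helper (an assumption, not from the answer above).
    public static string ReplaceNames(string text, IEnumerable<string> names,
        out List<string> matchedNames, out int matchCount)
    {
        // Escape each name so regex metacharacters are matched literally.
        // Longer names come first so they win over shorter names with the same prefix.
        var alternatives = names
            .Where(n => !string.IsNullOrEmpty(n))
            .OrderByDescending(n => n.Length)
            .Select(n => Regex.Escape(n))
            .ToList();

        var found = new List<string>();
        int count = 0;

        if (alternatives.Count > 0)
        {
            // \b assumes names begin and end with word characters.
            string pattern = @"\b(?:" + string.Join("|", alternatives) + @")\b";

            text = Regex.Replace(text, pattern, m =>
            {
                count++;                                            // every occurrence
                if (!found.Contains(m.Value)) found.Add(m.Value);   // distinct names
                return "\"Name\"";
            });
        }

        matchedNames = found;
        matchCount = count;
        return text;
    }
}

class Program
{
    static void Main()
    {
        var names = new List<string> { "Sangeen Khan", "Jhon", "Wasim", "Alexander", "Afridi" };
        string text = "I am Sangeen Khan and i am friend Jhon. Jhon is friend of Wasim.";

        string updated = NameReplacer.ReplaceNames(text, names,
            out List<string> matched, out int total);

        Console.WriteLine(string.Join(", ", matched)); // Sangeen Khan, Jhon, Wasim
        Console.WriteLine(total);                      // 4
        Console.WriteLine(updated);
    }
}
```

For the sample data this yields the same three matched names, the same count of 4 and the same replaced text as the loop above. The match here is case-sensitive; passing `RegexOptions.IgnoreCase` to `Regex.Replace` would be one way to relax that.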

Answer 2 (score: 0)

static void Main()
    {
        var count = 0;

        string text = "I am Sangeen Khan and i am friend Jhon. Jhon is friend of Wasim.";
        List<string> Names = new List<string>() {"Sangeen Khan ", "Jhon","Wasim","Alexander","Afridi" };
        List<string> matchedList = new List<string>();

        foreach (var name in Names)
        {
            if(text.Contains(name))
            {
                text = text.Replace(name, "\"Name\" ");
                matchedList.Add(name);
                count++;
            }
        }

        foreach (var name in matchedList)
        {                
             Console.WriteLine(name);
        }

        Console.WriteLine(count);
        Console.WriteLine(text);

        Console.ReadLine();
    }

Answer 3 (score: 0)

If you want to replace the names in the list with 'Name', I would do that first and then count the occurrences of 'Name' in your text. Something like:

string[] names = new string[] { "Sangeen Khan", "Jhon", "Wasim", "Alexander", "Afridi" };
string text = "I am Sangeen Khan and i am friend Jhon. Jhon is friend of Wasim.";

foreach(string name in names)
{
    text = text.Replace(name, "'Name'");
}

// Escape the pattern, not the input text
int matches = Regex.Matches(text, Regex.Escape("'Name'")).Count;