Find the longest common matching substring from a given List<string> in a string

Time: 2018-06-21 19:10:15

Tags: c# arrays string substring longest-substring

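I have a list of strings, and I want to find the start and end indexes of their occurrences in a given string.

Of all the substrings that are present in the original string, I want to find the longest one and print only that.

Here is my code: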

public static void Main(string[] args)
{
    // This is my original string, in which I want to find the occurrence of the longest common substring
    string str = "will you consider the lic premium of my in-laws for tax exemption";

    //Here are the substrings which I want to compare   
    List<string> subStringsToCompare = new List<string>
    {
        "Life Insurance Premium",
        "lic",
        "life insurance",
        "life insurance policy",
        "lic premium",
        "insurance premium",
        "insurance premium",
        "premium"
    };

    foreach(var item in subStringsToCompare)
    {
        int start = str.IndexOf(item);

        if(start != -1)
        {
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position", item, start, start + item.Length);
        }
    }
}

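The problem is that I get 3 matches instead of 1. I can't figure out the condition that picks only the longest matching substring out of all the substrings being compared.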

  

Output I am getting:

  • Match found: 'lic' at 22 till 25 character position
  • Match found: 'lic premium' at 22 till 33 character position
  • Match found: 'premium' at 26 till 33 character position

  

Expected output:

  • Match found: 'lic premium' at 22 till 33 character position


3 answers:

Answer 0 (score: 2)

If you only need exact matches of the strings in the list (and not substrings of those strings), then you are very close:

string longest = null;
int longestStart = 0;
foreach(var item in subStringsToCompare)
{
    int start = str.IndexOf(item);

    if(start != -1 && (longest == null || item.Length > longest.Length))
    {
        longest = item;
        longestStart = start;
    }
}

if (longest != null)
{
    Console.WriteLine("Match found: '{0}' at {1} till {2} character position", longest, longestStart, longestStart + longest.Length);
}
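
The loop above can also be wrapped in a reusable method. What follows is only a minimal sketch: the class name `LongestMatchFinder`, the method `FindLongest`, and the optional `comparison` parameter are illustrative names, not part of the original answer. It defaults to `StringComparison.Ordinal`, whereas `IndexOf(string)` without a comparison argument is culture-sensitive, so results could differ for some inputs.

```csharp
using System;
using System.Collections.Generic;

public static class LongestMatchFinder
{
    // Hypothetical helper: returns the longest candidate that occurs in text,
    // together with its start index, or (null, -1) when nothing matches.
    public static (string Match, int Start) FindLongest(
        string text,
        IEnumerable<string> candidates,
        StringComparison comparison = StringComparison.Ordinal)
    {
        string longest = null;
        int longestStart = -1;

        foreach (string item in candidates)
        {
            // A candidate that is not longer than the current best cannot win,
            // so skip the search for it.
            if (longest != null && item.Length <= longest.Length)
                continue;

            int start = text.IndexOf(item, comparison);
            if (start != -1)
            {
                longest = item;
                longestStart = start;
            }
        }

        return (longest, longestStart);
    }

    public static void Main()
    {
        string str = "will you consider the lic premium of my in-laws for tax exemption";
        var subStringsToCompare = new List<string>
        {
            "Life Insurance Premium", "lic", "life insurance", "life insurance policy",
            "lic premium", "insurance premium", "premium"
        };

        var (match, start) = FindLongest(str, subStringsToCompare);
        if (match != null)
        {
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position",
                match, start, start + match.Length);
        }
    }
}
```

Passing `StringComparison.OrdinalIgnoreCase` as the third argument would make the search case-insensitive, should that be wanted.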

Answer 1 (score: 2)

This is what I suggested in the comments:

public static void Main(string[] args)
{
    // This is my original string, in which I want to find the occurrence of the longest common substring
    string str = "will you consider the lic premium of my in-laws for tax exemption";

    // Here are the substrings which I want to compare
    // (Sorted by length descending) 
    List<string> subStringsToCompare = new List<string>
    {
        "Life Insurance Premium",
        "life insurance policy",
        "insurance premium",
        "life insurance",
        "lic premium",
        "premium",
        "lic"
    };

    foreach(var item in subStringsToCompare)
    {
        int start = str.IndexOf(item);

        if(start != -1)
        {
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position", item, start, start + item.Length);

            break; // Stop at the first match found
        }
    }
}
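
The approach above depends on the list being ordered by length, longest first. As a sketch only, the ordering can be done in code with LINQ instead of by hand. The class `SortedFirstMatch` and the method `FirstLongestMatch` are hypothetical names introduced for illustration, and the ordinal comparison is an assumption (the original code uses the culture-sensitive `IndexOf(string)`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedFirstMatch
{
    // Hypothetical helper: returns the longest candidate found in text,
    // or null when none of the candidates occurs in it.
    public static string FirstLongestMatch(string text, IEnumerable<string> candidates)
    {
        return candidates
            .OrderByDescending(s => s.Length)   // longest candidates first
            .FirstOrDefault(s => text.IndexOf(s, StringComparison.Ordinal) != -1);
    }

    public static void Main()
    {
        string str = "will you consider the lic premium of my in-laws for tax exemption";
        var subStringsToCompare = new List<string>
        {
            "Life Insurance Premium", "lic", "life insurance", "life insurance policy",
            "lic premium", "insurance premium", "premium"
        };

        string match = FirstLongestMatch(str, subStringsToCompare);
        if (match != null)
        {
            int start = str.IndexOf(match, StringComparison.Ordinal);
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position",
                match, start, start + match.Length);
        }
    }
}
```

Because `OrderByDescending` is a stable sort, candidates of equal length keep their original relative order, so the first of them in the list wins a tie.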

Answer 2 (score: 0)

I haven't tried it, but try the following:

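// Collect every candidate that occurs at some position in str
List<string> matches = new List<string>();
for ( int i = 0; i < str.Length; i++ )
{
    foreach ( string toCompare in subStringsToCompare )
    {
        // Guard the length so Substring cannot run past the end of str
        if ( i + toCompare.Length <= str.Length
             && str.Substring( i, toCompare.Length ) == toCompare )
            matches.Add( toCompare );
    }
}

// Keep the longest of the collected matches
string longest = "";
foreach ( string match in matches )
{
    if ( match.Length > longest.Length )
        longest = match;
}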