I am trying to work out the correct way to search for a substring in other strings so that I get the output described below. For the input:
hello how are you
the result should be true for:
hey hello how are you ok
how are you
are you
and false for:
you
how you are ok you
howareyou
howok
how you
hey hello
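I want an identical phrase, or a part of the phrase, contained in the string to count as true, but not single words or the same words in a different order. What I tried returns true for all of them with (aList.Any(input.Contains)) and false for all of them with (aList.Contains(input)):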
List<string> aList = new List<string>() {
    "hey hello how are you ok",
    "how are you",
    "are you",
    "you",
    "how you are ok you",
    "howareyou",
    "howok",
    "how you",
    "hey hello" };
string input = "hello how are you";
foreach (string a in aList)
{
    if (a.Any(input.Contains))
    {
        Console.WriteLine(a + " - true");
    }
    else
    {
        Console.WriteLine(a + " - false");
    }
}
Console.WriteLine("__\n\r");
foreach (string a in aList)
{
    if (a.Contains(input))
    {
        Console.WriteLine(a + " - true");
    }
    else
    {
        Console.WriteLine(a + " - false");
    }
}
Answer 0: (score: 2)
I came up with this solution:
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Program
{
    public static void Main()
    {
        List<string> aList = new List<string>()
        {"hey hello how are you ok", "how are you", "are you", "you", "how you are ok you", "howareyou", "howok", "how you", "hey hello"};
        var input = "hello how are you";
        // build matcher: one group per trailing run of at least two words
        string[] chunks = input.Split(' ');
        string matcher = "";
        for (int i = chunks.Length, j = 0; i > 1; i--, j++)
        {
            var matcherPart = new string[i];
            Array.Copy(chunks, j, matcherPart, 0, i);
            // join the words with \s+ so any run of whitespace separates them
            matcher += "(" + String.Join(@"\s+", matcherPart) + ")";
        }
        matcher = matcher.Replace(")(", ")|(");
        // Console.WriteLine(matcher);
        // (hello\s+how\s+are\s+you)|(how\s+are\s+you)|(are\s+you)
        // build the regex once, outside the loop
        Regex r = new Regex(matcher, RegexOptions.IgnoreCase);
        foreach (string a in aList)
        {
            Match m = r.Match(a);
            Console.WriteLine(a + " - " + m.Success);
        }
        /*
        hey hello how are you ok - True
        how are you - True
        are you - True
        you - False
        how you are ok you - False
        howareyou - False
        howok - False
        how you - False
        hey hello - False
        */
    }
}
The matcher-building part creates a regexp from the possible combinations, i.e. the string a b c d is converted into (a\s+b\s+c\s+d)|(b\s+c\s+d)|(c\s+d). With this you can easily iterate over the list values and apply the regular expression. m.Success tells you whether a list item matches your pattern.
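The pattern-building step can also be wrapped in a small helper. The following is only a sketch under a few assumptions: the names PhraseMatcher and BuildSuffixRegex are made up for illustration, the words are escaped with Regex.Escape, and \b word boundaries are added so that a phrase does not match inside longer words. Neither the escaping nor the boundaries are part of the answer above, and the boundaries assume the words start and end with letters or digits.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PhraseMatcher
{
    // Hypothetical helper (not from the answer): builds one alternation of all
    // trailing sub-phrases of the input that contain at least two words.
    public static Regex BuildSuffixRegex(string input)
    {
        string[] words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // With fewer than two words there is no phrase to match; return a
        // pattern that can never match instead of an empty alternation.
        if (words.Length < 2)
        {
            return new Regex("(?!)");
        }

        // For every start index, join the remaining words with \s+.
        var alternatives = Enumerable.Range(0, words.Length - 1)
            .Select(start => string.Join(@"\s+",
                words.Skip(start).Select(w => Regex.Escape(w))));

        // Assumption: \b boundaries keep "are you" from matching "hare yours".
        string pattern = @"\b(?:" + string.Join("|", alternatives) + @")\b";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }
}
```

With this sketch, PhraseMatcher.BuildSuffixRegex("hello how are you").IsMatch("are you") gives true, while the same regex gives false for "you" and for "how you are ok you".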
Answer 1: (score: 1)
This should work for you for now, since you have not specified any other cases that should return true.
string[] array = input.Split(' ');
foreach (string a in aList)
{
    bool yes = false;
    // test every pair of neighbouring words from the input
    for (int i = 0; i < array.Length - 1; ++i)
    {
        string test = array[i] + " " + array[i + 1];
        if (a.Contains(test))
        {
            yes = true;
            break; // one matching pair is enough
        }
    }
    Console.WriteLine(yes);
}
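The pair check above can be sketched as a reusable method. The names PairMatcher and ContainsAdjacentPair are hypothetical and only serve the illustration. Note two properties of this approach that differ from a regex over trailing phrases: it also accepts inner parts of the input such as "hello how", and because it uses plain substring matching, a candidate like "share yours" would count as containing "are you".

```csharp
using System;
using System.Linq;

public static class PairMatcher
{
    // Hypothetical helper: true when any two neighbouring words of the input
    // occur in the candidate as "word1 word2" (plain substring match).
    public static bool ContainsAdjacentPair(string candidate, string input)
    {
        string[] words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Math.Max guards against inputs with fewer than two words.
        return Enumerable.Range(0, Math.Max(0, words.Length - 1))
            .Any(i => candidate.Contains(words[i] + " " + words[i + 1]));
    }
}
```

For the input "hello how are you", this returns true for "how are you" and false for "how you" and "hey hello".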
Answer 2: (score: 1)
I would split this problem into two steps.
1) Split the input by ' ' (space) and collect all the subphrases that could match:
// requires: using System.Linq;
string input = "hello how are you";
string[] inputParts = input.Split(' ');
List<string> subphrases = new List<string>();
// each trailing run of at least two words becomes one subphrase
for (int i = 0; i < inputParts.Length - 1; i++)
{
    subphrases.Add(string.Join(" ", inputParts.Skip(i)));
}
2) From your corpus (aList), take only those items Where Any element of the subphrases collection is Contained in the item:
List<string> all_TRUE_Matches = aList.Where(
    corpusItem => subphrases.Any(sub => corpusItem.Contains(sub))).ToList();
The order in the code differs from my sentence. For a corpusItem from aList, Any returns true if the item Contains at least 1 element of the subphrases collection. (I hope this is clearer ;))
To get the list of FALSE matches, simply negate the Any condition:
List<string> all_FALSE_Matches = aList.Where(
    corpusItem => !subphrases.Any(sub => corpusItem.Contains(sub))).ToList();
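The two steps can be combined so that the corpus is walked only once. This is a minimal sketch, not the answer's own code: the class SubphraseFilter and its methods BuildSubphrases and Partition are hypothetical names, and ToLookup is swapped in for the two separate Where calls to produce both result lists in one pass.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubphraseFilter
{
    // Step 1: trailing subphrases of at least two words
    // (hypothetical helper mirroring the loop in the answer).
    public static List<string> BuildSubphrases(string input)
    {
        string[] parts = input.Split(' ');
        return Enumerable.Range(0, Math.Max(0, parts.Length - 1))
            .Select(i => string.Join(" ", parts.Skip(i)))
            .ToList();
    }

    // Step 2: group the corpus by whether an item contains any subphrase.
    // lookup[true] holds the matches, lookup[false] the rest.
    public static ILookup<bool, string> Partition(IEnumerable<string> corpus, string input)
    {
        List<string> subphrases = BuildSubphrases(input);
        return corpus.ToLookup(item => subphrases.Any(sub => item.Contains(sub)));
    }
}
```

For the list from the question and the input "hello how are you", lookup[true] contains "hey hello how are you ok", "how are you" and "are you", and lookup[false] contains the remaining six items. A key with no items simply yields an empty sequence.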