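I need to find a sequence inside other, larger sequences. For example, {1,3,2,3} is present in {1,3,2,3,4,3} and in {5,1,3,2,3}. Is there a quick way to do this with IEnumerable or something else?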
Answer 0 (score: 5)
This method finds a subsequence of any type that can be compared via Equals() within the parent sequence:
    public static bool ContainsSubsequence<T>(this IEnumerable<T> parent, IEnumerable<T> target)
    {
        bool foundOneMatch = false;
        using (IEnumerator<T> parentEnum = parent.GetEnumerator())
        {
            using (IEnumerator<T> targetEnum = target.GetEnumerator())
            {
                // Get the first target instance; empty sequences are trivially contained
                if (!targetEnum.MoveNext())
                    return true;
                while (parentEnum.MoveNext())
                {
                    if (targetEnum.Current.Equals(parentEnum.Current))
                    {
                        // Match, so move the target enum forward
                        foundOneMatch = true;
                        if (!targetEnum.MoveNext())
                        {
                            // We went through the entire target, so we have a match
                            return true;
                        }
                    }
                    else if (foundOneMatch)
                    {
                        // A partial match failed; this version gives up here
                        return false;
                    }
                }
                return false;
            }
        }
    }
You can use it like this:
    bool match = new[] {1, 2, 3}.ContainsSubsequence(new[] {1, 2}); // match == true
    match = new[] {1, 2, 3}.ContainsSubsequence(new[] {1, 3}); // match == false
Note that it assumes the target sequence has no null elements.
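Update: Thanks for the upvotes, everyone, but there is actually a bug in the code above! If a partial match is found but then does not turn into a full match, the process ends instead of resetting, which is clearly wrong for something like {1, 2, 1, 2, 3}.ContainsSubsequence({1, 2, 3}).
The single-pass idea above suits the more common definition of subsequence (i.e. one where the elements need not be contiguous), provided the `else if (foundOneMatch)` branch is dropped. To handle resetting for contiguous matches, however (which most IEnumerators do not support), the target sequence has to be enumerated up front. That leads to the following code: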
    public static bool ContainsSubsequence<T>(this IEnumerable<T> parent, IEnumerable<T> target)
    {
        bool foundOneMatch = false;
        var enumeratedTarget = target.ToList();
        // Empty sequences are trivially contained (and indexing below would throw)
        if (enumeratedTarget.Count == 0)
            return true;
        int enumPos = 0;
        using (IEnumerator<T> parentEnum = parent.GetEnumerator())
        {
            while (parentEnum.MoveNext())
            {
                if (enumeratedTarget[enumPos].Equals(parentEnum.Current))
                {
                    // Match, so move the target position forward
                    foundOneMatch = true;
                    if (enumPos == enumeratedTarget.Count - 1)
                    {
                        // We went through the entire target, so we have a match
                        return true;
                    }
                    enumPos++;
                }
                else if (foundOneMatch)
                {
                    // Partial match failed: reset, then retry the current
                    // parent element against the start of the target
                    foundOneMatch = false;
                    enumPos = 0;
                    if (enumeratedTarget[enumPos].Equals(parentEnum.Current))
                    {
                        foundOneMatch = true;
                        enumPos++;
                    }
                }
            }
            return false;
        }
    }
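This code fixes the reset bug, but after a failed partial match it restarts from the beginning of the target, so it can still miss a match whose start overlaps the failed attempt, e.g. {1, 1, 1, 2}.ContainsSubsequence({1, 1, 2}). It also copies the whole target into a list, so it won't work well for large (or infinite) target sequences.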
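The "more common definition" of subsequence mentioned in this answer (elements in order, gaps allowed) can be checked in a single pass without buffering either sequence. The following is only a minimal sketch under that definition; the name `IsLooseSubsequence` and the optional `comparer` parameter are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

public static class LooseSubsequenceSketch
{
    // Hypothetical helper: true when every element of target appears in
    // parent in the same order, with gaps allowed between matches.
    public static bool IsLooseSubsequence<T>(
        this IEnumerable<T> parent,
        IEnumerable<T> target,
        IEqualityComparer<T> comparer = null)
    {
        // The default comparer also copes with null elements.
        comparer = comparer ?? EqualityComparer<T>.Default;
        using (IEnumerator<T> targetEnum = target.GetEnumerator())
        {
            // An empty target is trivially contained.
            if (!targetEnum.MoveNext())
                return true;

            foreach (T item in parent)
            {
                // Advance the target only on a match; mismatches are skipped.
                if (comparer.Equals(targetEnum.Current, item) && !targetEnum.MoveNext())
                    return true; // target exhausted, so everything was found
            }
            return false;
        }
    }
}
```

With this definition, `new[] {1, 2, 3}.IsLooseSubsequence(new[] {1, 3})` is true, whereas the contiguous search discussed in the rest of this thread reports false for the same input.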
Answer 1 (score: 4)
Similar to @dlev's, but this also handles {1,1,1,2}.ContainsSubsequence({1,1,2}):
    public static bool ContainsSubsequence<T>(this IEnumerable<T> parent, IEnumerable<T> target)
    {
        var pattern = target.ToArray();
        var source = new LinkedList<T>();
        foreach (var element in parent)
        {
            source.AddLast(element);
            if (source.Count == pattern.Length)
            {
                if (source.SequenceEqual(pattern))
                    return true;
                source.RemoveFirst();
            }
        }
        return false;
    }
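As written, the sliding-window method above returns false for an empty target, because the window count is never zero after an element has been added. A variant that treats an empty target as trivially contained and accepts a custom comparer might look like the sketch below; `ContainsWindow` and the `comparer` parameter are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowSearchSketch
{
    // Hypothetical variant of the sliding-window search: keeps the last
    // pattern.Length elements of parent and compares them to the pattern.
    public static bool ContainsWindow<T>(
        this IEnumerable<T> parent,
        IEnumerable<T> target,
        IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        T[] pattern = target.ToArray();
        if (pattern.Length == 0)
            return true; // assumption: an empty target counts as contained

        var window = new Queue<T>(pattern.Length);
        foreach (T element in parent)
        {
            window.Enqueue(element);
            if (window.Count < pattern.Length)
                continue; // window not full yet

            if (window.SequenceEqual(pattern, comparer))
                return true;

            window.Dequeue(); // slide the window forward by one element
        }
        return false;
    }
}
```

Each window comparison costs up to the pattern length, so the worst case is proportional to parent length times pattern length. Only the pattern and one window are held in memory, so the parent can be a long stream.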
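Answer 2 (score: 1)
You could try something like this to get you started. Once the list is converted to a string, you can find the sequence with a substring search:
    // `sequence` is an assumed name for the list being searched for
    if (String.Join(",", numericList.ConvertAll<string>(x => x.ToString()).ToArray())
            .Contains(String.Join(",", sequence)))
    {
        //get sequence
    }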
Answer 3 (score: 1)
If you are dealing with simple serializable types, you can do this quite easily by converting the lists to strings:
    public static bool ContainsList<T>(this List<T> containingList, List<T> containedList)
    {
        string strContaining = "," + string.Join(",", containingList) + ",";
        string strContained = "," + string.Join(",", containedList) + ",";
        return strContaining.Contains(strContained);
    }
Note that this is an extension method, so you can call it like this:
    if (bigList.ContainsList(smallList))
    {
        ...
    }
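The leading and trailing commas in ContainsList matter: without them, a substring search can match across element boundaries. The sketch below contrasts the two approaches for int lists; `NaiveContains` and `FramedContains` are hypothetical names used only for this comparison.

```csharp
using System;
using System.Collections.Generic;

public static class DelimiterDemo
{
    // Joins without surrounding delimiters, so "1,2" is found inside "11,2".
    public static bool NaiveContains(List<int> big, List<int> small) =>
        string.Join(",", big).Contains(string.Join(",", small));

    // Same framing as ContainsList: every element is bounded by commas.
    public static bool FramedContains(List<int> big, List<int> small) =>
        ("," + string.Join(",", big) + ",").Contains("," + string.Join(",", small) + ",");
}
```

For `big = {11, 2}` and `small = {1, 2}` the naive version reports a match and the framed version does not. The framed version can still be fooled when an element's string form itself contains a comma, which is presumably why the answer restricts itself to simple types.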
Answer 4 (score: 0)
This worked for me:
    var a1 = new List<int> { 1, 2, 3, 4, 5 };
    var a2 = new List<int> { 2, 3, 4 };
    int index = -1;
    bool res = a2.All(
        x => index != -1 ? (++index == a1.IndexOf(x)) : ((index = a1.IndexOf(x)) != -1)
    );
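One thing to check before reusing the expression above: `IndexOf` always returns the first occurrence of a value, so the result may be wrong when the larger list contains duplicates. The sketch below wraps the same expression in a hypothetical helper named `ContainsRun` so that both cases can be tried side by side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexOfCaveat
{
    // The expression from the answer, wrapped in a hypothetical helper.
    public static bool ContainsRun(List<int> a1, List<int> a2)
    {
        int index = -1;
        return a2.All(
            x => index != -1
                ? (++index == a1.IndexOf(x))           // later elements: must sit at the next index
                : ((index = a1.IndexOf(x)) != -1));    // first element: locate its first occurrence
    }
}
```

`ContainsRun` returns true for `{1, 2, 3, 4, 5}` and `{2, 3, 4}`, but false for `{1, 2, 1, 2, 3}` and `{1, 2, 3}`, even though the second list does appear contiguously there, because `IndexOf(3)` is compared against a position counted from the first `1`.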
Answer 5 (score: 0)
This function uses some LINQ. It checks whether the List parent contains the List target:
    public static bool ContainsSequence<T>(this List<T> parent, List<T> target)
    {
        // Try every position in parent where the first target element occurs
        for (int fromElement = parent.IndexOf(target.First());
            (fromElement != -1) && (fromElement <= parent.Count - target.Count);
            fromElement = parent.FindIndex(fromElement + 1, p => p.Equals(target.First())))
        {
            var comparedSequence = parent.Skip(fromElement).Take(target.Count);
            if (comparedSequence.SequenceEqual(target)) return true;
        }
        return false;
    }
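Answer 6 (score: 0)
This is a well-studied problem and, according to my research, there are two algorithms that are optimal for this job, depending on your data.
Namely, they are the Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm.
Here, I submit my implementation of the KMP algorithm, originally reviewed here.
It is written to handle a source or parent sequence of indeterminate length, up to Int64.MaxValue elements.
As we can see, the internal implementation returns a sequence of indices at which the sub-string or target pattern occurs. You can present these results through your choice of facade.
You could use it simply like this:
    var contains = new[] { 1, 3, 2, 3, 4, 3 }.Contains(new[] { 1, 3, 2, 3 });
Here is a working fiddle that shows the code in action.
The fully commented code for my answer follows below.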
    namespace Code
    {
        using System;
        using System.Collections.Generic;
        using System.Linq;

        /// <summary>
        /// A generic implementation of the Knuth-Morris-Pratt algorithm that searches,
        /// in a memory efficient way, over a given <see cref="IEnumerable{T}"/>.
        /// </summary>
        public static class KMP
        {
            /// <summary>
            /// Determines whether a sequence contains the search string.
            /// </summary>
            /// <typeparam name="T">
            /// The type of elements of <paramref name="source"/>
            /// </typeparam>
            /// <param name="source">
            /// A sequence of elements
            /// </param>
            /// <param name="pattern">The search string.</param>
            /// <param name="equalityComparer">
            /// Determines whether the sequence contains a specified element.
            /// If <c>null</c>
            /// <see cref="EqualityComparer{T}.Default"/> will be used.
            /// </param>
            /// <returns>
            /// <c>true</c> if the source contains the specified pattern;
            /// otherwise, <c>false</c>.
            /// </returns>
            /// <exception cref="ArgumentNullException">pattern</exception>
            public static bool Contains<T>(
                this IEnumerable<T> source,
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer = null)
            {
                if (pattern == null)
                {
                    throw new ArgumentNullException(nameof(pattern));
                }
                equalityComparer = equalityComparer ?? EqualityComparer<T>.Default;
                return SearchImplementation(source, pattern, equalityComparer).Any();
            }

            /// <summary>
            /// Returns the indices in the source at which the pattern starts.
            /// </summary>
            /// <exception cref="ArgumentNullException">pattern</exception>
            public static IEnumerable<long> IndicesOf<T>(
                this IEnumerable<T> source,
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer = null)
            {
                if (pattern == null)
                {
                    throw new ArgumentNullException(nameof(pattern));
                }
                equalityComparer = equalityComparer ?? EqualityComparer<T>.Default;
                return SearchImplementation(source, pattern, equalityComparer);
            }

            /// <summary>
            /// Identifies indices of a pattern string in a given sequence.
            /// </summary>
            /// <typeparam name="T">
            /// The type of elements of <paramref name="source"/>
            /// </typeparam>
            /// <param name="source">
            /// The sequence to search.
            /// </param>
            /// <param name="patternString">
            /// The string to find in the sequence.
            /// </param>
            /// <param name="equalityComparer">
            /// Determines whether the sequence contains a specified element.
            /// </param>
            /// <returns>
            /// A sequence of indices where the pattern can be found
            /// in the source.
            /// </returns>
            /// <exception cref="ArgumentOutOfRangeException">
            /// patternString - The pattern must contain 1 or more elements.
            /// </exception>
            private static IEnumerable<long> SearchImplementation<T>(
                IEnumerable<T> source,
                IEnumerable<T> patternString,
                IEqualityComparer<T> equalityComparer)
            {
                // Pre-process the pattern
                (var slide, var pattern) = GetSlide(patternString, equalityComparer);
                var patternLength = pattern.Count;
                if (patternLength == 0)
                {
                    throw new ArgumentOutOfRangeException(
                        nameof(patternString),
                        "The pattern must contain 1 or more elements.");
                }
                var buffer = new Dictionary<long, T>(patternLength);
                var more = true;
                long sourceIndex = 0; // index for source
                int patternIndex = 0; // index for pattern
                using (var sourceEnumerator = source.GetEnumerator())
                {
                    while (more)
                    {
                        more = FillBuffer(
                            buffer,
                            sourceEnumerator,
                            sourceIndex,
                            patternLength,
                            out T t);
                        // Only compare when an element was actually read; once the
                        // source is exhausted, t is default(T) and could otherwise
                        // match a default-valued pattern element by accident.
                        if (more && equalityComparer.Equals(pattern[patternIndex], t))
                        {
                            patternIndex++;
                            sourceIndex++;
                            more = FillBuffer(
                                buffer,
                                sourceEnumerator,
                                sourceIndex,
                                patternLength,
                                out t);
                        }
                        if (patternIndex == patternLength)
                        {
                            yield return sourceIndex - patternIndex;
                            patternIndex = slide[patternIndex - 1];
                        }
                        else if (more && !equalityComparer.Equals(pattern[patternIndex], t))
                        {
                            if (patternIndex != 0)
                            {
                                patternIndex = slide[patternIndex - 1];
                            }
                            else
                            {
                                sourceIndex = sourceIndex + 1;
                            }
                        }
                    }
                }
            }

            /// <summary>
            /// Services the buffer and retrieves the value.
            /// </summary>
            /// <remarks>
            /// The buffer is used so that it is not necessary to hold the
            /// entire source in memory.
            /// </remarks>
            /// <typeparam name="T">
            /// The type of elements of <paramref name="source"/>.
            /// </typeparam>
            /// <param name="buffer">The buffer.</param>
            /// <param name="source">The source enumerator.</param>
            /// <param name="sourceIndex">The element index to retrieve.</param>
            /// <param name="patternLength">Length of the search string.</param>
            /// <param name="value">The element value retrieved from the source.</param>
            /// <returns>
            /// <c>true</c> if there is potentially more data to process;
            /// otherwise <c>false</c>.
            /// </returns>
            private static bool FillBuffer<T>(
                IDictionary<long, T> buffer,
                IEnumerator<T> source,
                long sourceIndex,
                int patternLength,
                out T value)
            {
                bool more = true;
                if (!buffer.TryGetValue(sourceIndex, out value))
                {
                    more = source.MoveNext();
                    if (more)
                    {
                        value = source.Current;
                        buffer.Remove(sourceIndex - patternLength);
                        buffer.Add(sourceIndex, value);
                    }
                }
                return more;
            }

            /// <summary>
            /// Gets the offset array which acts as a slide rule for the KMP algorithm.
            /// </summary>
            /// <typeparam name="T">
            /// The type of elements of <paramref name="pattern"/>.
            /// </typeparam>
            /// <param name="pattern">The search string.</param>
            /// <param name="equalityComparer">
            /// Determines whether two pattern elements are equal.
            /// </param>
            /// <returns>A tuple of the offsets and the enumerated pattern.</returns>
            private static (IReadOnlyList<int> Slide, IReadOnlyList<T> Pattern) GetSlide<T>(
                IEnumerable<T> pattern,
                IEqualityComparer<T> equalityComparer)
            {
                var patternList = pattern.ToList();
                var slide = new int[patternList.Count];
                int length = 0;
                int patternIndex = 1;
                while (patternIndex < patternList.Count)
                {
                    if (equalityComparer.Equals(
                        patternList[patternIndex],
                        patternList[length]))
                    {
                        length++;
                        slide[patternIndex] = length;
                        patternIndex++;
                    }
                    else
                    {
                        if (length != 0)
                        {
                            length = slide[length - 1];
                        }
                        else
                        {
                            slide[patternIndex] = length;
                            patternIndex++;
                        }
                    }
                }
                return (slide, patternList);
            }
        }
    }
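To make the "slide" table less abstract, the sketch below computes the same kind of table for a small pattern in isolation. It is a stand-alone illustration, not part of the answer's code: `PrefixTableSketch` and `BuildTable` are hypothetical names, and the loop follows the standard KMP prefix-function construction that GetSlide appears to use.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixTableSketch
{
    // Hypothetical stand-alone helper: table[i] is the length of the longest
    // proper prefix of pattern[0..i] that is also a suffix of pattern[0..i].
    public static int[] BuildTable<T>(
        IReadOnlyList<T> pattern,
        IEqualityComparer<T> comparer = null)
    {
        comparer = comparer ?? EqualityComparer<T>.Default;
        var table = new int[pattern.Count];
        int length = 0; // length of the prefix currently being extended
        int i = 1;      // table[0] is always 0
        while (i < pattern.Count)
        {
            if (comparer.Equals(pattern[i], pattern[length]))
            {
                // The prefix grows by one element.
                length++;
                table[i] = length;
                i++;
            }
            else if (length != 0)
            {
                // Fall back to the next shorter candidate prefix.
                length = table[length - 1];
            }
            else
            {
                // No prefix matches at this position.
                table[i] = 0;
                i++;
            }
        }
        return table;
    }
}
```

For the pattern `{1, 3, 1, 3, 2}` this yields `{0, 0, 1, 2, 0}`: after matching `1, 3, 1, 3` and then failing on the fifth element, the search can resume with two pattern elements already matched instead of starting over. That is the property that lets KMP avoid re-reading the source.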