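I have several sequences, as follows.
I keep them in a list of string lists, List<List<String>> Sequences;
I want to merge them so that any sequence covered by another sequence is removed. For example, the sequence V VPC VPS S
is covered by the sequence V MV VPC VPC VPS VPA S,
because the latter contains all elements of the former in the same order (this example is not in the list above).
I think there should be a simple solution with Linq, but I can't figure it out.
My approach is to iterate over the sequences and, for each one, check whether some sequence's intersection with it equals the sequence itself in the same order; if so, remove it, like this: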
    foreach (var item in Sequences)
    {
        if (Sequences.Any(x => x.Intersect(item).SequenceEqual(item)))
        {
            Sequences.Remove(item);
        }
    }
Answer 0: (score: 3)
If the order does matter:
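    // Returns true if every element of subseq appears in superseq in the same
    // relative order (not necessarily contiguously).
    bool IsSubsequence<T>(IEnumerable<T> subseq, IEnumerable<T> superseq)
        where T : IEquatable<T>
    {
        using (var subit = subseq.GetEnumerator())
        {
            if (!subit.MoveNext()) return true; // Empty subseq -> true
            foreach (var superitem in superseq)
            {
                // Advance through subseq only when the current element matches.
                if (superitem.Equals(subit.Current))
                {
                    if (!subit.MoveNext()) return true; // All of subseq matched
                }
            }
            return false;
        }
    }

    // Keeps only the lists that are not a subsequence of some other list.
    List<List<T>> PruneSequences<T>(List<List<T>> lists)
        where T : IEquatable<T>
    {
        return lists
            .Where(sublist =>
                !lists.Any(superlist =>
                    sublist != superlist && // reference comparison: skip the list itself
                    IsSubsequence(sublist, superlist)))
            .ToList();
    }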
Usage:
    var Sequences = new List<List<string>> {
        new List<string> { "N", "MN", "MN", "S" },
        new List<string> { "PUNC" },
        new List<string> { "N" },
        new List<string> { "V", "VPC", "VPS", "S" },
        new List<string> { "N", "NPC" },
        new List<string> { "N", "MN" },
        new List<string> { "N", "NPA" },
        new List<string> { "ADJ" },
        new List<string> { "V", "MV", "VPC", "VPC", "VPSD", "VPA", "S" },
        new List<string> { "PREP", "PPC", "PPC" },
        new List<string> { "PRONC", "NPC" },
        new List<string> { "JONJ", "CPC", "CPC", "VPC", "VPSD", "CLR" },
        new List<string> { "CONJ" },
        new List<string> { "AUX" },
        new List<string> { "V", "MV", "VPC" },
        new List<string> { "N", "NPA", "NPC", "NPC" }
    };
    var PrunedSequences = PruneSequences(Sequences);
Result:
N MN MN S
PUNC
V VPC VPS S
ADJ
V MV VPC VPC VPSD VPA S
PREP PPC PPC
PRONC NPC
JONJ CPC CPC VPC VPSD CLR
CONJ
AUX
N NPA NPC NPC
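One caveat worth checking against your data: PruneSequences skips a list only when it is the same object (sublist != superlist is a reference comparison), so two separate lists with identical contents count as subsequences of each other and both get dropped. Below is a minimal sketch of one way around that, assuming you want to keep the first copy. The name PruneKeepingOneCopy and the index-based tie-break are hypothetical, introduced only for illustration, and the snippet assumes C# 9 top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same greedy subsequence test as in the answer, repeated so the sketch is self-contained.
static bool IsSubsequence<T>(IEnumerable<T> subseq, IEnumerable<T> superseq)
    where T : IEquatable<T>
{
    using (var subit = subseq.GetEnumerator())
    {
        if (!subit.MoveNext()) return true; // Empty subseq -> true
        foreach (var superitem in superseq)
        {
            if (superitem.Equals(subit.Current) && !subit.MoveNext()) return true;
        }
        return false;
    }
}

// Hypothetical variant: a list is dropped if a strictly longer list covers it,
// or if an earlier list (lower index) has identical contents.
static List<List<T>> PruneKeepingOneCopy<T>(List<List<T>> lists)
    where T : IEquatable<T>
{
    return lists
        .Where((sub, i) => !lists.Where((sup, j) =>
            i != j &&
            IsSubsequence(sub, sup) &&
            (sup.Count > sub.Count || j < i)).Any())
        .ToList();
}

// Made-up sample: two identical lists plus one list covered by both.
var demo = new List<List<string>> {
    new List<string> { "N", "MN" },
    new List<string> { "N", "MN" },
    new List<string> { "N" }
};
var pruned = PruneKeepingOneCopy(demo);
Console.WriteLine(pruned.Count); // 1
```

A covering list of equal length must have identical contents, which is why the length check is enough to tell a true superset from a duplicate.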
Answer 1: (score: 1)
Sequences.Where(i=>!Sequences.Any(x => ReferenceEquals(i,x) == false && x.Intersect(i).SequenceEqual(i)));
Could your solution be failing because it tests each item against itself?
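There is a separate thing to watch for with the Intersect approach. Enumerable.Intersect is a set operation and yields each common element only once, so an item that contains repeated elements (such as N MN MN S) can never be SequenceEqual to an intersection. It will therefore never be treated as covered, even by a list that contains it whole. Here is a small check with made-up sample lists, assuming C# 9 top-level statements:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var item = new List<string> { "N", "MN", "MN", "S" };
var longer = new List<string> { "X", "N", "MN", "MN", "S" };

// Intersect yields each common element once, in the order of the first sequence.
var common = longer.Intersect(item).ToList();
Console.WriteLine(string.Join(" ", common)); // N MN S

// The duplicate "MN" is gone, so the comparison fails although longer contains item.
bool coveredByIntersect = common.SequenceEqual(item);
Console.WriteLine(coveredByIntersect); // False
```

If your sequences can contain repeated elements, an element-by-element subsequence check is a safer test than Intersect.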