Say I have a list of tuples like this:
List<Tuple<string, string>> conflicts = new List<Tuple<string, string>>();
conflicts.Add(new Tuple<string, string>("Maths", "English"));
conflicts.Add(new Tuple<string, string>("Science", "French"));
conflicts.Add(new Tuple<string, string>("French", "Science"));
conflicts.Add(new Tuple<string, string>("English", "Maths"));
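I want to find the reverse duplicates in the list of tuples and remove them. How would I go about doing this with a loop?
Note: by reverse duplicates I mean pairs such as "English", "Maths" and "Maths", "English".
Note: the tuples in my real code are populated using a SqlDataReader, but the example above is pretty close to how the data is laid out.
This seems like it should be really simple, but it has had me stuck all night.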
Answer 0 (score: 5)
Use a custom IEqualityComparer:
public class TupleComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        return (x.Item1 == y.Item1 && x.Item2 == y.Item2) ||
               (x.Item1 == y.Item2 && x.Item2 == y.Item1);
    }
    public int GetHashCode(Tuple<string, string> obj)
    {
        return string.Concat(new string[] { obj.Item1, obj.Item2 }.OrderBy(x => x)).GetHashCode();
        //or
        //return (string.Compare(obj.Item1, obj.Item2) < 0 ? obj.Item1 + obj.Item2 : obj.Item2 + obj.Item1).GetHashCode();
    }
}
You can then use a HashSet<Tuple<string, string>> instead of a List<Tuple<string, string>>:
var conflicts = new HashSet<Tuple<string, string>>(new TupleComparer());
conflicts.Add(new Tuple<string, string>("Maths", "English"));
conflicts.Add(new Tuple<string, string>("Science", "French"));
conflicts.Add(new Tuple<string, string>("French", "Science"));
conflicts.Add(new Tuple<string, string>("English", "Maths"));
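If the data is already sitting in a List, the same comparer can probably be handed to LINQ's Distinct instead of rebuilding the collection as a HashSet. The following is only a sketch: the comparer is repeated so the example compiles on its own (using the ordinal variant of the hash), and the names ComparerDemo and Dedupe are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from the answer above so that this sketch is self-contained.
public class TupleComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        return (x.Item1 == y.Item1 && x.Item2 == y.Item2) ||
               (x.Item1 == y.Item2 && x.Item2 == y.Item1);
    }

    public int GetHashCode(Tuple<string, string> obj)
    {
        // Both orientations of a pair produce the same concatenation,
        // so they land in the same hash bucket.
        return (string.CompareOrdinal(obj.Item1, obj.Item2) < 0
            ? obj.Item1 + obj.Item2
            : obj.Item2 + obj.Item1).GetHashCode();
    }
}

// Hypothetical helper class, not part of the original answer.
public static class ComparerDemo
{
    // Returns the pairs with reverse duplicates collapsed into one entry.
    public static List<Tuple<string, string>> Dedupe(IEnumerable<Tuple<string, string>> pairs)
    {
        return pairs.Distinct(new TupleComparer()).ToList();
    }

    public static void Main()
    {
        var conflicts = new List<Tuple<string, string>>
        {
            Tuple.Create("Maths", "English"),
            Tuple.Create("Science", "French"),
            Tuple.Create("French", "Science"),
            Tuple.Create("English", "Maths")
        };

        foreach (var t in Dedupe(conflicts))
            Console.WriteLine(t.Item1 + "," + t.Item2);
    }
}
```

With the sample data this leaves two pairs. Which orientation of each pair survives depends on how Distinct enumerates, so it is safer not to rely on it.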
Answer 1 (score: 4)
List<Tuple<string, string>> conflicts = new List<Tuple<string, string>>();
List<Tuple<string, string>> noConflicts = new List<Tuple<string, string>>();
conflicts.Add(new Tuple<string, string>("Maths", "English"));
conflicts.Add(new Tuple<string, string>("Science", "French"));
conflicts.Add(new Tuple<string, string>("French", "Science"));
conflicts.Add(new Tuple<string, string>("English", "Maths"));
foreach (Tuple<string, string> t in conflicts)
{
    if (!noConflicts.Contains(t) && !noConflicts.Contains(new Tuple<string, string>(t.Item2, t.Item1)))
        noConflicts.Add(t);
}
foreach (Tuple<string, string> t in noConflicts)
    Console.WriteLine(t.Item1 + "," + t.Item2);
I'm sure there is a better way, but it works.
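The two Contains calls scan noConflicts once per element, so the loop is quadratic in the size of the list. For larger inputs, a single pass that remembers a normalized key for every pair seen so far should give the same result. This is a sketch only, assuming ordinal string comparison is acceptable; LoopDedupe and RemoveReversed are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of the original answer.
public static class LoopDedupe
{
    public static List<Tuple<string, string>> RemoveReversed(List<Tuple<string, string>> conflicts)
    {
        // Tuple<,> compares structurally, so a plain HashSet is enough for the keys.
        var seen = new HashSet<Tuple<string, string>>();
        var result = new List<Tuple<string, string>>();

        foreach (var t in conflicts)
        {
            // Normalize the pair so that (A, B) and (B, A) map to the same key.
            var key = string.CompareOrdinal(t.Item1, t.Item2) <= 0
                ? t
                : Tuple.Create(t.Item2, t.Item1);

            // Add returns false when the key is already present.
            if (seen.Add(key))
                result.Add(t);
        }

        return result;
    }

    public static void Main()
    {
        var conflicts = new List<Tuple<string, string>>
        {
            Tuple.Create("Maths", "English"),
            Tuple.Create("Science", "French"),
            Tuple.Create("French", "Science"),
            Tuple.Create("English", "Maths")
        };

        foreach (var t in RemoveReversed(conflicts))
            Console.WriteLine(t.Item1 + "," + t.Item2);
    }
}
```

Because the loop walks the list in order and keeps the first orientation it meets, the sample data yields Maths,English followed by Science,French.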
Answer 2 (score: 3)
A fairly crude implementation:
var distinct =
    conflicts
        .GroupBy(
            x =>
            {
                var ordered = new[] { x.Item1, x.Item2 }.OrderBy(i => i);
                return
                    new
                    {
                        Item1 = ordered.First(),
                        Item2 = ordered.Last(),
                    };
            })
        .Distinct()
        .Select(g => g.First())
        .Dump(); // Dump() is a LINQPad extension method that displays the result
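It orders the items within each tuple so that Maths,English and English,Maths become the same, puts them into an anonymous type (again with members called Item1 and Item2), relies on the structural equality of anonymous types to perform the distinct, and then I just pull the first tuple out of each group.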
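Answer 3 (score: 1)
The problem is that you are misusing Tuple<T,Y>. If { "Math", "Science" } and { "Science", "Math" } are interchangeable, then they are not pairs. You are using it more like a string[2]. In a Dictionary, for example, the Tuple<TKey,TValue> are meaningful, separate things that have a proper paired relationship, not just a list of data.
Try using a List<List<string>>, which is more representative of what your data actually is and gives you access to helpful List<T> answers such as this one. Or indeed a List<Conflict>, where Conflict contains a List in which order does not matter for equality.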
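One way the Conflict type suggested here might look is sketched below. It departs from the answer in one respect, which is my own assumption: the subjects are stored in a HashSet<string> rather than a List, because SetEquals gives order-independent comparison for free. The member names are hypothetical, and the sketch assumes the subject strings are non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: a conflict is an unordered set of subjects.
public sealed class Conflict : IEquatable<Conflict>
{
    private readonly HashSet<string> subjects;

    public Conflict(params string[] subjects)
    {
        this.subjects = new HashSet<string>(subjects);
    }

    public IEnumerable<string> Subjects
    {
        get { return subjects; }
    }

    public bool Equals(Conflict other)
    {
        // Two conflicts are equal when they contain the same subjects, in any order.
        return other != null && subjects.SetEquals(other.subjects);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Conflict);
    }

    public override int GetHashCode()
    {
        // XOR is commutative, so the hash does not depend on enumeration order.
        int hash = 0;
        foreach (var s in subjects)
            hash ^= s.GetHashCode();
        return hash;
    }
}

public static class ConflictDemo // hypothetical name
{
    public static void Main()
    {
        var conflicts = new List<Conflict>
        {
            new Conflict("Maths", "English"),
            new Conflict("Science", "French"),
            new Conflict("French", "Science"),
            new Conflict("English", "Maths")
        };

        // Distinct uses the default comparer, which picks up IEquatable<Conflict>.
        foreach (var c in conflicts.Distinct())
            Console.WriteLine(string.Join(",", c.Subjects));
    }
}
```

With equality defined on the type itself, the deduplication no longer needs a custom comparer or any normalization at the call site.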
Answer 4 (score: 1)
A LINQ one-liner. Gotta love it.
var noConflicts = conflicts.Select(c => new HashSet<string>() { c.Item1, c.Item2})
.Distinct(HashSet<string>.CreateSetComparer())
.Select(h => new Tuple<string, string>(h.First(), h.Last()));
This works by feeding everything into a HashSet<T>, which has a CreateSetComparer() method that allows Distinct() to do its job regardless of order.
Answer 5 (score: 0)
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
    public static void Main()
    {
        var conflicts = new List<Tuple<string, string>>();
        conflicts.Add(new Tuple<string, string>("Maths", "English"));
        conflicts.Add(new Tuple<string, string>("Science", "French"));
        conflicts.Add(new Tuple<string, string>("French", "Science"));
        conflicts.Add(new Tuple<string, string>("English", "Maths"));
        RemoveDupes(conflicts);
        foreach (var i in conflicts) Console.WriteLine(i.Item1 + " " + i.Item2);
    }
    public static void RemoveDupes(List<Tuple<string, string>> collection)
    {
        var duplicates = collection
            // normalize each pair so it no longer matters which value comes first
            .Select((x, i) => new { Item = new Tuple<string, string>(
                    x.Item2.IsGreaterThan(x.Item1) ? x.Item2 : x.Item1,
                    x.Item2.IsGreaterThan(x.Item1) ? x.Item1 : x.Item2), Index = i })
            // group on the normalized values
            .GroupBy(x => x.Item)
            // find duplicates
            .Where(x => x.Count() > 1)
            .Select(x => new { Items = x, Count = x.Count() })
            // select all indexes but the first in each group
            .SelectMany(x =>
                x.Items.Select(b => b)
                    .Zip(Enumerable.Range(1, x.Count),
                        (j, i) => new { Item = j, RowNumber = i }
                    )
            ).Where(x => x.RowNumber != 1)
            // remove from the highest index down so earlier removals do not shift later ones
            .Select(x => x.Item.Index)
            .OrderByDescending(i => i)
            .ToList();
        foreach (var index in duplicates)
        {
            collection.RemoveAt(index);
        }
    }
}
public static class Ext
{
    public static bool IsGreaterThan(this string val, string compare)
    {
        // CompareTo only guarantees the sign of its result, not the value 1
        return val.CompareTo(compare) > 0;
    }
}
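Answer 6 (score: 0)
The best way to avoid the AB/BA ambiguity is to use a data model that does not allow it. You can achieve this by imposing a constraint, which is a widely used approach in databases. If we say that the tuple is ordered, no ambiguity can arise: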
public class Ordered2StrTuple : Tuple<string, string>
{
    public Ordered2StrTuple(string a, string b)
        : this(a, b, String.CompareOrdinal(a, b))
    { }
    private Ordered2StrTuple(string a, string b, int cmp)
        : base(cmp > 0 ? b : a, cmp > 0 ? a : b)
    { }
}
Now the task is quite simple:
var noConflicts = conflicts
.Select(s => new Ordered2StrTuple(s.Item1, s.Item2))
.Distinct();
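The comparison used for ordering has to be consistent with Equals, so I have removed the generic version I had here. If you only need to deduplicate once, you can do it like this: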
var noConflicts = conflicts.Select(t =>
String.CompareOrdinal(t.Item1, t.Item2) > 0 ? new Tuple<string, string>(t.Item2, t.Item1) : t
).Distinct();