Code to collapse duplicate and near-duplicate records?

Asked: 2013-05-21 16:47:38

Tags: c# linq distinct

I have a list of models of this type:

public class TourDude {
    public int Id { get; set; }
    public string Name { get; set; }
}

Here is my list:

    public IEnumerable<TourDude> GetAllGuides {
        get {
            List<TourDude> guides = new List<TourDude>();
            guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
            guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial", Id = 3 });
            return guides;

        }
    }

I want to retrieve these records:

{ Name = "Dave Et", Id = 1 } 
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }

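The goal is mainly to collapse duplicates and near-duplicates (identified by Id), taking the shortest value (when compared) as the name.

Where do I start? Is there a single LINQ query that can do this for me? Do I need to write an equality comparer?

Edit 1: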

        var result = from x in GetAllGuides
                     group x.Name by x.Id into g
                     select new TourDude {
                         Test = Exts.LongestCommonPrefix(g),
                         Id = g.Key,
                     };

        IEnumerable<IEnumerable<char>> test = result.First().Test;

        string str = test.First().ToString();

2 answers:

Answer 0 (score: 3)

If you want to group the items by Id and then find the longest common prefix of the Name values within each group, you can do it as follows:

var result = from x in guides
             group x.Name by x.Id into g
             select new TourDude
             {
                 Name = LongestCommonPrefix(g),
                 Id = g.Key,
             };

This uses the algorithm from here to find the longest common prefix.

Result:

{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }

static string LongestCommonPrefix(IEnumerable<string> xs)
{
    return new string(xs
        .Transpose()
        .TakeWhile(s => s.All(d => d == s.First()))
        .Select(s => s.First())
        .ToArray());
}
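
Note that Transpose is not a built-in LINQ operator, so the method above only compiles if such an extension method is defined somewhere. Below is a minimal sketch of one possible helper, assuming it should stop as soon as the shortest input sequence runs out. The class name Exts is borrowed from the question's edit; the implementation is an illustration and not necessarily the one from the linked answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Exts
{
    // Hypothetical helper: turns rows into columns, stopping at the shortest row.
    public static IEnumerable<IEnumerable<T>> Transpose<T>(
        this IEnumerable<IEnumerable<T>> source)
    {
        var enumerators = source.Select(row => row.GetEnumerator()).ToArray();
        try
        {
            // Advance every row by one element; stop once any row is exhausted.
            // The Length check avoids looping forever on an empty input.
            while (enumerators.Length > 0 && enumerators.All(e => e.MoveNext()))
            {
                yield return enumerators.Select(e => e.Current).ToArray();
            }
        }
        finally
        {
            foreach (var e in enumerators) e.Dispose();
        }
    }

    // Same logic as the answer's method: keep columns while all characters match.
    public static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        return new string(xs
            .Transpose()
            .TakeWhile(col => col.All(c => c == col.First()))
            .Select(col => col.First())
            .ToArray());
    }
}
```

If the trailing space in "Danial Maze " is unwanted, calling TrimEnd() on the returned string should remove it.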

Answer 1 (score: 2)

I was able to achieve this by grouping the records on Id and then selecting the first record from each group, ordered by name length:

var result = GetAllGuides.GroupBy(td => td.Id)
    .Select(g => g.OrderBy(td => td.Name.Length).First());

foreach (var dude in result)
{
    Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}
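
Keep in mind that this picks the shortest existing name per Id, so for the sample data Id 2 comes out as "Danial Maze A" rather than the "Danial Maze" listed in the question. Both names for Id 2 have the same length, and because OrderBy in LINQ to Objects is a stable sort, the one that appears first in the list wins. A self-contained sketch of the same idea, where the class GuideCollapser and the method ShortestNamePerId are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class GuideCollapser
{
    // Hypothetical helper: keeps, for each Id, the record with the shortest Name.
    // Groups come out in order of first appearance of each Id.
    public static List<TourDude> ShortestNamePerId(IEnumerable<TourDude> guides)
    {
        return guides
            .GroupBy(td => td.Id)
            .Select(g => g.OrderBy(td => td.Name.Length).First())
            .ToList();
    }
}
```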