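I have a HashSet<T> of hierarchical objects in C# 4.0. The primary key is an int, but occasionally the secondary key is duplicated. I want to merge entries that share a secondary key. In this example, the secondary key is Name: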
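struct Element
{
    // Fields are public so the rest of the code can read and initialize them.
    public int ID;
    public string Name;
    public List<int> Children;
    public List<int> Parents;
    // Hash on the primary key only.
    public override int GetHashCode()
    {
        return ID;
    }
}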
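HashSet<Element> Elements = new HashSet<Element>();
// Example elements
Elements.Add(new Element { ID = 1, Name = "Apple", Children = new List<int> { 10, 11, 12 }, Parents = new List<int> { 13, 14, 15 } });
Elements.Add(new Element { ID = 2, Name = "Banana", Children = new List<int> { 20, 21, 22 }, Parents = new List<int> { 23, 24, 25 } });
Elements.Add(new Element { ID = 3, Name = "Apple", Children = new List<int> { 30, 31, 32 }, Parents = new List<int> { 33, 34, 35 } });
Elements.Add(new Element { ID = 4, Name = "Food", Children = new List<int> { 1, 2, 3 }, Parents = new List<int>() });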
The goal is to remove the third entry { 3, "Apple", ... }, then update and merge the Parents and Children references in the remaining elements. The final result should be:
{ 1, "Apple", Children = { 10, 11, 12, 30, 31, 32 }, Parents = { 13,14,15, 33, 34, 35 }}
{ 2, "Banana", Children = { 20, 21, 22 }, Parents = { 23,24,25 }}
{ 4, "Food", Children = {1, 2}, Parents = {} }
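This is what I have so far, but I can't work out the best way to update the HashSet. I first copy the HashSet so that I can remove items while iterating. Then I find the duplicates. If there are duplicates, I want to update the references and remove the duplicates from the copy. This is where I'm stuck. Once the duplicates are handled, I want to remove them and use a skip list to keep them from being processed again: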
var copy = new HashSet<Element>(Elements);
HashSet<int> skip = new HashSet<int>();
foreach (var e in Elements)
{
    if (!skip.Contains(e.ID))
    {
        var duplicates = Elements.Where(x => e.Name == x.Name && e.ID != x.ID).ToList();
        if (duplicates.Any())
        {
            foreach (var d in duplicates)
            {
                // Iterate copy and update Parent and Children references
                // How do I do this part?
            }
            // Remove the duplicates from the copied list
            var duplicateIds = new HashSet<int>(duplicates.Select(d => d.ID));
            copy.RemoveWhere(x => duplicateIds.Contains(x.ID));
            // Don't process the duplicates again
            skip.UnionWith(duplicateIds);
        }
    }
}
return copy;
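I'm stuck. Also, is there a way to do this with LINQ?
Update: the list already arrives in this state, and I have no control over its initial contents. I suppose I could create a new wrapper with a better Add method that prevents duplicates.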
Answer 0 (score: 2)
Try adding this single field to the element.
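struct Element
{
    public int ID;
    public string Name;
    public List<int> Children;
    public List<int> Parents;
    public bool duplicate;
}
HashSet<Element> Elements = new HashSet<Element>();
// Example elements
Elements.Add(new Element { ID = 1, Name = "Apple", Children = new List<int> { 10, 11, 12 }, Parents = new List<int> { 13, 14, 15 }, duplicate = false });
Elements.Add(new Element { ID = 2, Name = "Banana", Children = new List<int> { 20, 21, 22 }, Parents = new List<int> { 23, 24, 25 }, duplicate = false });
Elements.Add(new Element { ID = 3, Name = "Apple", Children = new List<int> { 30, 31, 32 }, Parents = new List<int> { 33, 34, 35 }, duplicate = false });
Elements.Add(new Element { ID = 4, Name = "Food", Children = new List<int> { 1, 2, 3 }, Parents = new List<int>(), duplicate = false });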
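As you work through the duplicates, set "duplicate" to true. Or add a "deleted" field so you don't reprocess them. Whichever you prefer. The point is to add one more field. You can always copy an element and create a new one when adding.
To add to Sina's comment, you could have a key like this: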
class ElementKey {
    int ID;
    string Name;
}
class Element {
    ElementKey Key;
    List<int> Children;
    List<int> Parents;
    ProcessFlagSet flags;
}
class ProcessFlagSet {
    bool Processed;
    bool Duplicate;
}
Dictionary<ElementKey,Element> ...
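Later, you can strip all the members out of ProcessFlagSet, which makes refactoring easy: if you no longer need the flags, any code that still uses them will fail to compile until it is removed.
Finally, I'd suggest creating your own Add method here. Consider passing in the element to add and then checking whether its key already exists. That saves you a step.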
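One detail worth noting if a class like ElementKey is used as a Dictionary key: by default, classes compare by reference, so two keys with the same ID and Name would count as different keys. A minimal sketch of value-based equality follows. The constructor, the readonly fields, and the hash mixing constant are illustrative assumptions, not part of the answer above:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value-comparable version of ElementKey.
public sealed class ElementKey : IEquatable<ElementKey>
{
    public readonly int ID;
    public readonly string Name;

    public ElementKey(int id, string name)
    {
        ID = id;
        Name = name;
    }

    // Two keys are equal when both ID and Name match.
    public bool Equals(ElementKey other)
    {
        return other != null && ID == other.ID && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as ElementKey);
    }

    // Must agree with Equals: equal keys produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return (ID * 397) ^ (Name == null ? 0 : Name.GetHashCode());
        }
    }
}
```

With this in place, looking up `new ElementKey(1, "Apple")` finds an entry stored under a separately constructed key with the same ID and Name.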
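A minimal sketch of such an Add method, assuming a hypothetical wrapper class `ElementSet` (the name and its members are illustrative, not from the original post), non-null Names, and that the first element added under a given Name is the one that survives. `Element` is declared as a class here, rather than the struct in the question, so the stored instance can be updated in place:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type for this sketch.
public class Element
{
    public int ID;
    public string Name;
    public List<int> Children = new List<int>();
    public List<int> Parents = new List<int>();
}

// Hypothetical wrapper that merges elements sharing a Name as they are added.
public class ElementSet
{
    private readonly Dictionary<string, Element> byName = new Dictionary<string, Element>();
    // Maps the ID of each merged-away element to the ID that replaced it.
    private readonly Dictionary<int, int> replacedBy = new Dictionary<int, int>();

    public void Add(Element e)
    {
        Element existing;
        if (!byName.TryGetValue(e.Name, out existing))
        {
            byName.Add(e.Name, e);
            return;
        }
        // Name already present: fold the newcomer into the stored element.
        existing.Children.AddRange(e.Children);
        existing.Parents.AddRange(e.Parents);
        replacedBy[e.ID] = existing.ID;
    }

    // Redirects references to merged-away IDs, drops repeats, returns the elements.
    public List<Element> ToList()
    {
        foreach (Element e in byName.Values)
        {
            e.Children = e.Children.Select(id => Resolve(id)).Distinct().ToList();
            e.Parents = e.Parents.Select(id => Resolve(id)).Distinct().ToList();
        }
        return byName.Values.ToList();
    }

    private int Resolve(int id)
    {
        int target;
        return replacedBy.TryGetValue(id, out target) ? target : id;
    }
}
```

References are rewritten in `ToList` rather than in `Add` because an element that points at a duplicate may be added before the duplicate itself is seen.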
Answer 1 (score: 2)
You can try this:
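// Merge every group of same-named elements into one element with the lowest ID.
// ToList() materializes the result once so the merged lists are not rebuilt on each enumeration.
var temp = Elements.GroupBy(e => e.Name)
    .Select(g => new Element
    {
        ID = g.OrderBy(e => e.ID).First().ID,
        Name = g.Key,
        Children = g.SelectMany(e => e.Children).ToList(),
        Parents = g.SelectMany(e => e.Parents).ToList()
    })
    .ToList();
// IDs that were merged away.
var duplicates = Elements.Where(e => !temp.Any(t => t.ID == e.ID))
    .Select(e => e.ID)
    .Distinct()
    .ToList();
Elements = new HashSet<Element>(temp);
// Drop references to the removed IDs.
foreach (Element e in Elements)
{
    e.Children.RemoveAll(i => duplicates.Contains(i));
    e.Parents.RemoveAll(i => duplicates.Contains(i));
}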
As I understand it, you just need to group all elements by Name, then pick the lowest ID and combine the Children and Parents. That is what this query does.
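To check the grouping idea against the sample data from the question, here is a self-contained sketch. The `MergeDemo` class and its `MergeByName` and `Sample` methods are illustrative names, not from the original answer. Removed IDs are collected in a HashSet for quick lookup, and, as in the description above, references to removed IDs are dropped rather than redirected:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Element
{
    public int ID;
    public string Name;
    public List<int> Children;
    public List<int> Parents;
}

public static class MergeDemo
{
    // Groups by Name, keeps the lowest ID per group, and concatenates the lists.
    public static List<Element> MergeByName(IEnumerable<Element> source)
    {
        List<Element> all = source.ToList();
        List<Element> merged = all
            .GroupBy(e => e.Name)
            .Select(g => new Element
            {
                ID = g.Min(e => e.ID),
                Name = g.Key,
                Children = g.SelectMany(e => e.Children).ToList(),
                Parents = g.SelectMany(e => e.Parents).ToList()
            })
            .ToList();

        // IDs that no longer exist after the merge.
        HashSet<int> kept = new HashSet<int>(merged.Select(e => e.ID));
        HashSet<int> removed = new HashSet<int>(all.Select(e => e.ID).Where(id => !kept.Contains(id)));

        foreach (Element e in merged)
        {
            // e is a struct copy, but its lists are the same List<int> instances.
            e.Children.RemoveAll(id => removed.Contains(id));
            e.Parents.RemoveAll(id => removed.Contains(id));
        }
        return merged;
    }

    // The four sample elements from the question.
    public static List<Element> Sample()
    {
        return new List<Element>
        {
            new Element { ID = 1, Name = "Apple", Children = new List<int> { 10, 11, 12 }, Parents = new List<int> { 13, 14, 15 } },
            new Element { ID = 2, Name = "Banana", Children = new List<int> { 20, 21, 22 }, Parents = new List<int> { 23, 24, 25 } },
            new Element { ID = 3, Name = "Apple", Children = new List<int> { 30, 31, 32 }, Parents = new List<int> { 33, 34, 35 } },
            new Element { ID = 4, Name = "Food", Children = new List<int> { 1, 2, 3 }, Parents = new List<int>() }
        };
    }
}
```

Calling `MergeDemo.MergeByName(MergeDemo.Sample())` yields three elements, with "Apple" holding the children and parents of both original entries and "Food" no longer referring to ID 3.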
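Answer 2 (score: 1)
If I understand correctly, you want to remove the duplicated elements and then update the Children and Parents of each remaining element. That can be done with the following code: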
// Find all duplicated elements and remove them
var duplicates = Elements.GroupBy(x => x.Name)
    .Where(x => x.Count() > 1)
    .SelectMany(x => x.OrderBy(e => e.ID)
        .Skip(1)
        .Select(e => new { Element = e, NewID = x.Min(y => y.ID) }))
    .ToDictionary(x => x.Element.ID, x => new { x.Element, x.NewID });
Elements.ExceptWith(duplicates.Values.Select(x => x.Element));
// Update the Children and Parents of each remaining element
foreach (var element in Elements)
{
    var removed = duplicates.Where(x => x.Value.Element.Name == element.Name);
    var mergedChildren = element.Children.Union(removed.SelectMany(x => x.Value.Element.Children))
        .Select(x => duplicates.ContainsKey(x) ? duplicates[x].NewID : x)
        .Distinct().ToList();
    element.Children.Clear();
    element.Children.AddRange(mergedChildren);
    var mergedParents = element.Parents.Union(removed.SelectMany(x => x.Value.Element.Parents))
        .Select(x => duplicates.ContainsKey(x) ? duplicates[x].NewID : x)
        .Distinct().ToList();
    element.Parents.Clear();
    element.Parents.AddRange(mergedParents);
}