将数据从平面阵列转换为分层结构

时间:2013-10-15 15:25:47

标签: c# .net data-structures

我正在寻求从平面列表到分层结构的数据转换。我怎样才能以可读的方式实现这一目标,但性能仍然可以接受,并且我可以利用任何.NET库。我认为这在某些术语(在这种情况下是行业)中被视为“方面”。

public class Company
{        
    public int CompanyId { get; set; }
    public string CompanyName { get; set; }
    public Industry Industry { get; set; }
}

public class Industry
{
    public int IndustryId { get; set; }
    public string IndustryName { get; set; }
    public int? ParentIndustryId { get; set; }
    public Industry ParentIndustry { get; set; }
    public ICollection<Industry> ChildIndustries { get; set; }
}

现在假设我有一个List<Company>,我希望将其转换为List<IndustryNode>

//Hierarchical data structure
public class IndustryNode
{
    public string IndustryName{ get; set; }
    public double Hits { get; set; }
    public IndustryNode[] ChildIndustryNodes{ get; set; }
}

这样生成的对象在序列化之后应该如下所示:

{
    IndustryName: "Industry",
    ChildIndustryNodes: [
        {
            IndustryName: "Energy",
            ChildIndustryNodes: [
                {
                    IndustryName: "Energy Equipment & Services",
                    ChildIndustryNodes: [
                        { IndustryName: "Oil & Gas Drilling", Hits: 8 },
                        { IndustryName: "Oil & Gas Equipment & Services", Hits: 4 }
                    ]
                },
                {
                    IndustryName: "Oil & Gas",
                    ChildIndustryNodes: [
                        { IndustryName: "Integrated Oil & Gas", Hits: 13 },
                        { IndustryName: "Oil & Gas Exploration & Production", Hits: 5 },
                        { IndustryName: "Oil & Gas Refining & Marketing & Transporation", Hits: 22 }
                    ]
                }
            ]
        },
        {
            IndustryName: "Materials",
            ChildIndustryNodes: [
                {
                    IndustryName: "Chemicals",
                    ChildIndustryNodes: [
                        { IndustryName: "Commodity Chemicals", Hits: 24 },
                        { IndustryName: "Diversified Chemicals", Hits: 66 },
                        { IndustryName: "Fertilizers & Agricultural Chemicals", Hits: 22 },
                        { IndustryName: "Industrial Gases", Hits: 11 },
                        { IndustryName: "Specialty Chemicals", Hits: 43 }
                    ]
                }
            ]
        }
    ]
}

“点击率”是指属于该群体的公司数量。

为了澄清,我需要将List<Company>转换为List<IndustryNode> NOT序列化List<IndustryNode>

4 个答案:

答案 0 :(得分:1)

试试这个:

    private static IEnumerable<Industry> GetAllIndustries(Industry ind)
    {
        yield return ind;
        foreach (var item in ind.ChildIndustries)
        {
            foreach (var inner in GetAllIndustries(item))
            {
                yield return inner;
            }
        }
    }

    private static IndustryNode[] GetChildIndustries(Industry i)
    {
        return i.ChildIndustries.Select(ii => new IndustryNode()
        {
            IndustryName = ii.IndustryName,
            Hits = counts[ii],
            ChildIndustryNodes = GetChildIndustries(ii)
        }).ToArray();
    }


    private static Dictionary<Industry, int> counts;
    static void Main(string[] args)
    {
        List<Company> companies = new List<Company>();
        //...
        var allIndustries = companies.SelectMany(c => GetAllIndustries(c.Industry)).ToList();
        HashSet<Industry> distinctInd = new HashSet<Industry>(allIndustries);
        counts = distinctInd.ToDictionary(e => e, e => allIndustries.Count(i => i == e));
        var listTop = distinctInd.Where(i => i.ParentIndustry == null)
                        .Select(i =>  new IndustryNode()
                                {
                                    ChildIndustryNodes = GetChildIndustries(i),
                                    Hits = counts[i],
                                    IndustryName = i.IndustryName
                                }
                        );
    }

未测试

答案 1 :(得分:0)

您正在寻找序列化程序。 MSFT有一个原生VS,但我喜欢Newtonsofts,它是免费的。 MSFT文档和示例为here,Newtonsoft文档为here

Newtonsoft免费,简单,快捷。

答案 2 :(得分:0)

尝试使用json序列化程序来实现此目的。我看到你的数据结构还可以,这只是序列化问题。

var industryNodeInstance = LoadIndustryNodeInstance();

var json = new JavaScriptSerializer().Serialize(industryNodeInstance);

如果您想在序列化器之间进行选择,请参阅: http://www.servicestack.net/benchmarks/#burningmonk-benchmarks

LoadIndustryNodeInstance方法

  • 构建List<Industry>

  • 转换IndustryTree = List<IndustryNode>

  • 实现Tree方法,例如Traverse。试着看看 Tree data structure in C#

答案 3 :(得分:0)

这是一些伪造的代码,可能会帮助您顺利完成。我创建一个地图/字典索引并用公司列表填充它。然后我们从索引中提取顶级节点。请注意,可能存在边缘情况(例如,最初可能需要部分填充此索引,因为您的公司似乎没有引用任何顶级节点,因此必须以其他方式填充这些节点)

Dictionary<String, IndustryNode> index = new Dictionary<String, IndustryNode>();

public void insert(Company company)
{ 
    if(index.ContainsKey(company.Industry.IndustryName))
    {
        index[company.Industry.IndustryName].hits++;
    }
    else
    {
        IndustryNode node = new IndustryNode(IndustryName=company.Industry, Hits=1);
        index[node.IndustryName] = node;
        if(index.ContainsKey(company.Industry.ParentIndustry.IndustryName))
        {
            index[company.Industry.ParentIndustry.IndustryName].ChildrenIndustries.Add(node);
        }
    }    
}

List<IndustryNode> topLevelNodes = index
    .Where(kvp => kvp.Item.ParentIndustry == null)
    .ToList(kvp => kvp.Item);