How do I create a custom index column in Power Query?

Time: 2015-04-29 12:13:29

Tags: powerquery

I have the following data in Power Query:

| ParentX | A |
| ParentY | A |
| ParentZ | A |
| ParentY | B |
| ParentZ | B |
| ParentX | C |
| ParentY | C |
| ParentZ | C |

I want to add an index column that numbers each element's parents, counting down from the parent count to 1:

| ParentX | A | 3 |
| ParentY | A | 2 |
| ParentZ | A | 1 |
| ParentY | B | 2 |
| ParentZ | B | 1 |
| ParentX | C | 3 |
| ParentY | C | 2 |
| ParentZ | C | 1 |

The final goal is to pivot on this new column:

| Object | Root    | Parent 2 | Parent 3 |
| A      | ParentZ | ParentY  | ParentX  |
| B      | ParentZ | ParentY  |          |
| C      | ParentZ | ParentY  | ParentX  |

2 answers:

Answer 0 (score: 3)

This is the query I used to generate the index column from the question:

let
    // This has the original parent/child column
    Source = #"Parent Child Query",

    // Count the number of parents per child
    #"Grouped Rows" = Table.Group(Source, {"Attribute:id"}, {{"Count", each Table.RowCount(_), type number}}),

    // Add a new column of lists with the indexes per child
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "ParentIndex", each List.Numbers([Count], [Count], -1)),

    // Expand the lists in the previous step
    #"Expand ParentIndex" = Table.ExpandListColumn(#"Added Custom", "ParentIndex"),

    // Create the column name columns (Parent.1, Parent.2, etc)
    #"Added Custom1" = Table.AddColumn(#"Expand ParentIndex", "ParentColumn", each "Parent."&Text.From([ParentIndex])),

    // Adds an index column that you use when merging with the original table
    #"Added Index" = Table.AddIndexColumn(#"Added Custom1", "Index", 0, 1)
in
    #"Added Index"
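
To see what these steps return, here is a minimal self-contained sketch that runs the grouping and list steps on the sample data from the question. The inline `#table` and the column names `Parent` and `Child` are assumptions made for illustration; the answer's own source uses `Attribute:id` as the child column.

```powerquery
let
    // Sample data from the question (column names "Parent" and "Child" are assumed)
    Source = #table(
        {"Parent", "Child"},
        {
            {"ParentX", "A"}, {"ParentY", "A"}, {"ParentZ", "A"},
            {"ParentY", "B"}, {"ParentZ", "B"},
            {"ParentX", "C"}, {"ParentY", "C"}, {"ParentZ", "C"}
        }
    ),

    // Count the number of parents per child
    Grouped = Table.Group(Source, {"Child"}, {{"Count", each Table.RowCount(_), type number}}),

    // Build a countdown list per child: Count, Count - 1, ..., 1
    WithLists = Table.AddColumn(Grouped, "ParentIndex", each List.Numbers([Count], [Count], -1)),

    // Expand the lists so there is one row per parent/child relation
    Expanded = Table.ExpandListColumn(WithLists, "ParentIndex")
in
    Expanded
```

For this data the `ParentIndex` column reads 3, 2, 1, 2, 1, 3, 2, 1 from top to bottom, which matches the index column asked for in the question.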

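After that, I created another query to hold the merged result. Its merge step refers to the first query by the name #"Epic Parent Indices":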

let
    // This is the original parent/child column
    Source = #"Parent Child Query",

    // Add an index column that matches the index column in the previous query
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),

    // Merge the two queries based on the index columns
    Merge = Table.NestedJoin(#"Added Index",{"Index"},#"Epic Parent Indices",{"Index"},"NewColumn"),

    // Expand the new column
    #"Expand NewColumn" = Table.ExpandTableColumn(Merge, "NewColumn", {"ParentColumn"}, {"ParentColumn"}),

    // Remove the index column
    #"Removed Columns" = Table.RemoveColumns(#"Expand NewColumn",{"Index"}),

    // Sort the data by attribute and then by Parent column so the columns will be in the right order
    #"Sorted Rows" = Table.Sort(#"Removed Columns",{{"Attribute:id", Order.Descending}, {"ParentColumn", Order.Ascending}}),

    // Pivot!
    #"Pivoted Column" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[ParentColumn]), "ParentColumn","Parent:id")
in
    #"Pivoted Column"

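There are three key steps here:

  1. Use Table.Group to get the number of parents for each child.
  2. Use List.Numbers to get an index value for each parent/child relation.
  3. Use Table.AddIndexColumn to add an index column that serves as the key in the Table.NestedJoin call. If you skip this, the merge produces duplicate rows.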
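
As a possible variant (not what the answer above does), the per-child index can also be built inside the grouping itself, which avoids the second query and the merge. The sketch below is a minimal example under assumptions: the inline `#table`, the column names `Parent` and `Child`, and all step names are made up for illustration.

```powerquery
let
    // Sample data from the question (column names "Parent" and "Child" are assumed)
    Source = #table(
        {"Parent", "Child"},
        {
            {"ParentX", "A"}, {"ParentY", "A"}, {"ParentZ", "A"},
            {"ParentY", "B"}, {"ParentZ", "B"},
            {"ParentX", "C"}, {"ParentY", "C"}, {"ParentZ", "C"}
        }
    ),

    // For each child, reverse its rows and number them from 1,
    // so the last parent listed (ParentZ) gets index 1
    Grouped = Table.Group(
        Source,
        {"Child"},
        {{"Rows", each Table.AddIndexColumn(Table.ReverseRows(_), "ParentIndex", 1, 1), type table}}
    ),

    // Bring the parent and its index back to the top level
    Expanded = Table.ExpandTableColumn(Grouped, "Rows", {"Parent", "ParentIndex"}),

    // Create the column names (Parent.1, Parent.2, etc)
    Named = Table.AddColumn(Expanded, "ParentColumn", each "Parent." & Text.From([ParentIndex]), type text),

    // Drop the numeric index so the pivot yields one row per child
    Trimmed = Table.RemoveColumns(Named, {"ParentIndex"}),

    // Pivot the generated names into columns
    Pivoted = Table.Pivot(Trimmed, List.Distinct(Trimmed[ParentColumn]), "ParentColumn", "Parent")
in
    Pivoted
```

With the sample data this gives one row per child with columns Parent.1 to Parent.3, and Parent.3 is null for B. The numbering relies on the row order of the source, as in the original approach.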

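Answer 1 (score: 0)

  1. Create an Excel table with two columns (Parent, Child).
  2. Load this table into Power Query.
  3. Define the function Combiner.CombineTextByDelimiter(";") (see line 3).
  4. Group by Child and apply that function (see line 4).
  5. Split the result (line 5).

Code: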

    let
        Quelle    = Excel.CurrentWorkbook(){[Name="Tabelle2"]}[Content],
        fcombine  = Combiner.CombineTextByDelimiter(";"), 
        #"Group1" = Table.Group(Quelle, {"Child"}, {{"Parents", each fcombine([Parent]), type text}}),
        #"Split1" = Table.SplitColumn(#"Group1", "Parents", Splitter.SplitTextByDelimiter(";"),{"Parents.1", "Parents.2", "Parents.3"}),
        #"Result" = Table.TransformColumnTypes(#"Split1", {{"Parents.1", type text}, {"Parents.2", type text}, {"Parents.3", type text}})
    in
        #"Result"
    

Regards, R.
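
The query above hard-codes three result columns and reads from a workbook table. As a minimal sketch of the same combine-then-split idea that runs without a workbook, the version below uses inline sample data and derives the column names from the largest number of parents. The `#table`, the step names `MaxParents` and `ColumnNames`, and the dynamic column list are assumptions added for illustration, not part of the answer.

```powerquery
let
    // Sample data from the question (column names "Parent" and "Child" are assumed)
    Source = #table(
        {"Parent", "Child"},
        {
            {"ParentX", "A"}, {"ParentY", "A"}, {"ParentZ", "A"},
            {"ParentY", "B"}, {"ParentZ", "B"},
            {"ParentX", "C"}, {"ParentY", "C"}, {"ParentZ", "C"}
        }
    ),

    // Function that joins a list of texts with ";"
    fcombine = Combiner.CombineTextByDelimiter(";"),

    // One row per child, with all of its parents in a single text value
    Grouped = Table.Group(Source, {"Child"}, {{"Parents", each fcombine([Parent]), type text}}),

    // Largest number of parents found for any child
    MaxParents = List.Max(List.Transform(Grouped[Parents], each List.Count(Text.Split(_, ";")))),

    // Column names Parents.1 .. Parents.N
    ColumnNames = List.Transform({1..MaxParents}, each "Parents." & Text.From(_)),

    // Split the combined text back into one column per parent
    Split = Table.SplitColumn(Grouped, "Parents", Splitter.SplitTextByDelimiter(";"), ColumnNames)
in
    Split
```

Note that the parents appear in source order here, so Parents.1 holds ParentX for A and C rather than the root ParentZ; wrapping `[Parent]` in `List.Reverse` before combining would flip that order.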