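I have a dataset with two columns: "Class" and "Student". I am trying to find the overlap rate between classes (for example, what percentage of the students in Class A are also in Class B).
I think this could be done in Power Query by expanding the table so that every class is compared with every other class, and then using a group-by to show the overlap percentage. Any ideas?
Here is some sample data and the expected output:
Desired output (the percentages are randomly generated):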
Answer 0 (score: 1)
Try the following code:
let
    // Load the source table; buffering avoids re-reading the workbook for every row below
    Source = Table.Buffer(Excel.Workbook(File.Contents("C:\Path\YourWorkbook.xlsx"))
        {[Item="YourTable",Kind="Table"]}[Data]),
    // Pair every class with every class, then drop the pairs that compare a class with itself
    group = Table.Group(Source, {"Class"}, {"CompareClass", each List.Distinct(Source[Class])}),
    expand = Table.ExpandListColumn(group, "CompareClass"),
    filter = Table.SelectRows(expand, each [Class] <> [CompareClass]),
    add = Table.AddColumn(filter, "Overlap", each let
        // a: number of students enrolled in both Class and CompareClass
        a = Table.RowCount(Table.SelectRows(Table.Group(Source, {"Student"},
            {"c", each [Class]}), (x)=> List.ContainsAll(x[c],{[Class], [CompareClass]}))),
        // b: number of rows (students) in CompareClass
        b = Table.RowCount(Table.SelectRows(Source, (x)=>x[Class] = [CompareClass]))
    in a/b, Percentage.Type)
in
    add
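To check the logic without pointing at a workbook, here is a minimal sketch of the same idea on an inline table. The sample rows, the helper StudentsOf and the step names are my own assumptions, not part of the original question. It uses List.Intersect on distinct student lists instead of the group-and-filter approach above, so duplicate rows would not inflate the result.

```powerquery
let
    // Hypothetical sample data, not taken from the question
    Source = #table(
        type table [Class = text, Student = text],
        {
            {"A", "Ann"}, {"A", "Bob"}, {"A", "Cid"},
            {"B", "Bob"}, {"B", "Cid"}, {"B", "Dee"}, {"B", "Eve"},
            {"C", "Ann"}
        }
    ),
    // Hypothetical helper: distinct students of one class
    StudentsOf = (c as text) as list =>
        List.Distinct(Table.SelectRows(Source, each [Class] = c)[Student]),
    Classes = List.Distinct(Source[Class]),
    // Every ordered pair of two different classes
    Pairs = List.Combine(
        List.Transform(Classes, (a) =>
            List.Transform(List.RemoveItems(Classes, {a}), (b) => [Class = a, CompareClass = b]))
    ),
    PairTable = Table.FromRecords(Pairs),
    // Overlap = shared students / students in CompareClass
    Result = Table.AddColumn(
        PairTable,
        "Overlap",
        each List.Count(List.Intersect({StudentsOf([Class]), StudentsOf([CompareClass])}))
            / List.Count(StudentsOf([CompareClass])),
        Percentage.Type
    )
in
    Result
```

With this sample, A and B share Bob and Cid, and B has four students, so the row (A, B) should come out as 2/4 = 50%. If you would rather divide by the size of Class, swap the denominator to StudentsOf([Class]).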
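Answer 1 (score: 0)
You can achieve this in Power Query by creating a few additional tables, as described below.
Assume your source table is named Course, with the columns class and student.
Now go to the Power Query Editor and perform the following steps.
Step 1: Duplicate the source table "Course" into 3 new tables and name them: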
"course_1"
"course_2"
"course_3"
Step 2: Replace the Advanced Editor code of "course_1" with the following code:
let
Source = Excel.Workbook(File.Contents("D:\WORK\R&D\Book2.xlsx"), null, true),
course_percentage_Sheet = Source{[Item="course_percentage",Kind="Sheet"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(course_percentage_Sheet,{{"Column1", type text}, {"Column2", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"class", type text}, {"student", type text}}),
//-------------------------------
//-- The part below is new code
//-------------------------------
#"Grouped Rows" = Table.Group(#"Changed Type1", {"class"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
#"Grouped Rows"
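One thing to watch in this step: Table.RowCount counts rows, so a student listed twice in the same class is counted twice. A small sketch, using made-up inline data and column names of my own choosing (Rows, DistinctStudents), shows the difference between a row count and a distinct student count:

```powerquery
let
    // Hypothetical sample data; Ann is deliberately listed twice in class A
    Sample = #table(
        type table [class = text, student = text],
        {{"A", "Ann"}, {"A", "Ann"}, {"A", "Bob"}, {"B", "Bob"}}
    ),
    Counts = Table.Group(Sample, {"class"}, {
        // Plain row count, as in the step above
        {"Rows", each Table.RowCount(_), Int64.Type},
        // Count each student once per class
        {"DistinctStudents", each List.Count(List.Distinct([student])), Int64.Type}
    })
in
    Counts
```

For class A this gives 3 rows but 2 distinct students. If your source can contain duplicates, the distinct count is probably the one you want as the denominator.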
Step 3: Replace the Advanced Editor code of "course_2" with the following code:
let
Source = Excel.Workbook(File.Contents("D:\WORK\R&D\Book2.xlsx"), null, true),
course_percentage_Sheet = Source{[Item="course_percentage",Kind="Sheet"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(course_percentage_Sheet,{{"Column1", type text}, {"Column2", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"class", type text}, {"student", type text}}),
//-------------------------------
//-- The part below is new code
//-------------------------------
#"Merged Queries" = Table.NestedJoin(#"Changed Type1", {"student"}, #"Changed Type1", {"student"}, "Changed Type1", JoinKind.LeftOuter),
#"Expanded Changed Type1" = Table.ExpandTableColumn(#"Merged Queries", "Changed Type1", {"class", "student"}, {"Changed Type1.class", "Changed Type1.student"}),
#"Grouped Rows" = Table.Group(#"Expanded Changed Type1", {"class", "Changed Type1.class"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
#"Grouped Rows"
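The self-join in this step is the core of the approach, so here is a stripped-down sketch of it on inline data. The sample rows and the names Sample, other and class_compare are assumptions for illustration. Unlike the step above, it uses an inner join and filters out the pairs where a class is matched with itself.

```powerquery
let
    // Hypothetical sample data
    Sample = #table(
        type table [class = text, student = text],
        {{"A", "Ann"}, {"A", "Bob"}, {"B", "Bob"}, {"B", "Cid"}}
    ),
    // Join the table to itself on student: one row per pair of classes a student is in
    Joined = Table.NestedJoin(Sample, {"student"}, Sample, {"student"}, "other", JoinKind.Inner),
    Expanded = Table.ExpandTableColumn(Joined, "other", {"class"}, {"class_compare"}),
    // Drop the rows that compare a class with itself
    Different = Table.SelectRows(Expanded, each [class] <> [class_compare]),
    // Number of shared students per ordered pair of classes
    PairCounts = Table.Group(
        Different,
        {"class", "class_compare"},
        {{"Count", each Table.RowCount(_), Int64.Type}}
    )
in
    PairCounts
```

Here only Bob is in both classes, so the result has two rows, (A, B) and (B, A), each with a count of 1.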
Step 4: Replace the Advanced Editor code of "course_3" with the following code:
let
Source = Excel.Workbook(File.Contents("D:\WORK\R&D\Book2.xlsx"), null, true),
course_percentage_Sheet = Source{[Item="course_percentage",Kind="Sheet"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(course_percentage_Sheet,{{"Column1", type text}, {"Column2", type text}}),
#"Promoted Headers" = Table.PromoteHeaders(#"Changed Type", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"class", type text}, {"student", type text}}),
//-------------------------------
//-- The part below is new code
//-------------------------------
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"student"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Added Custom" = Table.AddColumn(#"Removed Duplicates", "Custom", each 1),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Custom"}, #"Added Custom", {"Custom"}, "Added Custom", JoinKind.FullOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"class", "Custom"}, {"Added Custom.class", "Added Custom.Custom"}),
#"Sorted Rows" = Table.Sort(#"Expanded Added Custom",{{"class", Order.Ascending}}),
#"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Custom", "Added Custom.Custom"}),
#"Merged Queries1" = Table.NestedJoin(#"Removed Columns1", {"class"}, course_1, {"class"}, "course_1", JoinKind.LeftOuter),
#"Expanded course_1" = Table.ExpandTableColumn(#"Merged Queries1", "course_1", {"Count"}, {"course_1.Count"}),
#"Merged Queries2" = Table.NestedJoin(#"Expanded course_1", {"class", "Added Custom.class"}, course_2, {"class", "Changed Type1.class"}, "course_2", JoinKind.LeftOuter),
#"Expanded course_2" = Table.ExpandTableColumn(#"Merged Queries2", "course_2", {"Count"}, {"course_2.Count"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded course_2",null,0,Replacer.ReplaceValue,{"course_2.Count"}),
#"Renamed Columns" = Table.RenameColumns(#"Replaced Value",{{"Added Custom.class", "class_compare"}, {"course_1.Count", "total_student"}, {"course_2.Count", "matched_student"}}),
#"Added Custom1" = Table.AddColumn(#"Renamed Columns", "percentage_matched", each [matched_student]/[total_student]),
#"Changed Type2" = Table.TransformColumnTypes(#"Added Custom1",{{"percentage_matched", Percentage.Type}})
in
#"Changed Type2"
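The query above builds every class-to-class combination by adding a constant column and merging the table with itself on it. As a possible alternative, the same cross join can be sketched with a list column that is then expanded. The sample data and step names below are assumptions, not part of the original steps.

```powerquery
let
    // Hypothetical sample data
    Sample = #table(
        type table [class = text, student = text],
        {{"A", "Ann"}, {"A", "Bob"}, {"B", "Bob"}, {"C", "Cid"}}
    ),
    // One row per class
    Classes = Table.Distinct(Table.SelectColumns(Sample, {"class"})),
    // Attach the full list of classes to every row, then expand it into rows
    WithAll = Table.AddColumn(Classes, "class_compare", each Classes[class]),
    Pairs = Table.ExpandListColumn(WithAll, "class_compare")
in
    Pairs
```

With three classes this yields 3 x 3 = 9 rows, including the pairs where class equals class_compare. Add a Table.SelectRows step with [class] <> [class_compare] if you do not want those in the final table.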
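Step 5: Optionally, disable table loading for "course_1" and "course_2" by right-clicking each table and clearing the check mark next to "Enable load".
Step 6: Click the "Close & Apply" button to return to the report.
Step 7: Change the data type of the "percentage_matched" column in table "course_3" to Percentage.
Step 8: Add the three columns class, class_compare and percentage_matched to a Table visual. The final output should look similar to the image below: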