Power Query gives an error when appending files

Time: 2018-02-10 13:03:38

Tags: excel powerbi powerquery

I am trying to append 10,000 Excel files (each 50-100 KB in size). About halfway through the append, PQ throws an error, and I cannot determine which .xlsx file is causing the problem.

[Image: Error seen in PQ]

At the same time, PQ's "Queries & Connections" pane shows the following error:

[Image: This is what is seen in the Queries & Connections pane]

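How can I solve this, other than manually loading the files into PQ one by one until I find the file that gives me the error? Thanks for reading!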

1 Answer:

Answer 0: (score: 1)

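I've had frequent problems with PQ failing when it encounters "error" cells in Excel workbooks, even if you've tried to remove errors in preceding steps. I'm not clear on the criteria that cause this, but I wonder if this is such a case, since that message mentions a "#VALUE!" error? PQ should probably handle this more gracefully, but in the meantime I've made a couple of queries that let me input a directory and return the workbook, sheet, and row of every cell error in every Excel file in that directory. I've never tried it with 10k Excel files, but it might be fast enough if my code were cleaned up to be more efficient.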

The query that gets all the raw Excel file data looks like this:

let
    // Replace the placeholder string with the path of the folder to scan
    Source = Folder.Files("YOUR DIRECTORY HERE"),
    #"Filtered Rows1" = Table.SelectRows(Source, each not Text.StartsWith([Name], "~")),
    #"Filtered Rows" = Table.SelectRows(#"Filtered Rows1", each Text.EndsWith([Extension], ".xlsx") or Text.EndsWith([Extension], ".xlsm")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "WorkbookData", each Excel.Workbook([Content])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Folder Path", "Name", "WorkbookData"}),
    #"Expanded WorkbookData" = Table.ExpandTableColumn(#"Removed Other Columns", "WorkbookData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"WorkbookData.Data", "WorkbookData.Hidden", "WorkbookData.Item", "WorkbookData.Kind", "WorkbookData.Name"}),
    #"Filtered Rows2" = Table.SelectRows(#"Expanded WorkbookData", each ([WorkbookData.Kind] = "Sheet")),
    #"Removed Other Columns1" = Table.SelectColumns(#"Filtered Rows2",{"Folder Path", "Name", "WorkbookData.Name", "WorkbookData.Data"}),
    ExpandedData = Table.ExpandTableColumn(#"Removed Other Columns1", "WorkbookData.Data", Table.ColumnNames(Table.Combine(#"Removed Other Columns1"[WorkbookData.Data]))),
    IdentifySheets = Table.AddColumn(ExpandedData, "UniqueSheet", each [Folder Path]&[Name]&[WorkbookData.Name]),
    SheetRowCounts = Table.Group(IdentifySheets, {"UniqueSheet"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Added Custom2" = Table.AddColumn(SheetRowCounts, "PerSheetRow", each List.Numbers(1, [Count], 1)),
    #"Expanded PerSheetIndex" = Table.ExpandListColumn(#"Added Custom2", "PerSheetRow"),
    IndexBase = Table.AddIndexColumn(#"Expanded PerSheetIndex", "Index", 0, 1),
    #"Added Index" = Table.AddIndexColumn(IdentifySheets, "Index", 0, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index",{"Index"},IndexBase,{"Index"},"NewColumn",JoinKind.LeftOuter),
    #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"PerSheetRow"}, {"PerSheetRow"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded NewColumn",{"UniqueSheet", "Index"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns", List.Combine({{"Folder Path", "Name", "WorkbookData.Name", "PerSheetRow"}, List.RemoveMatchingItems(Table.ColumnNames(ExpandedData), {"Folder Path", "Name", "WorkbookData.Name"})}))
in
    #"Reordered Columns"

That query is set to connection only, since I don't want to load the data from every sheet of every workbook I'm checking.

The query I use to load the rows that contain errors looks like this:

let
    // Replace the placeholder with the name you gave the query above
    Source = #"NAME OF THE QUERY ABOVE",
    #"Kept Errors" = Table.SelectRowsWithErrors(Source, Table.ColumnNames(Source)),
    ColumnList = Table.FromList(Table.ColumnNames(#"Kept Errors")),
    #"Added Custom" = Table.AddColumn(ColumnList, "Custom", each "ERROR"),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Replacements", each Record.FieldValues(_)),
    ErrorReplacements = Table.SelectColumns(#"Added Custom1",{"Replacements"}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Kept Errors", ErrorReplacements[Replacements]),
    #"Renamed Columns" = Table.RenameColumns(#"Replaced Errors",{{"PerSheetRow", "SheetRow"}, {"Name", "Workbook"}, {"WorkbookData.Name", "Sheet"}})
in
    #"Renamed Columns"

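I couldn't find a way to get PQ to convert an "error" cell into a string describing the specific error (it may be possible, I just don't know how), so I simply replace every error cell with "ERROR" and use conditional formatting on my worksheet to highlight them.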
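
One possible way to get the specific error text is M's `try` expression, which returns a record with `HasError` and, on failure, an `Error` record that includes a `Message` field. The following is only a sketch under assumptions: the `Sample` table, the `ErrorText` column name, and the way the error is produced (a failed `Number.FromText`) are made up for illustration, and it has not been tried against cell errors coming out of `Excel.Workbook`.

```powerquery
let
    // Hypothetical sample data: "abc" turns into a cell-level error after conversion
    Sample = #table({"Sheet", "Value"}, {{"S1", "1"}, {"S1", "abc"}}),
    WithErrors = Table.TransformColumns(Sample, {{"Value", Number.FromText}}),
    Columns = Table.ColumnNames(WithErrors),
    // For each row, try every field and collect "column: message" for the ones that fail
    AddedMessages = Table.AddColumn(
        WithErrors,
        "ErrorText",
        (row) =>
            Text.Combine(
                List.RemoveNulls(
                    List.Transform(
                        Columns,
                        (col) =>
                            let
                                attempt = try Record.Field(row, col)
                            in
                                if attempt[HasError]
                                then col & ": " & attempt[Error][Message]
                                else null
                    )
                ),
                "; "
            ),
        type text
    )
in
    AddedMessages
```

The original error cells are still in the table after this step, so something like the `Table.ReplaceErrorValues` step above would still be needed before loading to a worksheet.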

I can't say how well this will work for your case, but it has helped me find error cells in sets of Excel files many times.
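
Since the question asks which workbook is the culprit, the output of the error query could also be grouped per workbook. This is a minimal sketch: `ErrorRows` is a hypothetical stand-in with made-up paths and file names for the result of the second query (after its rename step), and `ErrorRowCount` is an illustrative column name.

```powerquery
let
    // Hypothetical stand-in for the output of the error query (columns as renamed there)
    ErrorRows = #table(
        {"Folder Path", "Workbook", "Sheet", "SheetRow"},
        {
            {"C:\data\", "a.xlsx", "Sheet1", 3},
            {"C:\data\", "a.xlsx", "Sheet1", 9},
            {"C:\data\", "b.xlsx", "Sheet2", 1}
        }
    ),
    // One row per workbook, with the number of rows that contained at least one error
    PerWorkbook = Table.Group(
        ErrorRows,
        {"Folder Path", "Workbook"},
        {{"ErrorRowCount", each Table.RowCount(_), Int64.Type}}
    )
in
    PerWorkbook
```

In practice `ErrorRows` would be replaced by a reference to the error query itself.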