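I am trying to append 10,000 Excel files (each 50-100 KB in size). Partway through the process, PQ hit an error. Because the error occurs in the middle of the append, I cannot tell which .xlsx file is causing the problem.
PQ's "Queries & Connections" pane shows the following error at the same time:
How can I solve this, other than manually loading the files into PQ one at a time until I find the one that gives me the error? Thanks for reading!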
Answer 0 (score: 1)
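I have often had PQ fail when it encounters "error" cells in Excel workbooks, even when I had tried to remove the errors in earlier steps. I am not clear on what conditions cause this, but I wonder whether that is what is happening here, since the message mentions a "#VALUE!" error? PQ ought to handle this more gracefully. In the meantime, I wrote some queries that take a directory and return the workbook, sheet, and row of every cell error in every Excel file in that directory. I have never tried them on 10k Excel files, but they might be fast enough if my code were cleaned up to be more efficient.
The query that gets all of the raw Excel file data looks like this: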
let
    // Placeholder: replace the string with the path of the folder to scan
    Source = Folder.Files("YOUR DIRECTORY HERE"),
    // Skip Excel's temporary lock files, whose names start with "~"
    #"Filtered Rows1" = Table.SelectRows(Source, each not Text.StartsWith([Name], "~")),
    // Keep only .xlsx and .xlsm workbooks
    #"Filtered Rows" = Table.SelectRows(#"Filtered Rows1", each Text.EndsWith([Extension], ".xlsx") or Text.EndsWith([Extension], ".xlsm")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "WorkbookData", each Excel.Workbook([Content])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Folder Path", "Name", "WorkbookData"}),
    #"Expanded WorkbookData" = Table.ExpandTableColumn(#"Removed Other Columns", "WorkbookData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"WorkbookData.Data", "WorkbookData.Hidden", "WorkbookData.Item", "WorkbookData.Kind", "WorkbookData.Name"}),
    // Keep worksheets only (drop tables, named ranges, and so on)
    #"Filtered Rows2" = Table.SelectRows(#"Expanded WorkbookData", each ([WorkbookData.Kind] = "Sheet")),
    #"Removed Other Columns1" = Table.SelectColumns(#"Filtered Rows2",{"Folder Path", "Name", "WorkbookData.Name", "WorkbookData.Data"}),
    // Expand every sheet using the union of all column names found across the sheets
    ExpandedData = Table.ExpandTableColumn(#"Removed Other Columns1", "WorkbookData.Data", Table.ColumnNames(Table.Combine(#"Removed Other Columns1"[WorkbookData.Data]))),
    // Build a per-sheet row number (PerSheetRow) and attach it by joining on a running index
    IdentifySheets = Table.AddColumn(ExpandedData, "UniqueSheet", each [Folder Path]&[Name]&[WorkbookData.Name]),
    SheetRowCounts = Table.Group(IdentifySheets, {"UniqueSheet"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Added Custom2" = Table.AddColumn(SheetRowCounts, "PerSheetRow", each List.Numbers(1, [Count], 1)),
    #"Expanded PerSheetIndex" = Table.ExpandListColumn(#"Added Custom2", "PerSheetRow"),
    IndexBase = Table.AddIndexColumn(#"Expanded PerSheetIndex", "Index", 0, 1),
    #"Added Index" = Table.AddIndexColumn(IdentifySheets, "Index", 0, 1),
    #"Merged Queries" = Table.NestedJoin(#"Added Index",{"Index"},IndexBase,{"Index"},"NewColumn",JoinKind.LeftOuter),
    #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"PerSheetRow"}, {"PerSheetRow"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded NewColumn",{"UniqueSheet", "Index"}),
    // Put the identifying columns first, followed by the sheet data columns
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns", List.Combine({{"Folder Path", "Name", "WorkbookData.Name", "PerSheetRow"}, List.RemoveMatchingItems(Table.ColumnNames(ExpandedData), {"Folder Path", "Name", "WorkbookData.Name"})}))
in
    #"Reordered Columns"
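That query is set to connection only, because I do not want to load the data from every sheet of every workbook I am checking.
The query I use to load the rows that contain errors looks like this: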
let
    // Placeholder: replace with the name you gave the query above
    Source = #"NAME OF THE QUERY ABOVE",
    // Keep only rows that have an error in at least one column
    #"Kept Errors" = Table.SelectRowsWithErrors(Source, Table.ColumnNames(Source)),
    // Build a list of {column name, "ERROR"} pairs, one per column
    ColumnList = Table.FromList(Table.ColumnNames(#"Kept Errors")),
    #"Added Custom" = Table.AddColumn(ColumnList, "Custom", each "ERROR"),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Replacements", each Record.FieldValues(_)),
    ErrorReplacements = Table.SelectColumns(#"Added Custom1",{"Replacements"}),
    // Replace every error cell with the text "ERROR"
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Kept Errors", ErrorReplacements[Replacements]),
    #"Renamed Columns" = Table.RenameColumns(#"Replaced Errors",{{"PerSheetRow", "SheetRow"}, {"Name", "Workbook"}, {"WorkbookData.Name", "Sheet"}})
in
    #"Renamed Columns"
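I could not find a way to make PQ convert an "error" cell into a string describing the specific error (there may be one, I just do not know how), so I simply replace every error cell with "ERROR" and use conditional formatting on my worksheet to highlight them.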
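One possible way to surface the specific error text is M's `try` expression, which returns a record with `HasError` and, on failure, an `Error` record containing `Reason`, `Message`, and `Detail`. The sketch below is only an illustration under assumptions: the sample table, the column names `Value` and `ValueError`, and the injected error (its reason and message) are all made up for the example, and the exact text PQ reports for a real "#VALUE!" cell may differ.

```powerquery
let
    // Hypothetical sample standing in for the output of the error query
    Sample = #table(
        {"Sheet", "SheetRow", "Value"},
        {{"Sheet1", 1, 10}, {"Sheet1", 2, 20}, {"Sheet2", 1, 30}}
    ),
    // Force one cell into an error state to imitate a cell such as #VALUE!
    // (the reason and message used here are assumptions, not PQ's actual wording)
    WithError = Table.TransformColumns(
        Sample,
        {{"Value", each if _ = 20 then error Error.Record("DataFormat.Error", "Invalid cell value '#VALUE!'.") else _}}
    ),
    // Field access on the row record is lazy, so `try [Value]` catches the cell error
    AddedMessage = Table.AddColumn(
        WithError,
        "ValueError",
        each let attempt = try [Value] in if attempt[HasError] then attempt[Error][Message] else null,
        type nullable text
    ),
    // Replace the error cell itself so the table loads cleanly
    Cleaned = Table.ReplaceErrorValues(AddedMessage, {{"Value", null}})
in
    Cleaned
```

The `try` has to wrap the field access (`[Value]`) inside `Table.AddColumn`. Passing the cell into a transform function first would raise the error before `try` gets a chance to catch it. To cover every column, the same step could be repeated per column name.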
I cannot say how well this will work in your case, but it has helped me find error cells in sets of Excel files many times.
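If all that is needed is the name of the offending file, a lighter variant might be to count error rows per workbook instead of expanding every sheet. This is a rough sketch rather than a tested solution for 10k files: `FolderPath`, `CountErrorRows`, and the `ErrorRows` column are hypothetical names introduced here, and the path is a placeholder.

```powerquery
let
    // Assumption: placeholder path, replace with the folder to scan
    FolderPath = "C:\YOUR DIRECTORY HERE",
    Files = Folder.Files(FolderPath),
    // Skip temporary "~" files and keep only .xlsx / .xlsm workbooks
    Workbooks = Table.SelectRows(
        Files,
        each not Text.StartsWith([Name], "~")
            and List.Contains({".xlsx", ".xlsm"}, Text.Lower([Extension]))
    ),
    // Hypothetical helper: number of rows with at least one error cell, summed over all sheets
    CountErrorRows = (content as binary) as number =>
        let
            Sheets = Table.SelectRows(Excel.Workbook(content), each [Kind] = "Sheet"),
            Counts = List.Transform(Sheets[Data], each Table.RowCount(Table.SelectRowsWithErrors(_)))
        in
            // Prepend 0 so a workbook without sheets yields 0 instead of null
            List.Sum({0} & Counts),
    WithCounts = Table.AddColumn(Workbooks, "ErrorRows", each CountErrorRows([Content]), Int64.Type),
    Trimmed = Table.SelectColumns(WithCounts, {"Folder Path", "Name", "ErrorRows"}),
    // Keep only the workbooks that contain error cells
    Result = Table.SelectRows(Trimmed, each [ErrorRows] > 0)
in
    Result
```

The result lists one row per workbook that contains error cells, which narrows the search before running the more detailed queries above on just those files.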