Split a text file into rows and columns based on delimiters

Date: 2019-06-08 20:10:10

Tags: powerbi powerquery

I have a text file whose contents look like this:

<SUB    A=a; B=b; C=c; D=d <END <SUB    A=e; B=f; C=g; D=h <END <SUB    A=i; B=j; C=k; D=l <END...

I would like to use Power BI to get the following table.

Expected table

A   B   C   D
a   b   c   d
e   f   g   h
i   j   k   l
…   …   …   …

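I tried splitting the data on the semicolon and the equals sign, but I did not get the result I need.

When I load the text file, I get two columns, which is not what I need: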

<SUB    
        A=a
        B=b
        C=c
        D=d
<END    
<SUB    
        A=e
        B=f
        C=g
        D=h
<END

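Any ideas?

1 Answer:

Answer 0: (score: 0)

You can try the code below. I assigned the text (pasted from your question) directly to an expression, but I assume you load the text file with File.Contents together with Text.FromBinary or Lines.FromBinary (or something similar).

Approach 1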

let
    textFromFile = "<SUB    A=a; B=b; C=c; D=d <END <SUB    A=e; B=f; C=g; D=h <END <SUB    A=i; B=j; C=k; D=l <END", // copy-pasted from your question
    // Extract the text between each "<SUB" marker and the matching "<END" marker
    betweenEachSubAndEnd = 
        let
            // Positions just past each "<SUB" marker
            startIndexes = List.Transform(Text.PositionOf(textFromFile, "<SUB", Occurrence.All), each _ + Text.Length("<SUB")),
            // Positions where each "<END" marker begins
            endIndexes = Text.PositionOf(textFromFile, "<END", Occurrence.All),
            tableOfIndexes = Table.FromColumns({startIndexes, endIndexes}, type table [start = Int64.Type, end = Int64.Type]),
            extractText = Table.AddColumn(tableOfIndexes, "extracted", each Text.Range(textFromFile, [start], [end] - [start]), type text)
        in extractText,
    // Turn text such as "A=a; B=b" into a record with one field per key=value pair
    SplitAndCleanIntoRecords = (textToParse as text) as record =>
        let
            splitOnVerticalDelimiter = Text.Split(textToParse, ";"),
            whitespaceTrimmed = List.Transform(splitOnVerticalDelimiter, Text.Trim),
            keyValuePairs = List.Transform(whitespaceTrimmed, each Text.Split(_, "=")),
            accumulatedRecords = List.Accumulate(keyValuePairs, [], (recordState, currentPair) => Record.AddField(recordState, currentPair{0}, currentPair{1}))
        in accumulatedRecords,
    recordColumn = Table.AddColumn(betweenEachSubAndEnd, "toExpand", each SplitAndCleanIntoRecords([extracted]), type record),
    // Expand the records into columns, using every field name found in any record
    expanded =
        let
            headersToExpand = List.Distinct(List.Combine(List.Transform(recordColumn[toExpand], Record.FieldNames))),
            removeOtherColumns = Table.SelectColumns(recordColumn, {"toExpand"}),
            expandColumn = Table.ExpandRecordColumn(removeOtherColumns, "toExpand", headersToExpand)
        in expandColumn
in
    expanded

I started with text like this:

[image: Input text]

and got a table like this:

[image: Output table]

Approach 2

let
    textFromFile = "<SUB    A=a; B=b; C=c; D=d <END <SUB    A=e; B=f; C=g; D=h <END <SUB    A=i; B=j; C=k; D=l <END A=m; B=n; B=o; C= null<END",
    // Replace both markers with ";" so the whole text becomes a single ";"-delimited list
    replaceSubAndEnd = List.Accumulate({"<SUB", "<END"}, textFromFile, (textState as text, currentValueToReplace as text) as text => Text.Replace(textState, currentValueToReplace, ";")),
    // Split text on a delimiter and trim whitespace from each piece
    SplitAndTrimEach = (textToSplit as text, delimiter as text) as list =>
        let
            split = Text.Split(textToSplit, delimiter),
            trimmed = List.Transform(split, Text.Trim)
        in trimmed,
    splitOnVerticalDelimiter = SplitAndTrimEach(replaceSubAndEnd, ";"),
    dropEmpty = List.Select(splitOnVerticalDelimiter, each _ <> ""),
    keyValuePairs = List.Transform(dropEmpty, each SplitAndTrimEach(_, "=")),
    toTable = Table.FromRows(keyValuePairs, {"header", "value"}),
    // Collect the values for each header into a list, then use each list as a column
    grouped = Table.Group(toTable, {"header"}, {{"columns", each _[value], type list}}),
    transformedTable = Table.FromColumns(grouped[columns], grouped[header])
in
    transformedTable

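(To use this, open the Query Editor, click Advanced editor, then copy and paste the code above. You will need to adjust the code so that it loads your text file.)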
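The answer leaves the file-loading step to the reader. As a minimal sketch, assuming the file is UTF-8 encoded and sits at a hypothetical path (`C:\data\input.txt` is a placeholder, not something from the question), the hard-coded `textFromFile` step in either approach could be replaced with something like this:

```
let
    // Hypothetical path; replace it with the location of your own text file
    filePath = "C:\data\input.txt",
    // Read the file as binary, then decode it into one text value (UTF-8 is an assumption)
    textFromFile = Text.FromBinary(File.Contents(filePath), TextEncoding.Utf8)
in
    textFromFile
```

In practice you would keep `filePath` and `textFromFile` as the first two steps of the query and leave the remaining steps as they are, since they only depend on `textFromFile` being a single text value.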