Convert a CSV file to a multi-line text file

Time: 2019-05-27 23:29:43

Tags: bash

I have a file like the following:

C_DocType_ID,SOReference,DocumentNo,ProductValue,Quantity,LineDescription,C_Tax_ID,TaxAmt
1000000,1904093563U,1904093563U,5210-1,1,0,1000000,0
1000000,1904093563U,1904093563U,6511,2,0,1000000,0
1000000,1904093563U,1904093563U,5001,1,0,1000000,0
1000000,1904083291U,1904083291U,5310,4,0,1000000,0
1000000,1904083291U,1904083291U,5311,3,0,1000000,0
1000000,1904083291U,1904083291U,6101,6,0,1000000,0
1000000,1904083291U,1904083291U,6102,1,0,1000000,0
1000000,1904083291U,1904083291U,6106,6,0,1000000,0

I need to convert it to a text file like the following:

WOH~1.0~~1904093563Utest~~~ORD~~~~
WOL~~~5210-1~~~~~~~~1~~~~~~~~~~~~~~~~~~~~~
WOL~~~6511~~~~~~~~2~~~~~~~~~~~~~~~~~~~~~
WOL~~~5001~~~~~~~~1~~~~~~~~~~~~~~~~~~~~~

WOH~1.0~~1904083291Utest~~~ORD~~~~~~
WOL~~~5310~~~~~~~~4~~~~~~~~~~~~~~~~~~~~~
WOL~~~5311~~~~~~~~3~~~~~~~~~~~~~~~~~~~~~
WOL~~~6101~~~~~~~~6~~~~~~~~~~~~~~~~~~~~~
WOL~~~6102~~~~~~~~1~~~~~~~~~~~~~~~~~~~~~
WOL~~~6106~~~~~~~~6~~~~~~~~~~~~~~~~~~~~~

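The output file has header records and line-item records. A header record contains the SOReference and some hard-coded fields, while the line-item records contain the ProductValue and Quantity associated with that SOReference. The input file has 2 unique SOReferences, which is why the output file contains 2 header records, each followed by its associated line-item records.

Is there a way to do this from the command line (awk / sed)? I have a series of files like this that need to be converted to text.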

1 answer:

Answer 0: (score: 1)

With AWK, try the following:

awk -F, '
FNR==1 {next}       # skip the header line
{
    if ($2 != prevcol2) {           # insert newline when SOReference changes
        nl = FNR<=2 ? "" : "\n"     # suppress the newline in the 1st line
        printf("%sWOH~1.0~~%stest~~~ORD~~~~\n", nl, $2)
    }
    printf("WOL~~~%s~~~~~~~~%s~~~~~~~~~~~~~~~~~~~~~\n", $4, $5)
    prevcol2 = $2
}' file.csv
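
Since the question mentions a series of such files, the same awk program could be wrapped in a shell function and run in a loop. This is only a minimal sketch: the function name convert_csv, the *.csv glob, and the .txt output suffix are assumptions chosen for illustration, not something given in the question. Like the command above, it relies on rows that share an SOReference being adjacent in the input.

```shell
#!/bin/sh
# Sketch: run the awk program from the answer on every CSV file in the
# current directory. convert_csv, the *.csv glob and the .txt suffix are
# hypothetical names picked for this example.

convert_csv() {
    awk -F, '
    FNR==1 {next}       # skip the header line
    {
        if ($2 != prevcol2) {           # SOReference changed: start a new group
            nl = FNR<=2 ? "" : "\n"     # no blank line before the first group
            printf("%sWOH~1.0~~%stest~~~ORD~~~~\n", nl, $2)
        }
        printf("WOL~~~%s~~~~~~~~%s~~~~~~~~~~~~~~~~~~~~~\n", $4, $5)
        prevcol2 = $2
    }' "$1"
}

for f in *.csv; do
    [ -e "$f" ] || continue             # the glob matched no files
    convert_csv "$f" > "${f%.csv}.txt"  # orders.csv -> orders.txt
done
```

Each input file gets its own awk invocation, so prevcol2 starts out empty for every file and the first data row always produces a WOH header.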