Reading and storing data in reverse order in bash

Time: 2016-05-20 05:33:07

Tags: bash awk sed

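I have copied two columns of data to a file. Because the clustering key on my_date is set to descending order, the rows come back in descending order.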

     echo "copy home.admin (id,my_date) to 'myOutputFile';" > copyInputs.cql

myOutputFile -

     TEST1,2015-01-01 15:00:00+0000
     TEST1,2014-09-04 14:00:00+0000
     4.VOD,2015-08-18 04:00:00+0000
     4.VOD,2015-06-26 04:00:00+0000
     4.VOD,2015-05-13 04:00:00+0000
     000TEST8,2015-11-19 05:00:00+0000

The first column is id and the second column is my_date. I want to read the data for each id in reverse order, so the output should look like this -

     TEST1,2014-09-04 14:00:00+0000
     TEST1,2015-01-01 15:00:00+0000
     4.VOD,2015-05-13 04:00:00+0000
     4.VOD,2015-06-26 04:00:00+0000
     4.VOD,2015-08-18 04:00:00+0000
     000TEST8,2015-11-19 05:00:00+0000

After getting this output, I am preparing update statements to populate a new column, my_rev. my_rev starts at 100 for each id and increments until a new id is found.

    update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
    update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
    update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
    update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
    update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';

Any suggestions?

2 answers:

Answer 0: (score: 2)

  

I want to read the data for each id in reverse order

To print the lines for each id in reverse order:

$ awk -F, '$1==prev {s=$0 "\n" s; next} { printf "%s",s; s=$0 "\n"; prev=$1} END{printf "%s",s}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000

How it works:

This script uses two variables. prev holds the ID of the previous line. s holds the lines for the latest ID, in reverse order.

  • -F,

    This tells awk to use a comma as the field separator.

  • $1==prev {s=$0 "\n" s; next}

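    For lines with the same ID (field 1, written $1), this adds the new line to the beginning of the variable s. The remaining commands are skipped and awk jumps to the next line.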

  • printf "%s",s; s=$0 "\n"; prev=$1

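    If we get here, we are starting a new ID. In that case, we print the lines saved in s for the previous ID. We then reset s to the current line and set prev to the current ID.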

  • END{printf "%s",s}

    After we reach the end of the file, we print s for the last ID.
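
The walkthrough above can be spelled out as a multi-line version of the same one-liner, with one comment per rule. This is only a sketch: the function name reverse_groups and the temporary file are assumptions made for illustration, while the sample rows come from the question.

```shell
# Sample input from the question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# reverse_groups is a hypothetical wrapper name, not part of the answer.
reverse_groups() {
    awk -F, '
        # Same ID as the previous line: put this line in front of the buffer.
        $1 == prev { s = $0 "\n" s; next }
        # New ID: print the buffer of the previous ID, then start a new one.
        { printf "%s", s; s = $0 "\n"; prev = $1 }
        # End of input: print the buffer of the last ID.
        END { printf "%s", s }
    ' "$1"
}

reverse_groups "$infile"
```

Because only one group is buffered at a time, memory use depends on the largest id group rather than on the whole file.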

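Alternative

If you want more complicated reordering, this version pipes the lines of each id through sort, which gives you all of sort's flexibility while keeping the ids in their original order: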

$ awk -F, -v s=sort '$1==prev {print | s; next} {close(s); print | s; prev=$1}' infile
TEST1,2014-09-04 14:00:00+0000
TEST1,2015-01-01 15:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
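
As an illustration of that flexibility, the sort command passed in s can carry its own options. The sketch below sorts each id's lines explicitly on the second field; the options -t, -k2,2 and the function name sort_groups_by_date are assumptions chosen for this example, not something the answer prescribes.

```shell
# Sample input from the question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# sort_groups_by_date is a hypothetical name used only in this sketch.
sort_groups_by_date() {
    awk -F, -v s='sort -t, -k2,2' '
        # Same ID: keep feeding the sort process that is already open.
        $1 == prev { print | s; next }
        # New ID: close the previous sort so it prints its group,
        # then start a fresh sort for the new ID.
        { close(s); print | s; prev = $1 }
    ' "$1"
}

sort_groups_by_date "$infile"
```

The sort started for the last id is closed automatically when awk exits, so its group is printed last.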

Reformatting

If outfile contains the output of the sort command above, run:

$ awk -F, '{if ($1==prev)n++; else n=100; prev=$1; printf "update home.admin my_rev =%i where id = '\''%s'\'' and my_date = '\''%s'\'';\n",n,$1,$2}' outfile
update home.admin my_rev =100 where id = 'TEST1' and my_date = '2014-09-04 14:00:00+0000';
update home.admin my_rev =101 where id = 'TEST1' and my_date = '2015-01-01 15:00:00+0000';
update home.admin my_rev =100 where id = '4.VOD' and my_date = '2015-05-13 04:00:00+0000';
update home.admin my_rev =101 where id = '4.VOD' and my_date = '2015-06-26 04:00:00+0000';
update home.admin my_rev =102 where id = '4.VOD' and my_date = '2015-08-18 04:00:00+0000';
update home.admin my_rev =100 where id = '000TEST8' and my_date = '2015-11-19 05:00:00+0000';
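
The two steps can also be combined into one awk pass that buffers each id's rows and emits the statements oldest-first. This is a sketch under assumptions: the names make_updates, emit, id, dt and cnt are invented for illustration, and the statement text is copied unchanged from the output shown above.

```shell
# Sample input from the question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# make_updates is a hypothetical name; q holds a single quote character.
make_updates() {
    awk -F, -v q="'" '
        # Print the buffered group from last row to first, numbering from 100.
        function emit(   i, n) {
            n = 100
            for (i = cnt; i >= 1; i--)
                printf "update home.admin my_rev =%d where id = %s%s%s and my_date = %s%s%s;\n", n++, q, id[i], q, q, dt[i], q
            cnt = 0
        }
        # A new ID starts: write out the previous group first.
        $1 != prev { emit(); prev = $1 }
        # Buffer the current row.
        { cnt++; id[cnt] = $1; dt[cnt] = $2 }
        # Write out the last group.
        END { emit() }
    ' "$1"
}

make_updates "$infile"
```

Passing the quote character through -v q avoids the '\'' escaping needed when a single quote appears inside a single-quoted awk program.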

Answer 1: (score: 1)

sort should do the trick:

sort -r -t, -k1,2 infile

Usually, the only option you need is -r.
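
Note that a global -r applies to every key, so dates within an id would also stay newest-first. A variant that reverses only the id key is sketched below. It rests on an assumption that happens to hold for the sample data: the ids already appear in descending byte order (TEST1, 4.VOD, 000TEST8), so a reversed sort on field 1 under LC_ALL=C keeps that order. With other ids the group order may change. The function name sort_id_desc_date_asc is hypothetical.

```shell
# Sample input from the question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
TEST1,2015-01-01 15:00:00+0000
TEST1,2014-09-04 14:00:00+0000
4.VOD,2015-08-18 04:00:00+0000
4.VOD,2015-06-26 04:00:00+0000
4.VOD,2015-05-13 04:00:00+0000
000TEST8,2015-11-19 05:00:00+0000
EOF

# sort_id_desc_date_asc is a hypothetical name used only in this sketch.
sort_id_desc_date_asc() {
    # -k1,1r : id (field 1) in descending order
    # -k2,2  : my_date (field 2) in ascending order within each id
    # LC_ALL=C forces plain byte-wise comparison for predictable results.
    LC_ALL=C sort -t, -k1,1r -k2,2 "$1"
}

sort_id_desc_date_asc "$infile"
```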