How to remove the header, the newline character \ and ""

Time: 2014-03-13 04:22:16

Tags: linux bash shell unix vim

My data file is:

"S.ACQUIRER||'|'||SUBSTR(S.ACQ_COUNTRY,1,4)||'|'||SUBSTR(S.ACQ_CURRENCY_CODE,1,5)||'|'||S.PAN||'|'||SUBSTR(S.ACCTNUM,1,18)||'|'||SU\
BSTR(I.E_NAME,1,35)||'|'||S.LOCAL_DATE||'|'||S.LOCAL_TIME||'|'||DECODE(S.PCODE,0,'POSTRANSACTIONFROMDEFAULTACCOUNT',1000,'POS"
"9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from\
Savings Account |10|Approved |2000061|ATM Test Terminal Bang |123400000123456 |01001101"
"9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from\
Savings Account |10|10|4000061|ATM Test Terminal Bang |123450000000456 |01001101"

However, the expected output is:

9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|Approved |2000061|ATM Test Terminal Bang |123400000123456 |01001101
9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|10|4000061|ATM Test Terminal Bang |123450000000456 |01001101

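The differences are:

  1. There should be no header line.
  2. There should be no "" at the start and end of each line.
  3. The escaped newline (a backslash followed by a newline) should not appear.

How can I meet these requirements?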

3 answers:

Answer 0 (score: 2)

sed -e '/\\$/N' \
    -e 's/\\\n/ /g' \
    -e 's/^"//' \
    -e 's/"$//' \
    -e '/^[^0-9]/d' \
    "$@"

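This could be squashed onto a single unreadable line, but the five operations are easier to explain when they are neatly separated:

  1. If the line ends with a backslash, append the next line to the buffer (pattern space).
  2. Replace any backslash-newline with a space.
  3. Remove a double quote at the start of the line.
  4. Remove a double quote at the end of the line.
  5. Delete any line that does not start with a digit.

Given a clean version of the input (no trailing white space), this produces: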

    9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|Approved |2000061|ATM Test Terminal Bang |123400000123456 |01001101
    9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|10|4000061|ATM Test Terminal Bang |123450000000456 |01001101
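
The script above issues a single N per record, so it joins exactly two physical lines. As a hedged sketch only, the same five operations can be wrapped in a loop for the case where a record is split across three or more lines. The function name `clean_records` and the label `join` are made up for this illustration and are not part of the answer.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): join every
# backslash-continued line, strip the outer quotes, and keep only
# records that start with a digit.
clean_records() {
    sed -e ':join' \
        -e '/\\$/{' \
        -e 'N' \
        -e 's/\\\n/ /' \
        -e 'b join' \
        -e '}' \
        -e 's/^"//' \
        -e 's/"$//' \
        -e '/^[^0-9]/d' \
        "$@"
}
```

It is called like the original, for example `clean_records file`. After each join the script branches back to the label and tests for a trailing backslash again, which is the only difference from the script above.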
    

Answer 1 (score: 1)

This should do the trick:

awk 'NR>2 { gsub(/"/, ""); if (sub(/\\$/, " ")) printf "%s", $0; else print }' file

Output:

$ cat file
"S.ACQUIRER||'|'||SUBSTR(S.ACQ_COUNTRY,1,4)||'|'||SUBSTR(S.ACQ_CURRENCY_CODE,1,5)||'|'||S.PAN||'|'||SUBSTR(S.ACCTNUM,1,18)||'|'||SU\
BSTR(I.E_NAME,1,35)||'|'||S.LOCAL_DATE||'|'||S.LOCAL_TIME||'|'||DECODE(S.PCODE,0,'POSTRANSACTIONFROMDEFAULTACCOUNT',1000,'POS"
"9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from\
Savings Account |10|Approved |2000061|ATM Test Terminal Bang |123400000123456 |01001101"
"9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from\
Savings Account |10|10|4000061|ATM Test Terminal Bang |123450000000456 |01001101"

$ awk 'NR>2 { gsub(/"/, ""); if (sub(/\\$/, " ")) printf "%s", $0; else print }' file
9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|Approved |2000061|ATM Test Terminal Bang |123400000123456 |01001101
9000000007|840|840|5048349120900000008|504834000000006028|Ecustomer name |03-JAN-14|115744|Cash Withdrawal from Savings Account |10|10|4000061|ATM Test Terminal Bang |123450000000456 |01001101
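
The NR>2 test assumes the header occupies exactly the first two physical lines. A possible variant, sketched here under the assumptions that every data record starts with a digit and that no field contains a double quote, buffers continued lines and decides per logical record instead. The function name `clean_records_awk` and the variables `buf` and `rec` are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): the same clean-up in
# awk, without relying on physical line numbers to skip the header.
clean_records_awk() {
    awk '
        { gsub(/"/, "") }                  # drop every double quote on the line
        /\\$/ {
            sub(/\\$/, " ")                # continuation: trailing backslash becomes a space
            buf = buf $0                   # keep collecting the pieces of this record
            next
        }
        {
            rec = buf $0                   # last piece: the logical record is complete
            buf = ""
            if (rec ~ /^[0-9]/) print rec  # print data records, skip the header
        }
    ' "$@"
}
```

Usage is `clean_records_awk file`. Because the decision is made on the joined record, a header of any number of continued lines is skipped as long as it does not begin with a digit.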

Answer 2 (score: 0)

Open it in vim and run the following:

:%s/^"//g

:%s/"$//g

:%s/\\\n/ /

But I don't know how to identify the header.
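
One way to locate the header from the shell before editing, offered as a sketch only: treat a logical record (a line plus its backslash continuations) as a header when it does not begin with a digit after an optional double quote, and print the range of physical lines it occupies. The function name `header_lines` and the variables `start` and `first` are hypothetical, and the digit rule is an assumption taken from the sample data.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): print "first,last"
# physical line numbers of every logical record that is not a data record.
header_lines() {
    awk '
        start == 0 { start = NR; first = $0 }   # remember where the record begins
        /\\$/ { next }                          # record continues on the next line
        {
            # Record ends here; report it unless it starts with a digit
            # (optionally preceded by a double quote).
            if (first !~ /^"?[0-9]/) print start "," NR
            start = 0
        }
    ' "$@"
}
```

For the sample file, `header_lines file` prints `1,2`, and that range can then be deleted in vim with `:1,2d`.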