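Count a specific character in each line of a text file and delete that character at a specific position until the line has a specific count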

Time: 2018-08-27 12:53:13

Tags: bash

Hello, I need help with a script on a Solaris system.

Here is what the script should do:

I have these files:

i)

cat /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt

    201807300000000004 
    201807300000000005 
    201807300000000006
    201807300000000007
    201807300000000008
    201807200002056422
    201807230003099849
    201807230003958306
    201806290003097219
    201806080001062012
    201806110001633519
    201806110001675603

ii)

cat /tmp/BadTransactions/test_data_for_validation_script.txt

20180720|201807200002056422||57413620344272|030341-213T |580463|WIRE||EUR|EUR|20180720|20180720|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||574000|129|||||||||||||||||||||||||31313001103712|BFNJKL|K| I P P BONNIER PUBLICATIO|||FI|PERS7
20180723|201807230003099849||57100440165173|140197-216U|593619|WIRE||EUR|EUR|20180723|20180723|||||||00000000000060.00|00000000000060.00|Debit||||||||||MPA|||571004|106|||||||||||||||||||||||||57108320141339|Ura Basket / UraNaiset|||-div|||FI|PERS
20180723|201807230003958306||57206820079775|210489-0788|593619|WIRE||EUR|EUR|20180721|20180723|||||||00000000000046.00|00000000000046.00|Debit||||||||||MPA|||578800|106|||||||||||||||||||||||||18053000009026|IC Kodit||| c/o Newsec Asset Manag|||FI|PERS
20180629|201806290003097219||57206820079775|210489-0788|593619|WIRE||EUR|EUR|20180628|20180629|||||||00000000000856.00|00000000000856.00|Debit||||||||||MPA|||578800|106|||||||||||||||||||||||||18053000009018|IC Kodit||| c/o Newsec Asset Manag|||FI|PERS
20180608|201806080001062012||57206820079441|140197-216S|580463|WIRE||EUR|EUR|20180608|20180608|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||541002|129|||||||||||||||||||||||||57108320141339|N FN|K| IKI I P BONNIER PUBLICATION|||FI|PERS7 
20180611|201806110001633519||57206820079525|140197-216B|593619|WIRE||EUR|EUR|20180611|20180611|||||||00000000000242.10|00000000000242.10|Debit||||||||||MPA|||535806|106|||||||||||||||||||||||||57108320141339|As Oy Haikkoonsilta|| mannerheimin|||FI|PERS9
20180611|201806110001675603||57206820079092|140197-216Z|580463|WIRE||EUR|EUR|20180611|20180611|||||||00000000000019.90|00000000000019.90|Debit||||||||||MPA|||536501|129|||||||||||||||||||||||||57108320141339|N ^NLKL|K| I P NJ BONNIER PUBLICAT|||FI|PERS7

The script must check each line of /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt and see whether that string appears in /tmp/BadTransactions/test_data_for_validation_script.txt. If it does, a new file is created: /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt

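Then, working from this new file, it counts all the "|" characters in each line. If there are more than 64, it deletes the 61st "|" of that line. This repeats until the line has 64 pipes.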

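For example, if a line has 67 "|", it deletes the 61st and checks again. There are now 66 "|", so it deletes the 61st again, and so on until it reaches 64 pipes. So every line must end up with 64 pipes.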

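Here is my code. With it I only managed to delete the 61st pipe of each line once; I could not write the loop that rechecks each line until it is down to 64 pipes.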

I would appreciate any help.

#!/bin/bash
PATH=/usr/xpg4/bin:/bin:/usr/bin

while read line
do

grep "$line" /tmp/BadTransactions/test_data_for_validation_script.txt

awk 'NR==FNR { K[$1]; next } ($2 in K)' /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt FS="|" /opt/NorkomConfigS2/inbox/TRANSACTIONS_DAILY_20180730.txt > /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt

sed '/\([^|]*[|]\)\{65\}/ s/|//61' /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt

done < /tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt > /tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt

1 answer:

Answer 0 (score: 2)

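Well, there are several pieces to this problem.

  • You need to read the file line by line
  • Check each line against the other file
  • Count the occurrences of "|" in the matching line
  • Repeatedly delete the 61st "|" until only 64 of them remain in the string

You could do it like this: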

#!/bin/bash
count() { ### We will use this to count how many pipes are there
  string="${1}"; shift
  char="${1}"
  printf "%s" "${string}" | grep -o -e "${char}" | grep -c .
}
file1="/tmp/BadTransactions/TRANSACTIONS_DAILY_20180730.txt" ### File to read
file2="/tmp/BadTransactions/test_data_for_validation_script.txt" ### File to check for duplicates
file3="/tmp/BadTransactions/TRANSACTIONS_DAILY_NEW_20180730.txt" ### File where to save our final work
printf "" > "${file3}" ### Delete (eventual) history
exec 3<"${file1}" ### Put our data in file descriptor 3
while read -r line <&3; do ### read each line and put it in var "$line"
  string="$(grep -e "${line}" "${file2}")" ### Check the line against second file
  while [ "$(count "${string}" "|")" -gt 64 ]; do ### While we have more than 64 "|"
    string="$(printf "%s" "${string}" | sed -e "s/|//61")" ### Delete the 61st occurrence
  done
  [ -z "${string}" ] || printf "%s\n" "${string}" >> "${file3}" ### Save the corrected line (newline-terminated) in the third file; skip lines with no match
done
exec 3>&- ### Clean file descriptor 3

This is untested, but it should work.
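Since the script above is untested, the trimming loop can be checked on its own before it is pointed at real files. The following is only a sketch: `count_pipes` and `trim_pipes` are hypothetical helper names, the input line is synthetic (fields `f1` to `f68`, not real transaction data), and the count uses awk so that it does not rely on `grep -o`.

```shell
# Hypothetical helper: count the "|" characters in a single line.
count_pipes() {
  printf '%s\n' "$1" | awk -F'|' '{ print NF - 1 }'
}

# Hypothetical helper: delete the 61st "|" until at most 64 remain.
trim_pipes() {
  local s="$1"
  while [ "$(count_pipes "$s")" -gt 64 ]; do
    s="$(printf '%s\n' "$s" | sed -e 's/|//61')"
  done
  printf '%s\n' "$s"
}

# Synthetic line: 68 fields, hence 67 pipes.
line="$(awk 'BEGIN { for (i = 1; i <= 68; i++) printf "%s%s", (i > 1 ? "|" : ""), "f" i; print "" }')"

count_pipes "$line"                  # prints 67
count_pipes "$(trim_pipes "$line")"  # prints 64
```

After trimming, field 61 of the synthetic line is `f61f62f63f64`, which shows that each pass removes the separator at position 61 and leaves the first 60 fields alone.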

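Note that I am taking it for granted that grep returns only one match from the second file. If that is not your case, you will have to handle each matching line individually, for example: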

grep -e "${line}" "${file2}" | while IFS= read -r value; do ### read whole lines; the records contain spaces
  ...
done
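When several lines can match, each one has to be handled as a whole line. The records in the question contain spaces, so any loop that splits on whitespace would break them apart. Below is a minimal sketch of line-by-line handling; `matches_for` is a hypothetical name, the three sample records are made up, and `grep -F` (fixed-string matching) and `mktemp` are assumed to be available on the target system.

```shell
# Hypothetical sample: two records share the key "k1" and contain spaces.
tmp="$(mktemp)"
printf '%s\n' 'k1|A B|x' 'k1|C D|y' 'k2|E F|z' > "$tmp"

# Print every line of file $2 that contains the fixed string $1,
# one whole line at a time. IFS= and -r keep spaces and backslashes intact.
matches_for() {
  grep -F -e "$1" "$2" | while IFS= read -r value; do
    printf '<%s>\n' "$value"
  done
}

matches_for 'k1' "$tmp"   # prints <k1|A B|x> and <k1|C D|y>
rm -f "$tmp"
```

In the real script, the `printf` inside the loop would be replaced by the pipe-trimming step applied to `$value`.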

Edit: For systems such as Solaris, or others where GNU grep is not installed, you can replace the count function as follows:

count() {
  string="${1}"; shift
  char="${1}"
  printf "%s" "${string}" | awk -F"${char}" '{print NF-1}'
}
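Building on the same awk idea, the whole trim can also be done in a single awk pass instead of calling sed once per removed pipe. This is a different technique from the loop above, offered only as a sketch: `trim_stream` is a hypothetical name, and `-v` and the ternary operator need a POSIX awk (on Solaris presumably `/usr/xpg4/bin/awk` or `nawk`, not the old `/usr/bin/awk`). Deleting the 61st pipe `extra` times is the same as dropping the original pipes numbered 61 through 61 + extra - 1, which is what the loop does.

```shell
# Hypothetical helper: read lines on stdin; for each line with more than
# 64 pipes, drop pipes starting at the 61st until 64 remain.
trim_stream() {
  awk -F'|' -v max=64 -v pos=61 '
    {
      extra = (NF - 1) - max            # how many pipes must go
      if (extra <= 0) { print; next }   # line is already within the limit
      out = $1
      for (i = 2; i <= NF; i++) {
        # Pipe number i-1 sits just before field i.
        sep = (i - 1 >= pos && i - 1 < pos + extra) ? "" : "|"
        out = out sep $i
      }
      print out
    }'
}

# Usage (file name is a placeholder):
# trim_stream < matched_lines.txt > trimmed_lines.txt
```

Lines with 64 pipes or fewer pass through unchanged, so the function can be applied to the whole file of matched lines at once.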