I have these files:
NotRequired.txt (has the lines that need to be removed)
Need2CleanSED.txt (big file, needs cleaning)
Need2CleanGRP.txt (big file, needs cleaning)
Content:
more NotRequired.txt
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
[wat-7600-1_414]
[indo-pak_isu-5_761]
I am reading the file above and trying to delete its lines from Need2Clean???.txt with sed and grep, but without success.
myFile="NotRequired.txt"
while IFS= read -r HKline
do
sed -i '/$HKline/d' Need2CleanSED.txt
done < "$myFile"
myFile="NotRequired.txt"
while IFS= read -r HKline
do
grep -vE \"$HKline\" Need2CleanGRP.txt > Need2CleanGRP.txt
done < "$myFile"
It looks like something is going wrong with the variable and with the [] characters.
Answer 0 (score: 3)
What you're doing is extremely inefficient and error prone. Just do this:
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt
Thanks to grep -F, the above treats each line of NotRequired.txt as a string rather than a regexp, so you don't have to worry about escaping RE metachars like [, and you don't need to wrap it in a shell loop: that one command will remove all undesirable lines in one execution of grep.
Never do "command file > file", by the way: the shell sets up the "> file" redirection, and so empties file, before command gets a chance to read it! Always do "command file > tmp && mv tmp file" instead.
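To make the behaviour concrete, here is a minimal sketch of the same command run against throwaway data. The scratch directory and the sample lines ("keep 1", "drop ...") are made up for illustration; only the grep/mv pair comes from the answer.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not the asker's real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[wat-7600-1_414]' > NotRequired.txt
printf '%s\n' 'keep 1' 'drop [abc-xyz_pqr-pe2_123] here' \
              'keep 2' 'drop [wat-7600-1_414]' > Need2CleanGRP.txt

# -F: patterns are fixed strings, so [ and ] are not special
# -v: print only the lines that do NOT match
# -f: read the patterns, one per line, from NotRequired.txt
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt

cat Need2CleanGRP.txt
```

One thing to be aware of: grep exits with status 1 when it prints no lines at all, so if every line of the big file were to be removed, the mv after && would be skipped and the original file left untouched.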
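Answer 1 (score: 0)
Your assumption is correct. The [...] construct looks for any character in that set, so you have to precede them with \ ("escape" them). The simplest way is to do that in the original file: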
sed -i -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}"
If you'd rather not do that, you can put the sed command at the point where the file is redirected into the loop:
done < <(sed -e 's:\[:\\[:' -e 's:\]:\\]:' replace.txt)
Finally, you can run sed on each HKline variable:
HKline=$( echo "$HKline" | sed -e 's:\[:\\[:' -e 's:\]:\\]:' )
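Putting the last variant into a complete loop might look like the sketch below. It assumes the lines contain no regex metacharacters other than [ and ], uses made-up sample data, and introduces a hypothetical variable name (escaped). It also puts the sed script in double quotes, since inside single quotes the shell does not expand the variable at all, and writes through a temporary file rather than relying on sed -i.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not the asker's real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '[lon-abc-tkt_1202]' > NotRequired.txt
printf '%s\n' 'first line' 'x [lon-abc-tkt_1202] y' 'last line' > Need2CleanSED.txt

while IFS= read -r HKline
do
    # Escape [ and ] so that sed matches them literally.
    escaped=$(printf '%s\n' "$HKline" | sed -e 's:\[:\\[:g' -e 's:\]:\\]:g')
    # Double quotes: the shell expands $escaped before sed sees the script.
    sed "/$escaped/d" Need2CleanSED.txt > tmp && mv tmp Need2CleanSED.txt
done < NotRequired.txt

cat Need2CleanSED.txt
```

This still rewrites the big file once per line of NotRequired.txt, so it is mainly useful for seeing how the escaping and quoting fit together.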
Answer 2 (score: 0)
Try GNU sed:
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt| sed -Ef - Need2CleanSED.txt
Two sed processes are chained by a shell pipe. NotRequired.txt is 'slurped' all at once by sed -z, which replaces each \n with | and each [ and ] meta-char with \[ and \] respectively; the second process then uses the result as its regex script for the input file, i.e. Need2CleanSED.txt. Output of the first process:
/\[abc-xyz_pqr-pe2_123\]|\[lon-abc-tkt_1202\]|\[wat-7600-1_414\]|\[indo-pak_isu-5_761\]/d
Add the -u (i.e. unbuffered) option to avoid batch processing, for something like direct I/O.
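The two stages can also be run separately to inspect the generated script. The sketch below assumes GNU sed (for -z and -E), uses made-up sample data, and stores the intermediate script in a hypothetical file named script.sed instead of piping it straight into the second sed.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory (not the asker's real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[lon-abc-tkt_1202]' > NotRequired.txt
printf '%s\n' 'alpha' 'beta [lon-abc-tkt_1202]' 'gamma' > Need2CleanSED.txt

# Stage 1: read the whole file as one record (-z), join the lines with |,
# escape [ and ], then drop the trailing | and wrap the result in /.../d.
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt > script.sed

# Stage 2: run the generated one-line script against the file to clean.
sed -E -f script.sed Need2CleanSED.txt > cleaned.txt

cat cleaned.txt
```

Unlike the loop-based approaches, this reads the big file only once, because all the patterns end up in a single alternation.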