grep: special characters and literal characters in a text file

Time: 2018-04-01 22:38:42

Tags: bash macos unix grep wildcard

I'm trying to get grep to honor the wildcards (.{64}, .{65}), the start-of-line anchor (^), and the end-of-line anchor ($) in a text file, while treating everything else as literal text.

Contents of foo.txt:

^.{64}  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>$
^.{64}  /Users/1337/Test Hash Folder/C$
^.{64}  /Users/1337/Test Hash Folder/C [remain]$
^.{64}  /Users/1337/Test Hash Folder/D 日本$
^.{65}  /Users/1337/Test Hash Folder/\\F::$

Contents of bar.txt:

\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\F::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
f978dda2d3be7e976ec25eee3a17f24a02af7386d163ae95c1fa48cdf75586a5  /Users/1337/Test Hash Folder/A
9f913e331f16e9bc5493a7c4c9480753351fd0098398e32c9b8d4870a63b65ea  /Users/1337/Test Hash Folder/B [LOL].dmg
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

This is the command I ran:

grep -Ef foo.txt bar.txt

I want it to output:

e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\F::

But it outputs this instead:

14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

Here are the exact names of my files:

\\F//
^.$[E-frLOL[MAY[]{}()?<NUL>
A
B [LOL].dmg
C
C [remain]
D 日本

Is it possible to output what I need with grep? If not, is there another method (BBEdit / Notepad++, Text Mechanic, etc.) I could use to achieve the same result?

Edit

The ...LOL[MAY... line changed to:

#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>

What I'm going to do is use sed to escape all the offending characters, add the wildcards and so on, feed foo.txt to grep, and then strip the escapes, wildcards, ^s, and $s.

So, here are the new contents of foo.txt:

/Users/1337/Test Hash Folder/#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C [remain]
/Users/1337/Test Hash Folder/D 日本
/Users/1337/Test Hash Folder/\\F::

I'll run these to escape the offending characters:

sed 's/\\/\\\\/g' foo.txt > baz.txt
sed -i '' 's/\$/\\\$/g' baz.txt
sed -i '' 's/\^/\\\^/g' baz.txt

Which other characters do I need to escape? FYI, these are being escaped only for grep.

Next, I'll use the following:

cat baz.txt | grep '\\\\' > backslashes.txt
cat baz.txt | grep -v '\\\\' > no_backslashes.txt
sed 's/^/^.{64}  /; s/$/$/' no_backslashes.txt > eggs.txt
sed 's/^/^.{65}  /; s/$/$/' backslashes.txt >> eggs.txt

Then I'll run:

grep -Ef eggs.txt bar.txt

Afterwards, I'll remove the leading ^.{64} prefix, the trailing $ (at the end of the line only, so the file names themselves aren't altered), and the backslashes from baz.txt.

If any of this is confusing, please don't hesitate to ask me to clarify.

Mac OS X Yosemite, bash 3.2.57(1)-release, grep (BSD grep) 2.5.1-FreeBSD

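2 answers:

Answer 0 (score: 2)

There are a few problems with your patterns:

  1. Some characters that have special meaning in regular expressions are not escaped
  2. The repetition counts are wrong

Proposed solution: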

    ^.{64}  /Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>$
    ^.{64}  /Users/1337/Test Hash Folder/C$
    ^.{64}  /Users/1337/Test Hash Folder/C \[remain\]$
    ^.{64}  /Users/1337/Test Hash Folder/D 日本$
    ^[\].{64}  /Users/1337/Test Hash Folder/\\{4}F::$
    

Escape the anchors, bracket-expression characters, and quantifiers (^$[]?), and set the repetition counts correctly.
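
To see both problems in isolation, here is a small sketch. The helper name matches is made up for illustration, the sample lines are copied from the question's bar.txt, and the only assumptions are a POSIX shell and a grep that supports -E.

```shell
#!/bin/sh
# Hypothetical helper: report whether ERE pattern $1 matches the line $2.
matches() {
    if printf '%s\n' "$2" | grep -qE -- "$1"; then
        echo yes
    else
        echo no
    fi
}

hash=14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474
remain="$hash  /Users/1337/Test Hash Folder/C [remain]"

# Problem 1: unescaped, [remain] is a bracket expression that matches a
# single character (one of r, e, m, a, i, n), not the literal text.
matches '^.{64}  /Users/1337/Test Hash Folder/C [remain]$' "$remain"      # no
matches '^.{64}  /Users/1337/Test Hash Folder/C \[remain\]$' "$remain"    # yes

# Problem 2: \\ in a pattern is ONE literal backslash, but the listing has
# four in a row, so the count has to be spelled out.
slashes='\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\F::'
matches '^.{65}  /Users/1337/Test Hash Folder/\\F::$' "$slashes"          # no
matches '^[\].{64}  /Users/1337/Test Hash Folder/\\{4}F::$' "$slashes"    # yes
```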

Answer 1 (score: 0)

This is huge, but here's what I did to get it working:

Contents of foo.txt:

/Users/1337/Test Hash Folder/#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/\\\\\\<-6 F 2->::
/Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C [remain]
/Users/1337/Test Hash Folder/D 日本

Contents of bar.txt:

\d4c88a749dcb8d8c09fd8e08e044f66d8d1d9cf9d191c62989697813d26a6b55  /Users/1337/Test Hash Folder/#;."'&,\\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
1ef8a2fb20e09e1866768d824289ffdbda12e565839fa7dda1fe12c4206e5759  /Users/1337/Test Hash Folder/@ss
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\\\\\\\\\<-6 F 2->::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
d96b11b464adc478e458943d080b5efac2e887a18ef4ae785d21506de5594ddc  /Users/1337/Test Hash Folder/A
9f913e331f16e9bc5493a7c4c9480753351fd0098398e32c9b8d4870a63b65ea  /Users/1337/Test Hash Folder/B [LOL].dmg
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

Commands run:

$ sed 's/\\/\\\\/g' foo.txt > baz.txt
$ sed -i '' 's/\$/\\\$/g' baz.txt
$ sed -i '' 's/\^/\\\^/g' baz.txt
$ sed -i '' 's/\*/\\\*/g' baz.txt
$ sed -i '' 's/\?/\\\?/g' baz.txt
$ sed -i '' 's/\[/\\\[/g' baz.txt
$ sed -i '' 's/\]/\\\]/g' baz.txt
$ sed -i '' 's/(/\\\(/g' baz.txt
$ sed -i '' 's/)/\\\)/g' baz.txt
$ sed -i '' 's/}/\\\}/g' baz.txt
$ sed -i '' 's/{/\\\{/g' baz.txt
$ sed -i '' 's/\\\\/\\\\\{2}/g' baz.txt

$ cat baz.txt | grep '\\\\' > backslashes.txt
$ cat baz.txt | grep -v '\\\\' > no_backslashes.txt

$ sed 's/^/^.{64}  /; s/$/$/' no_backslashes.txt > qux.txt
$ sed 's/^/^.{65}  /; s/$/$/' backslashes.txt >> qux.txt
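
The separate passes above can also be folded into a single sed invocation. What follows is a sketch under stated assumptions, not the exact method used in this answer: the function name names_to_patterns is made up, it escapes every ERE metacharacter (including ., + and |, which the commands above do not handle), it writes each backslash as \\ rather than \\{2}, and it uses one prefix with an optional leading backslash (^\\?.{64}) instead of sorting names into .{64} and .{65} files.

```shell
#!/bin/sh
# Hypothetical helper: read literal file names on stdin (one per line) and
# print one anchored ERE pattern per name for the hash listing.
names_to_patterns() {
    # 1. Double each backslash, mirroring how the listing shows such names.
    # 2. Backslash-escape every ERE metacharacter, backslash included.
    # 3. Prefix: optional leading backslash, 64 hash characters, two spaces.
    # 4. Anchor the end so that "C" cannot match "C [remain]".
    sed -e 's/\\/\\\\/g' \
        -e 's/[][\\.^$*+?(){}|]/\\&/g' \
        -e 's/^/^\\\\?.{64}  /' \
        -e 's/$/$/'
}

# Usage sketch with the file names from this answer:
#   names_to_patterns < foo.txt > qux.txt
#   grep -Ef qux.txt bar.txt
printf '%s\n' '/Users/1337/Test Hash Folder/C [remain]' | names_to_patterns
```

Escaping happens before the prefix and suffix are added, so the ^, .{64}, and $ that are meant to stay special are never touched by the escaping step.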

Contents of baz.txt:

/Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\){}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>
/Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::
/Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C \[remain\]
/Users/1337/Test Hash Folder/D 日本

Contents of backslashes.txt:

/Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\){}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>
/Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::

Contents of no_backslashes.txt:

/Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C \[remain\]
/Users/1337/Test Hash Folder/D 日本

Contents of qux.txt:

^.{64}  /Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>$
^.{64}  /Users/1337/Test Hash Folder/C$
^.{64}  /Users/1337/Test Hash Folder/C \[remain\]$
^.{64}  /Users/1337/Test Hash Folder/D 日本$
^.{65}  /Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\){}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]{}\(\)\?<NUL>$
^.{65}  /Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::$

And now, finally, the command we've all been waiting for:

grep -Ef qux.txt bar.txt

Which outputs these lovely strings:

\d4c88a749dcb8d8c09fd8e08e044f66d8d1d9cf9d191c62989697813d26a6b55  /Users/1337/Test Hash Folder/#;."'&,\\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\\\\\\\\\<-6 F 2->::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本
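
For comparison, the escaping can be avoided altogether by comparing file names as plain strings instead of as regular expressions. The sketch below is an alternative, not what was run above. The function name filter_by_name is made up, and it assumes the listing format seen in bar.txt: an optional leading backslash, 64 hash characters, two spaces, then the name, with backslashes in the name doubled whenever the leading backslash is present.

```shell
#!/bin/sh
# Hypothetical helper: $1 = file of literal names, $2 = hash listing.
# Prints the listing lines whose file name appears verbatim in $1.
filter_by_name() {
    awk '
        # Collapse each doubled backslash back to a single one.
        function undouble(s,    out, i, c) {
            out = ""
            for (i = 1; i <= length(s); i++) {
                c = substr(s, i, 1)
                if (c == "\\" && substr(s, i + 1, 1) == "\\") i++
                out = out c
            }
            return out
        }
        NR == FNR { want[$0] = 1; next }          # first file: remember names
        {
            line = $0
            escaped = (substr(line, 1, 1) == "\\")
            if (escaped) line = substr(line, 2)   # drop the marker backslash
            name = substr(line, 67)               # skip 64 characters + 2 spaces
            if (escaped) name = undouble(name)
            if (name in want) print $0
        }
    ' "$1" "$2"
}

# Usage sketch with the file names from this answer:
#   filter_by_name foo.txt bar.txt
```

Because the names are looked up as exact strings, C cannot accidentally match C [remain], and no character needs special treatment.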

Thanks for the help, everyone!