Question

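I'm writing a script that extracts the HTML code from the .html files in a directory. These files contain non-HTML content outside the html tags. I want the output to overwrite the source file.

This is what I have so far, but I can't get it to work.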

#!/bin/bash

for f in `ls .`; do
if [[ $f =~ \.html$ ]] 
then
    cat $f | tr "\n" "|" | grep -o '<html>.*</html>' | sed 's/|/\n/g' > $f
fi
done

Answer 1

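#!/bin/bash

# Redirecting the pipeline straight back into $f can truncate the file
# before it has been read, so write to a temporary file and then move it
# over the original.
for f in *.html; do
    # Skip anything that is not a regular file (e.g. an unmatched glob).
    [ -f "$f" ] || continue
    # Note: "|" is used as a stand-in for newlines, so literal "|"
    # characters in the file will also become newlines.
    tr "\n" "|" < "$f" | grep -o '<html>.*</html>' | sed 's/|/\n/g' > "$f.temp"
    mv "$f.temp" "$f"
done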

Answer 2

You can replace the whole script with:

sed -i '/<[Hh][Tt][Mm][Ll]/,/<\/[Hh][Tt][Mm][Ll]/!d' *.html

Or, if you don't need it to be case-insensitive:

sed -i '/<html/,/<\/html/!d' *.html
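
To see what the range delete does, here is a minimal sketch that runs the same command on a throwaway file. It assumes GNU sed (BSD/macOS sed needs a different form of -i), and the file name sample.html and its contents are made up for the demonstration.

```shell
#!/bin/bash
# Demo of the range-delete approach on a throwaway file.
# Assumption: GNU sed, which accepts -i without a backup suffix.
set -eu

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# sample.html is a hypothetical file: junk lines surround the <html> element.
printf '%s\n' 'junk before' '<html>' '<body>hi</body>' '</html>' 'junk after' \
    > "$tmpdir/sample.html"

# Delete every line that is NOT inside the range from "<html" to "</html".
sed -i '/<html/,/<\/html/!d' "$tmpdir"/*.html

result=$(cat "$tmpdir/sample.html")
printf '%s\n' "$result"
# prints:
# <html>
# <body>hi</body>
# </html>
```

Because sed works line by line here, anything that shares a line with the opening or closing html tag is kept along with it; only whole lines outside the range are removed.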

Trying to make a bash script that extracts HTML code from files
