Question

What is the best way to search for and delete the data between two lines of text, including the first line but not the second?

String 1: SECTION - PAY 500 - to be deleted

Data to be deleted, random lines of text

String 2: SECTION - Pay 400 - to be kept

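This is a Word document of about 3,000 pages, but I also have a plain-text version to work with. Where would I start in writing a bash script for a task like this?

Example of the file contents: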

text 
SECTION - PAY 500    (to be deleted)
text                 (to be deleted)
SECTION - Pay 400
text 
SECTION - PAY 500    (to be deleted)
text                 (to be deleted)
SECTION - Pay 400
text

After the deletion, this should be the result:

text 
SECTION - Pay 400
text
SECTION - Pay 400
text

Answer 1

A solution using standard sed:

sed "/$START/,/$END/ { /$END/"'!'" d; }"

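This means: for the range of lines that starts at a line matching /$START/ and ends at a line matching /$END/, the action { /$END/! d; } is run, which does d (delete) on every line in the range that does not match /$END/.

The "'!'" looks odd, but it keeps the ! out of the double-quoted part, where an interactive bash would otherwise try history expansion on it.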

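As a minimal sketch of the range-plus-negation idea above, the following wraps the command in a function and runs it on a shortened copy of the question's sample. The function name delete_between and the marker values are assumptions made for illustration. The sed script is single-quoted here, with only the variables spliced in through double quotes, which is another way to keep bash away from the !. Note that both markers are used as regular expressions, so a marker containing / or regex metacharacters would need escaping first.

```shell
#!/bin/bash
# Hypothetical marker values, taken from the question's sample data.
START='SECTION - PAY 500'
END='SECTION - Pay 400'

# Delete from each START line up to, but not including, the next END line.
# The sed script itself is single-quoted, so bash never sees the "!";
# only the two variables are spliced in through short double-quoted pieces.
delete_between() {
    sed '/'"$START"'/,/'"$END"'/ { /'"$END"'/!d; }'
}

sample='text
SECTION - PAY 500
to be deleted
SECTION - Pay 400
text'

result=$(printf '%s\n' "$sample" | delete_between)
printf '%s\n' "$result"
```

For a real file, the function could be used as: delete_between < input.txt > output.txt
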
Answer 2

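I think you can simply parse the file line by line, and quickly at that. What you are trying to achieve does not seem too complicated.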

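copy=true
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do
    if [ -n "$copy" ]; then
        # Start marker: stop copying; the marker line itself is dropped.
        if [[ "$line" == "SECTION - PAY 500"* ]]; then copy=; continue; fi
        printf '%s\n' "$line"
    else
        # End marker: resume copying; the marker line itself must be kept.
        if [[ "$line" == "SECTION - Pay 400"* ]]; then copy=true; printf '%s\n' "$line"; fi
    fi
done < inputfile > outputfile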

Done this way, we even have a small Turing machine at our disposal!

Answer 3

Another (less weird ;)) standard sed solution: sed "/$END/ p; /$START/,/$END/ d;"

Side note: some sed versions also support editing the file in place, if you need that.
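
A minimal sketch of the print-then-delete command combined with in-place editing, assuming a sed that accepts -i with an attached backup suffix (GNU and BSD sed do; the option is not part of POSIX). The temporary file and the marker values are made up for illustration. One thing to keep in mind: an END line that is not preceded by a START line would be printed twice by this command, once by p and once by sed's normal output.

```shell
#!/bin/bash
# Hypothetical marker values, taken from the question's sample data.
START='SECTION - PAY 500'
END='SECTION - Pay 400'

# Build a small throwaway input file.
workfile=$(mktemp)
printf '%s\n' 'text' 'SECTION - PAY 500' 'to be deleted' \
              'SECTION - Pay 400' 'text' > "$workfile"

# Print each END line explicitly, then delete the whole START..END range.
# -i.bak rewrites the file in place and keeps the original as "$workfile.bak".
sed -i.bak "/$END/ p; /$START/,/$END/ d;" "$workfile"

edited=$(cat "$workfile")
printf '%s\n' "$edited"

# Clean up the temporary file and its backup.
rm -f "$workfile" "$workfile.bak"
```

Keeping the .bak copy is a cautious choice for a 3,000-page document: the original stays recoverable if the markers match more than intended.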

A complete bash script:

#! /bin/bash

if [ "x$1" = "x-r" ]
then
    regex=1
    shift
else
    regex=0
fi

if [ $# -lt 2 ]
then
    echo "Usage: del.sh [-r] start end" >&2
    exit 1
fi

start="$1"
end="$2"

function matches
{
    # $1 = line, $2 = marker; regex match with -r, exact string match otherwise
    [[ ( $regex -eq 1 && "$1" =~ $2 ) || ( $regex -eq 0 && "$1" == "$2" ) ]]
}

del=0
# -r keeps backslashes literal; leading and trailing whitespace is still trimmed
while read -r line
do
    # end marker, must be printed
    if matches "$line" "$end"
    then
        del=0
    fi
    # start marker, must be deleted
    if matches "$line" "$start"
    then
        del=1
    fi
    if [ $del -eq 0 ]
    then
        echo "$line"
    fi
done

Answer 4

A simple solution: try it this way.

Inputfile.txt

text 
SECTION - PAY 500    
text                 
SECTION - Pay 400
text 
SECTION - PAY 500   
text                 
SECTION - Pay 400
text

Code

awk '/500/{print;getline;next}1' Inputfile.txt | sed '/500/d'

Output

text 
SECTION - Pay 400
text 
SECTION - Pay 400
text
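
The pipeline above works on this sample because exactly one line sits between each pair of markers (getline consumes only the single line after a match) and because /500/ matches any line that contains 500. As a hedged alternative, here is a sketch of a single awk pass with a flag, which should cope with any number of lines between the markers. The function name strip_sections and the skip variable are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: one awk pass with a "skip" flag, so any number of
# lines may sit between the two markers.
strip_sections() {
    awk '
        /SECTION - Pay 400/ { skip = 0 }   # end marker: resume printing, line is kept
        /SECTION - PAY 500/ { skip = 1 }   # start marker: suppress from this line on
        !skip                              # print the line while not skipping
    '
}

input='text
SECTION - PAY 500
first line to delete
second line to delete
SECTION - Pay 400
text'

output=$(printf '%s\n' "$input" | strip_sections)
printf '%s\n' "$output"
```

The end marker is tested before the start marker so that the end line clears the flag and is still printed, matching the "include the first line but not the second" requirement.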

Delete data between two lines
