Question

Good afternoon,

I am trying to write a bash script that cleans up some data output files. The files look like this:

/path/
/path/to
/path/to/keep
/another/
/another/path/
/another/path/to
/another/path/to/keep

I would like to end up with this:

/path/to/keep
/another/path/to/keep

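I want to loop through the lines of the file, check whether the next line contains the current line, and, if it does, delete the current line from the file. Here is my code: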

for LINE in $(cat bbutters_data2.txt)
do
    grep -A1 ${LINE} bbutters_data2.txt
    if [ $? -eq 0 ]
    then
       sed -i '/${LINE}/d' ./bbutters_data2.txt
    fi
done

Answer 1

Assuming your input file is sorted the way you show it:

$ awk 'NR>1 && substr($0,1,length(last))!=last {print last;} {last=$0;} END{print last}' file
/path/to/keep
/another/path/to/keep
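
The command above prints the result to standard output, while the question asks for the file itself to be cleaned. A minimal sketch of one way to write the result back, assuming a temporary file is acceptable; clean_in_place is a hypothetical helper name, not part of the original answer:

```shell
#!/bin/bash
# clean_in_place is a hypothetical helper name, not part of the answer.
clean_in_place() {
    local file=$1 tmp status=0
    tmp=$(mktemp) || return 1
    # Same awk program as above: filter the file into a temporary file,
    # then copy the result back (cat instead of mv keeps the original
    # file's permissions).
    awk 'NR>1 && substr($0,1,length(last))!=last {print last;} {last=$0;} END{print last}' \
        "$file" >"$tmp" && cat "$tmp" >"$file" || status=1
    rm -f -- "$tmp"
    return "$status"
}
```

With the file name from the question, it would be called as clean_in_place bbutters_data2.txt.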

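How it works

awk reads the input file line by line. Each time a new line is read, it is compared with the previous line. If the new line does not start with the previous line, the previous line is printed. In more detail: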

NR>1 && substr($0,1,length(last))!=last {print last;}

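If this is not the first line, and the current line $0 does not start with the previous line (stored in last), print the previous line.

last=$0

Update the variable last to the current line.

END{print last}

After the whole file has been read, print the final line.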
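
The three pieces described above can be put together in a commented form. This is only a sketch: the awk program is the one from this answer spread over several lines, drop_prefix_lines is a hypothetical wrapper name, and the sample input is the one from the question.

```shell
#!/bin/bash
# drop_prefix_lines is a hypothetical wrapper name; the awk program is the
# one-liner from this answer, spread over several lines.
drop_prefix_lines() {
    awk '
        # From the second line on: if the current line does not start
        # with the previous line, the previous line is a line to keep.
        NR > 1 && substr($0, 1, length(last)) != last { print last }
        # Remember the current line for the next comparison.
        { last = $0 }
        # Nothing follows the final line, so it is always printed.
        END { print last }
    ' "$@"
}

printf '%s\n' /path/ /path/to /path/to/keep \
    /another/ /another/path/ /another/path/to /another/path/to/keep |
    drop_prefix_lines
```

Run on the sample data, this prints /path/to/keep and /another/path/to/keep.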

Answer 2

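I like the awk solution, but bash itself can handle the task. Note: both solutions (awk and bash) require each shorter path to be listed before the longer paths that contain it, in ascending order. Here is an alternative bash solution (bash-only because of the glob-match operation):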
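
If that ordering is not guaranteed, one possible workaround for the plain prefix comparison used by the awk answer is to sort the input first. The sketch below assumes that reordering the output is acceptable; drop_prefix_lines_unsorted is a hypothetical name. In the C locale, sort compares bytes, so every line is followed directly by the lines that extend it.

```shell
#!/bin/bash
# drop_prefix_lines_unsorted is a hypothetical name. Assumption: the order
# of the output lines does not matter, because sort rearranges them.
drop_prefix_lines_unsorted() {
    # Byte-order sort puts each line directly before the lines that start
    # with it; the awk program from Answer 1 then drops those prefixes.
    LC_ALL=C sort -- "$@" |
        awk 'NR>1 && substr($0,1,length(last))!=last {print last} {last=$0} END{print last}'
}

printf '%s\n' /path/to/keep /another/ /path/ /another/path/to/keep /path/to |
    drop_prefix_lines_unsorted
```

This only covers the string-prefix test of the awk version; the glob test in the bash script treats path boundaries differently, so sorting alone may not be enough there.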

#!/bin/bash

fn="${1:-/dev/stdin}"   ## accept filename or stdin

[ -r "$fn" ] || {       ## validate file is readable
    printf "error: file not found: '%s'\n" "$fn"
    exit 1
}

declare -i cnt=0        ## flag for 1st iteration

## for each line in file; the extra test also processes a final line
## that has no trailing newline
while IFS= read -r line || [ -n "$line" ]; do

    ## if 1st iteration, fill 'last', increment 'cnt', continue
    [ $cnt -eq 0 ] && { last="$line"; ((cnt++)); continue; }

    ## while 'line' is a child of 'last', continue, else print
    [[ $line = "${last%/}"/* ]] || printf "%s\n" "$last"

    last="$line"        ## update last=$line
done <"$fn"

[ ${#line} -eq 0 ] &&   ## print last line (updated for non POSIX line end)
    printf "%s\n" "$last" || 
    printf "%s\n" "$line"

exit 0

Output

$ bash path_uniql.sh < dat/incpaths.txt
/path/to/keep
/another/path/to/keep
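
The glob test in the script is slightly stricter than the substr comparison in the awk answer: it only matches when the previous line is followed by a slash, so a sibling that merely shares leading characters is not treated as a child. A small sketch of that test in isolation; is_child is a hypothetical name, and /path/tomato is made-up sample data:

```shell
#!/bin/bash
# is_child is a hypothetical name; the [[ ]] test is the one from the script.
is_child() {
    local last=$1 line=$2
    # The quoted part is matched literally, the trailing /* is a glob.
    [[ $line = "${last%/}"/* ]]
}

for candidate in /path/to/keep /path/tomato; do
    if is_child /path/to "$candidate"; then
        printf '%s is below /path/to\n' "$candidate"
    else
        printf '%s is not below /path/to\n' "$candidate"
    fi
done
```

Here /path/to/keep is reported as below /path/to and /path/tomato is not, whereas a plain prefix comparison would accept both.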

Bash script to remove redundant lines