Question

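I am trying to remove duplicates from a file. The contents are numbers and names, and the duplicated names can look like this, for example: ABC ABCxxyy ABC123 ABClmn etc. (so here I only want ABC to remain in my file). To do this I wrote the following code. Currently it works through file reads and writes. I want to change this code to use arrays, but I can't figure out how.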

Here is the current code:

for h in `cat name.list`
do
count=`grep -c $h name.list`
if (( $count >= 1 ))
then
    echo $h >> name.list.new            #building the new list
    grep -v $h name.list > name.list.tmpcopy    #rebuilding the name.list file...
    mv name.list.tmpcopy name.list
fi
done

I tried the following, but I get the original list back as output:

while read line
do
    array+=("$line")
done < name.list

#loop thru the array:...
for ((i=0; i < ${#array[*]}; i++))
do
    h=${array[i]}
    match=$(echo "${array[@]:0}" | tr " " "\n" | grep -c $h)
    if (( $match >= 1 ))
    then
        # remove all matched names from array..... Longest match from front of string(s)
        array=${array[@]##$h}

        #save the current name to new array
        array3[${#array3[*]}]=$h
    fi
done

for ELEMENT in "${array3[@]}"
do
 echo $ELEMENT
done > name.list.new

Answer 1

Try this:

declare -a names=( $(<name.list) )

len=${#names[@]}

for i in $(seq 0 $len); do
  if [ "${names[$i]}" != "" ]; then
    m=${names[$i]}
    for j in $(seq 0 $len); do
      if [ $i -ne $j ]; then
        if [ "$m" == "${names[$j]:0:${#m}}" ]; then
          unset names[$j]
        fi
      fi
    done
  fi
done

for name in "${names[@]}"; do
  echo $name
done > name.list.new

Step by step:

The code first declares an array

declare -a names=( ... )

and reads the contents of name.list into it:

$(<name.list)
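
One caveat on this step: an unquoted `$(<name.list)` inside the array assignment is split on whitespace and is subject to glob expansion, so a name containing a space or a `*` would not arrive in the array intact. Below is a minimal sketch of a line-by-line read, assuming one name per line. `read_names` is a hypothetical helper introduced here for illustration, not part of the answer above.

```shell
#!/usr/bin/env bash
# read_names FILE: fill the global array "names" with one element per line.
# Hypothetical helper, shown only to illustrate a safer read.
read_names() {
  local line
  names=()
  # IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
  # "|| [[ -n $line ]]" also picks up a final line that lacks a newline.
  while IFS= read -r line || [[ -n $line ]]; do
    names+=("$line")
  done < "$1"
}
```

It would be called as `read_names name.list`. On bash 4 or later, `mapfile -t names < name.list` does much the same in one line.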

It then iterates over all indices of the array:

for i in $(seq 0 $len); do
  ...
done
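
Note that `seq 0 $len` counts from 0 up to and including `$len`, which is one index past the last element; the loop still works because that extra field is empty. If you would rather not rely on that, a C-style loop stops at `len - 1`. A small sketch with made-up sample data (the `visited` array exists only to show which indices get touched):

```shell
#!/usr/bin/env bash
names=(ABC ABCxxyy XYZ)   # sample data, assumed for illustration
len=${#names[@]}

# seq 0 "$len" would produce 0 1 2 3 here, one index too many.
# The arithmetic for-loop below visits 0 1 2 only.
visited=()
for ((i = 0; i < len; i++)); do
  visited+=("$i:${names[i]}")
done
printf '%s\n' "${visited[@]}"
```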

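As a safeguard, empty fields are skipped (this covers fields already removed by unset, as well as the index $len that seq 0 $len produces one past the end of the array):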

  if [ "${names[$i]}" != "" ]; then
    ...
  fi

A non-empty field is copied into the variable $m (for convenience):

    m=${names[$i]}

The inner loop then iterates over all indices of the array except the one currently being processed by the outer loop ($i):

    for j in $(seq 0 $len); do
      if [ $i -ne $j ]; then
        ...
      fi
    done

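If the first ${#m} characters of the field at index $j (that is, as many characters as $m is long) are identical to $m, that field is removed: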

        if [ "$m" == "${names[$j]:0:${#m}}" ]; then
          unset names[$j]
        fi
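
The substring comparison above is a "starts with" test. In bash the same check can also be written as a pattern match inside `[[ ]]`, which some people find easier to read. A minimal sketch, assuming sample data; `has_prefix` is a hypothetical helper name, not something from the answer:

```shell
#!/usr/bin/env bash
# has_prefix STRING PREFIX: succeed if STRING starts with PREFIX.
has_prefix() {
  # Quoting "$2" makes it match literally; the unquoted * matches any remainder.
  [[ $1 == "$2"* ]]
}

names=(ABC ABCxxyy XYZ)   # sample data, assumed for illustration
m=${names[0]}
for j in "${!names[@]}"; do
  # Skip index 0 (the field m came from), drop every other field starting with m.
  if (( j != 0 )) && has_prefix "${names[j]}" "$m"; then
    unset "names[$j]"
  fi
done
printf '%s\n' "${names[@]}"
```

Quoting the argument to `unset` keeps the shell from treating `names[$j]` as a glob pattern. After the loop the array is sparse (indices 0 and 2 remain), which `"${names[@]}"` handles fine.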

Finally, the remaining values are written to the output file:

for name in "${names[@]}"; do
  echo $name
done > name.list.new
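
A side note on the output step: an unquoted `echo $name` re-splits the value and expands globs, and `echo` may treat a value such as `-n` as an option. If the names can contain anything unusual, `printf '%s\n'` with a quoted expansion is the more robust way to write them out. A minimal sketch; `write_names` is a hypothetical helper used only for illustration:

```shell
#!/usr/bin/env bash
# write_names FILE NAME...: write each name on its own line to FILE.
write_names() {
  local out=$1
  shift
  if (( $# == 0 )); then
    # No names left: create an empty file. A bare printf '%s\n' with no
    # arguments would still emit one blank line.
    : > "$out"
  else
    printf '%s\n' "$@" > "$out"
  fi
}
```

With the array from the answer this would be `write_names name.list.new "${names[@]}"`; the quoted expansion skips the fields removed by `unset`.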

Remove duplicates from a bash array and save to a file
