Question

Below are the values I have in an array:

10.106.86.93,A1,3
10.106.86.93,A2,3
10.106.86.93,A2,3
10.106.86.93,A3,3
10.106.86.93,A3,3
10.106.86.93,A4,3

I need to loop over them, and if the last value equals 3, the values of the second column must be merged.

For example:

10.106.86.93  A1,A2,A2,A3,A3,A4  3

I have been trying a loop, but I am not using it correctly:

while read -r line
do
    StatusValue= $line | awk -F, '{print $NF}'
    if [[${StatusValue} == "3"}]] then
       echo $line | awk -F,'{print $2}'
    fi

done <<< ${Dell_Data_Status_3}

Here I am trying to print the second value of the line when the status is 3, but I cannot get any output.

Error:

./SortFile.sh: line 30: syntax error near unexpected token `fi'
./SortFile.sh: line 30: `    fi'

Please let me know what is wrong here.

Answer 1

Let's start with simple bash syntax:

Below are the values I have in an array

OK, so we have a bash array:

arr=(
10.106.86.93,A1,3
10.106.86.93,A2,3
10.106.86.93,A2,3
10.106.86.93,A3,3
10.106.86.93,A3,3
10.106.86.93,A4,3
)

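Need to loop over them

OK. First, we need to output the array as a newline-separated list. The following outputs the array, one element per line: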

$ printf "%s\n" "${arr[@]}"

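Then we need to read the array elements and split them on the comma separator. We use the IFS variable to control which characters bash splits the input on: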

printf "%s\n" "${arr[@]}" |
while IFS=, read -r ip A num; do
     : # TODO body
done

OK. Now we can check the value of the third column and, if it equals 3, output the second column:

printf "%s\n" "${arr[@]}" |
while IFS=, read -r ip A num; do
     if [ "$num" = 3 ]; then
          echo "$A"
     fi
done

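Note that every space matters. The if [[${StatusValue} == "3"}]] then in your code is thoroughly invalid: you need a space between [[ and ${.., a space between "3" and ]], and the stray } is invalid. The then keyword also needs a ; or a newline in front of it. Remember, you talk to the computer through your keyboard and nothing else, so every keystroke counts.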
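As a hedged sketch of that advice applied to the loop from the question: the sample data and the function name print_status_3 are made up for illustration, and this is only one possible repair. Besides the spacing, it also captures the awk output with $(...) instead of writing StatusValue= $line | awk ..., which does not assign anything useful.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for the asker's variable.
Dell_Data_Status_3='10.106.86.93,A1,3
10.106.86.93,A2,3
10.106.86.93,A4,2'

# print_status_3 is an illustrative name, not from the original script.
print_status_3() {
    local line StatusValue
    while read -r line; do
        # Command substitution captures awk's output; no space after '='.
        StatusValue=$(awk -F, '{print $NF}' <<<"$line")
        # Spaces inside [[ ]], no stray '}', and ';' before 'then'.
        if [[ ${StatusValue} == "3" ]]; then
            awk -F, '{print $2}' <<<"$line"
        fi
    done <<<"$1"
}

print_status_3 "$Dell_Data_Status_3"
```

With the sample data above, only the second column of the two rows whose status is 3 is printed.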

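Now the hardest part:

if the last value equals 3, the values of the second column must be merged

Well, this is done simply and quickly with an awk script. What we need to do is create a map: we need to map the value of the third column to the other two columns.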

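But let's do it the simple, stupid, very, very slow way:

Identify the unique values in the third column.
For each unique value in the third column:
1. Get all the lines that have that value as the third column
2. Take the first column from any one of those lines
3. Extract the second column from the filtered lines and join the values
4. Output one line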

# just to have the array as a string
arrstr=$(printf "%s\n" "${arr[@]}")

# cut only the third column and get unique values
<<<"$arrstr" cut -d, -f3 | sort -u |
# for each unique third column value
while read -r num; do
    # get the lines that have that value as the third column
    # (-F, sets the input separator; == compares, a single = would assign)
    filtered=$(<<<"$arrstr" awk -F, -v num="$num" '$3 == num')
    # get the values of the second field only
    # substitute newline for comma
    # remove the trailing comma
    second_field_sum=$(<<<"$filtered" cut -d, -f2 | tr '\n' ',' | sed 's/,$//')
    # get the value of the first field (from the first line)
    ip=$(<<<"$filtered" head -n1 | cut -d, -f1)
    # output
    printf "%s %s %s\n" "$ip" "$second_field_sum" "$num"
done

Please check your scripts for errors on shellcheck.net. Most beginner mistakes (missing quotes, typos, wrong redirections, syntax errors) are easy to fix just by listening to the shellcheck messages.

Answer 2

Borrowing some stuff from @KamilCuk, and repaying the debt with a ++, thanks:

$ arr=(
10.106.86.93,A1,3
10.106.86.93,A2,3
10.106.86.93,A2,3
10.106.86.93,A3,3
10.106.86.93,A3,3
10.106.86.93,A4,3
)

Processing with awk:

$ printf "%s\n" "${arr[@]}" | 
awk -F, '                          # input separator to a comma
$3==3 {                            # when the third field is 3
    f2=$2=f2 (f2==""?"":",") $2    # update the $2 to 2nd field var f2 and
    out=$0                         # ... keep printable record in out var
}
END { print out }'                 # output here

Output:

10.106.86.93 A1,A2,A2,A3,A3,A4 3

Of course, the data could be in a file instead of an array.
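For instance, a sketch of the same idea reading from a file. The temporary file created with mktemp stands in for whatever file actually holds the data, merge_file is a hypothetical helper name, and the awk program is a simplified variant of the one above rather than a copy of it:

```shell
#!/usr/bin/env bash
# Hypothetical data file; mktemp keeps the sketch self-contained.
data_file=$(mktemp)
trap 'rm -f "$data_file"' EXIT

printf '%s\n' \
    10.106.86.93,A1,3 \
    10.106.86.93,A2,3 \
    10.106.86.93,A2,3 \
    10.106.86.93,A3,3 \
    10.106.86.93,A3,3 \
    10.106.86.93,A4,3 > "$data_file"

# merge_file is an illustrative name; it takes the file as its argument.
merge_file() {
    awk -F, '
    $3 == 3 {
        # Collect the second field of every status-3 row, comma-joined.
        f2 = (f2 == "" ? "" : f2 ",") $2
        ip = $1
    }
    END { if (f2 != "") print ip, f2, 3 }' "$1"
}

merge_file "$data_file"
```

awk takes the file name as an ordinary argument, so no printf pipeline is needed in this case.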

Update, with the output separator also set to a comma:

$ printf "%s\n" "${arr[@]}" | 
awk -F, -v OFS=, '                 # input and output separators to a comma
$3==3 {                            # when the third field is 3
    f2=$2=f2 (f2==""?"":",") $2    # update the $2 to 2nd field var f2 and
    out=$0                         # ... keep printable record in out var
}
END { print out }'                 # output here

Answer 3

You can do it in pure bash (plus GNU coreutils) with the following script:

#! /bin/bash
set -euo pipefail

# https://stackoverflow.com/questions/1527049
function join_by { local IFS="$1"; shift; echo "$*"; }

Dell_Data_Status_3="$(cat data)" # I made a standalone script and assume
                                 # this variable contains the raw data

# Get the list of first column elements, sorted and deduplicated
readarray -t first_col <<<"$(echo "${Dell_Data_Status_3}" | cut -d ',' -f 1 | sort -u)"

for i in "${first_col[@]}" ; do
    # get the values of the second column for every first column element
    readarray -t second_col <<<"$(echo "${Dell_Data_Status_3}" | grep "$i" | cut -d ',' -f 2)"

    # print the data. If you need another value than 3 at the end,
    # you may want to consider a loop on this value
    echo "$i $(join_by ',' "${second_col[@]}") 3"
done
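A possible variant, offered as a sketch rather than a replacement: it checks the status column explicitly instead of assuming it is 3, and it avoids grep, which would treat the dots in the address as regular-expression wildcards. It assumes bash 4 or newer for associative arrays; merge_status and the sample data are illustrative names and values.

```shell
#!/usr/bin/env bash
# merge_status WANTED_STATUS DATA
# Hypothetical helper: DATA is newline-separated "ip,value,status" text.
merge_status() {
    local want=$1 data=$2 ip col status
    local -A joined=()   # ip -> comma-joined second-column values
    local -a order=()    # ips in first-seen order
    while IFS=, read -r ip col status; do
        # Skip rows whose status is not the one we want.
        [[ $status == "$want" ]] || continue
        if [[ -n ${joined[$ip]+set} ]]; then
            joined[$ip]+=",$col"
        else
            joined[$ip]=$col
            order+=("$ip")
        fi
    done <<<"$data"
    for ip in "${order[@]}"; do
        printf '%s %s %s\n' "$ip" "${joined[$ip]}" "$want"
    done
}

# Made-up sample data for illustration.
sample='10.106.86.93,A1,3
10.106.86.93,A2,3
10.106.86.93,A4,2'

merge_status 3 "$sample"
```

Everything happens in a single pass over the data, so no external commands are started per row.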

Loop through an array and, based on the value of one column, concatenate the values of another column
