Question

I have the following list of records.

Source:

a,yes
a,yes
b,No
c,N/A
c,N/A
c,N/A
d,xyz
d,abc
d,abc

Output:

a, Yes 2
b, No 1
c, N/A 3
d, xyz 1
d, abc 2

c, N/A "File is not correct"

Here 'yes' and 'No' are the acceptable words. If the count of any other word is greater than the count of 'yes' or 'No' for a single $1 value, then we print a statement like "File is not good".

I tried the following script:

awk -F, '{a[$1]++;}END{for (i in a)print i, a[i];}' filetest.txt

Answer 1

If you are not worried about the order of the output (being the same as in Input_file), then the following may help you.

awk -F, '{array[$1", "$2]++;} /yes/{y++;next} /No/{n++;next} /N\/A/{count++;next} END{;for(i in array){printf("%s %s%s\n",i,array[i],(count>y && count>n) && i ~ /N\/A/?RS i" File is not correct":"")}}'  Input_file

EDIT: Adding a non-one-liner form of the solution now too.

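awk -F, '{
  array[$1", "$2]++          # count each "$1, $2" pair
}
/yes/{
  y++                        # lines containing yes
  next
}
/No/{
  n++                        # lines containing No
  next
}
/N\/A/{
  count++                    # lines containing N/A
  next
}
END{
  for(i in array){
    # append a second line for N/A keys when N/A outnumbers both yes and No
    printf("%s %s%s\n",i,array[i],(count>y && count>n) && i ~ /N\/A/?RS i" File is not correct":"")
  }
}' Input_file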

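EDIT2: As per the OP, N/A should not be hard-coded, so the following code counts the occurrences of the string yes, of the string No, and of each remaining second-field value. It then compares the highest of those remaining counts against the yes and No counts and, depending on the result, prints the lines as per the OP's request.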

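awk -F, '{
  array[$1", "$2]++          # count each "$1, $2" pair
}
/yes/{
  y++                        # lines containing yes
  next
}
/No/{
  n++                        # lines containing No
  next
}
{
  count[$2]++                # count every other second-field value
}
END{
  # val = highest count among the other values
  for(i in count){
    val=val>count[i]?val:count[i]
  }
  for(i in array){
    printf("%s %s%s\n",i,array[i],(val>y && val>n) &&(i !~ /yes/ && i !~ /No/)?RS i" File is not correct":"")
  }
}' Input_file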

After running the above code, I get the following.

./script.ksh
d, xyz 1
d, xyz File is not correct
c, N/A 3
c, N/A File is not correct
b, No 1
a, yes 2
d, abc 2
d, abc File is not correct
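
In the output above, every key other than yes and No gets a "File is not correct" line, because val holds the single highest count among the other values. If only the pairs whose own count exceeds both totals should be flagged, as in the output shown in the question, a minimal sketch could look like this. The function name flag_pairs is hypothetical, and the sketch assumes that the second field is compared exactly (rather than by a regex match on the whole line) and that a pair must beat both the yes and the No totals.

```shell
# flag_pairs is a hypothetical helper name; pass it the input file.
flag_pairs() {
  awk -F, '
    { key = $1 ", " $2; pair[key]++ }   # count each "$1, $2" pair
    $2 == "yes" { y++; next }           # total of exact "yes" values
    $2 == "No"  { n++; next }           # total of exact "No" values
    { other[key] = 1 }                  # remember pairs that are neither
    END {
      for (p in pair) {
        print p, pair[p]
        # assumption: flag a pair only when its own count beats both totals
        if ((p in other) && pair[p] > y + 0 && pair[p] > n + 0)
          print p, "File is not correct"
      }
    }' "$@"
}
```

It would be called as flag_pairs Input_file. As with the code above, the order of the output lines is whatever the for (p in pair) loop yields.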

Answer 2

Using GNU awk for true multi-dimensional arrays:

$ cat tst.awk
BEGIN { FS=","; OFS=", " }
{ cnt[$1][$2]++ }
END {
    for (key in cnt) {
        for (val in cnt[key]) {
            cur = cnt[key][val]
            print key, val " " cur
            if (tolower(val) ~ /^(yes|no)$/) {
                maxGood = (maxGood > cur ? maxGood : cur)
            }
            else {
                badCnt[key][val] = cur
            }
        }
    }

    print ""
    for (key in badCnt) {
        for (val in badCnt[key]) {
            if (badCnt[key][val] > maxGood) {
                print key, val " File is not correct"
            }
        }
    }
}

$ awk -f tst.awk file
a, yes 2
b, No 1
c, N/A 3
d, abc 2
d, xyz 1

c, N/A File is not correct

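Use tolower() in other places too, or drop it altogether, depending on whether your $2 data really can be upper or lower case or that was just a mistake in your example, and on whether you want that treated as an error.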

The output order will be random, courtesy of the in operator; it is easy to change to any other order if you like.
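
One way to get a fixed order, sketched here under the assumption that GNU awk is available as gawk, is to set PROCINFO["sorted_in"] before looping. The wrapper name sorted_counts is hypothetical, and the body is a trimmed-down version of the counting loop only, without the "File is not correct" check.

```shell
# sorted_counts is a hypothetical wrapper; it needs GNU awk, because
# PROCINFO["sorted_in"] and arrays of arrays are gawk extensions.
sorted_counts() {
  gawk '
    BEGIN {
      FS = ","; OFS = ", "
      # traverse every "for (x in array)" loop by index, as strings, ascending
      PROCINFO["sorted_in"] = "@ind_str_asc"
    }
    { cnt[$1][$2]++ }
    END {
      for (key in cnt)
        for (val in cnt[key])
          print key, val " " cnt[key][val]
    }' "$@"
}
```

Called as sorted_counts file, this prints the counts ordered by the first field and then by the second. Other predefined orders such as "@ind_str_desc" or "@val_num_desc" can be substituted.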

Answer 3

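#!/bin/sh

FILE=1.txt

# Print each distinct line of the file followed by how often it occurs.
sort -u "$FILE" | while IFS= read -r r; do
    # -c: count matching lines, -x: match whole lines only, -F: fixed string, not a regex
    count=$(grep -cxF -e "$r" "$FILE")
    echo "$r $count"
done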
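
The per-line counts can also be produced without a loop, since uniq -c already counts adjacent duplicate lines. The following is only a sketch: the function name count_lines is hypothetical, and the awk step is there just to move the count from the front of each line to the end.

```shell
# count_lines is a hypothetical helper; pass it the file to summarise.
count_lines() {
  # sort groups identical lines together, uniq -c prefixes each with its count,
  # and awk strips that prefix and prints the count after the line instead
  sort "$@" | uniq -c |
    awk '{ n = $1; sub(/^[ \t]*[0-9]+[ \t]+/, ""); print $0, n }'
}
```

For the sample data in the question, count_lines 1.txt prints lines such as a,yes 2 and c,N/A 3. Like the script above, it only counts and does not apply the "File is not correct" check.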
