Question

I have a .txt file:

ID Number        Name                         Fed Sex Tit  Wtit
4564             A B M Yusop, Tapan           BAN M
59841212         A Rafiq                      IND F   WFM  WFM
19892            Aadel F , Arvin              IND M 
.
.
.

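I need to count, from the Linux command line, how many females (F) and males (M) are in this file. I'm new to the Linux shell, so grep is the only command I thought of, but the Name column can also contain "M" and "F".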

Any suggestions?

Answer 1

I would use awk for this (find the column, then count):

$ awk '
# first line
NR == 1 { 
    if (col = index($0, "Sex")) {
        next # skip rest of script for this line
    }

    print "Could not find the required header" > "/dev/stderr"
    exit 1
} 

# all remaining lines (the header line was skipped above)
{ 
    # increment counts of each `M` or `F`
    ++count[substr($0, col, 1)]
} 

END { 
    # loop through count array and print
    for (i in count) print i, count[i] 
}' file
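
To try the idea on a small input, here is a minimal sketch. It assumes a hypothetical temporary sample file built with printf so the columns are truly fixed-width, and it only counts the letters M and F, so blank or filler lines do not end up in the totals. The widths in `fmt` are an assumption chosen to resemble the question's layout, not something taken from the real file.

```shell
# Build a small fixed-width sample (hypothetical data mirroring the question).
sample=$(mktemp)
fmt='%-17s%-29s%-4s%-4s%-5s%s\n'
{
  printf "$fmt" 'ID Number' 'Name' 'Fed' 'Sex' 'Tit' 'Wtit'
  printf "$fmt" '4564' 'A B M Yusop, Tapan' 'BAN' 'M' '' ''
  printf "$fmt" '59841212' 'A Rafiq' 'IND' 'F' 'WFM' 'WFM'
  printf "$fmt" '19892' 'Aadel F , Arvin' 'IND' 'M' '' ''
} > "$sample"

counts=$(awk '
NR == 1 {
    # locate the character position of the Sex column in the header
    col = index($0, "Sex")
    if (!col) {
        print "Could not find the Sex header" > "/dev/stderr"
        bad = 1
        exit 1
    }
    next
}
{
    # look only at the single character under the Sex header
    c = substr($0, col, 1)
    if (c == "M" || c == "F") count[c]++
}
END {
    # print in a fixed order; a missing key prints as 0
    if (!bad) printf "F %d\nM %d\n", count["F"], count["M"]
}' "$sample")

printf '%s\n' "$counts"
rm -f "$sample"
```

Printing the two keys explicitly, instead of looping over the array, keeps the output order stable, since awk does not guarantee the order of `for (i in count)`.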

Answer 2

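First use cut to extract a single column. Something like this (40 is only an illustration; use the character position where the Sex column starts in your file):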

cut -c40 < file.txt # gets the 40th character on each line

Then count the distinct values:

cut -c40 < file.txt | sort | uniq -c
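
If you would rather not count characters by hand, the position can be looked up from the header first. This is a rough sketch, assuming the file is strictly fixed-width and the header contains the word "Sex" exactly once; the sample file and the widths in `fmt` are hypothetical.

```shell
# Build a small fixed-width sample (hypothetical data mirroring the question).
sample=$(mktemp)
fmt='%-17s%-29s%-4s%-4s%-5s%s\n'
{
  printf "$fmt" 'ID Number' 'Name' 'Fed' 'Sex' 'Tit' 'Wtit'
  printf "$fmt" '4564' 'A B M Yusop, Tapan' 'BAN' 'M' '' ''
  printf "$fmt" '59841212' 'A Rafiq' 'IND' 'F' 'WFM' 'WFM'
  printf "$fmt" '19892' 'Aadel F , Arvin' 'IND' 'M' '' ''
} > "$sample"

# Character position of "Sex" in the header line (0 if it is missing).
col=$(awk 'NR == 1 { print index($0, "Sex"); exit }' "$sample")

if [ "$col" -gt 0 ]; then
  # Skip the header, keep one character per line, then tally.
  # The final awk only normalises "count letter" into "letter count".
  result=$(tail -n +2 "$sample" | cut -c"$col" | sort | uniq -c |
           awk '{ print $2, $1 }')
  printf '%s\n' "$result"
else
  echo 'Could not find the Sex header' >&2
fi

rm -f "$sample"
```

Skipping the header with `tail -n +2` keeps the "S" of "Sex" out of the tally.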

Answer 3

In bash with GNU grep, you can write:

IFS= read -r header < file          # read the first line of the file
prefix=${header%%Sex *}             # remove "Sex " and everything after it
skip_regex=${prefix//?/.}           # replace all chars with "."

# then find the letters and count them
grep -oP "^$skip_regex\\K[MF]" file | sort | uniq -c

Output

  1 F
  2 M
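
If your grep lacks `-P`, a similar effect can probably be had with an extended regex and a repeat count instead of `\K`. This is only a sketch of that variant, not the approach above: it assumes plain ASCII text (so character and byte counts agree) and uses a hypothetical sample file built with printf.

```shell
# Build a small fixed-width sample (hypothetical data mirroring the question).
sample=$(mktemp)
fmt='%-17s%-29s%-4s%-4s%-5s%s\n'
{
  printf "$fmt" 'ID Number' 'Name' 'Fed' 'Sex' 'Tit' 'Wtit'
  printf "$fmt" '4564' 'A B M Yusop, Tapan' 'BAN' 'M' '' ''
  printf "$fmt" '59841212' 'A Rafiq' 'IND' 'F' 'WFM' 'WFM'
  printf "$fmt" '19892' 'Aadel F , Arvin' 'IND' 'M' '' ''
} > "$sample"

header=$(head -n 1 "$sample")
prefix=${header%%Sex *}      # everything before "Sex "
width=${#prefix}             # number of characters to skip on each line

# Count lines whose character right after the skipped prefix is M or F.
males=$(grep -cE "^.{$width}M" "$sample")
females=$(grep -cE "^.{$width}F" "$sample")

printf 'F %s\nM %s\n' "$females" "$males"
rm -f "$sample"
```

Because the pattern is anchored with `^` and skips a fixed number of characters, an "M" or "F" inside the Name column cannot match.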
