Question

I have the following data file:

136110828724515000007700877  
137110904734015000007700877  
138110911724215000007700877  
127110626724515000007700871  
127110626726015000007700871  
131110724724515000007700871  
134110814725015000007700871  
134110814734015000007700871  
104110122726027000001810072  
107110208724527000002900000

I want to extract the values of the 3rd column, i.e. the value 6787714447. I tried:

awk "print $3" <filename>

But it doesn't work. What should I use instead?

Answer 1

This is a better job for cut:

$ cut -c 3 < file
6
7
8
7
7
1
4
4
4
7

From man cut:

-c, --characters=LIST

select only these characters
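As a small sketch of what LIST can look like: it may be a single position, a range, or a comma-separated combination. The file name data.txt is an assumption for illustration; it holds the first two lines of the question's data.

```shell
# Hypothetical sample file holding the first two lines of the question's data.
printf '%s\n' 136110828724515000007700877 137110904734015000007700877 > data.txt

# Single character position (1-based): the 3rd character of each line.
third=$(cut -c 3 data.txt)

# A range: characters 1 through 3.
first_three=$(cut -c 1-3 data.txt)

# A comma-separated list: characters 1 and 3 only.
first_and_third=$(cut -c 1,3 data.txt)

printf '%s\n' "$third" "$first_three" "$first_and_third"
```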

To get them all on a single line, pipe the output to tr -d '\n':

$ cut -c 3 < file | tr -d '\n'
6787714447

Or even pipe that to sed so the output ends with a newline:

$ cut -c 3 < file | tr -d '\n' | sed 's/$/\n/'
6787714447
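If a single process is preferred over the three-stage pipeline, one possible sketch is an awk program that prints each third character without a newline and then emits one newline at the end. The file name data.txt is an assumption; it holds the first three lines of the question's data.

```shell
# Hypothetical sample file with the first three lines of the question's data.
printf '%s\n' 136110828724515000007700877 137110904734015000007700877 138110911724215000007700877 > data.txt

# printf suppresses the per-line newline; the END block adds one trailing newline.
joined=$(awk '{ printf "%s", substr($0, 3, 1) } END { print "" }' data.txt)
echo "$joined"
```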

Using grep:

$ grep -oP "^..\K." file
6
7
8
7
7
1
4
4
4
7

sed:

$ sed -r 's/..(.).*/\1/' file
6
7
8
7
7
1
4
4
4
7

awk:

$ awk '{split ($0, a, ""); print a[3]}' file
6
7
8
7
7
1
4
4
4
7
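For reference, the attempt in the question fails for three reasons: inside double quotes the shell expands $3 before awk ever sees it, an awk action has to be wrapped in braces, and because the lines contain no whitespace awk treats each whole line as a single field, so $3 is empty. A short sketch of that last point, with data.txt as an assumed file name:

```shell
# Hypothetical sample file holding the first two lines of the question's data.
printf '%s\n' 136110828724515000007700877 137110904734015000007700877 > data.txt

# Single quotes keep the shell from expanding $3; braces make it a valid action.
# NF is the number of fields awk found on each line.
fields=$(awk '{ print NF }' data.txt)

# With only one field per line, $3 is the empty string.
third_field=$(awk '{ print "[" $3 "]" }' data.txt)

printf '%s\n' "$fields" "$third_field"
```

This is why the awk answers here index characters rather than fields.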

Answer 2

cut is probably the simpler/cleaner choice, but here are two alternatives:

AWK version:

awk '{print substr($1, 3, 1) }' <filename>

Python version (print is called with parentheses so the one-liner runs under both Python 2 and Python 3):

python -c 'print("\n".join(map(lambda x: x[2], open("<filename>").readlines())))'

Edit: see 1_CR's comment and disregard this option in favor of his.

Search for a column in a unix file?
