Question

I have these lines in a file:

postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction                                                                     
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction

I need to extract the values in the second column so that I can process them. This is the code I have so far:

pid=$(cat idle_log.txt | cut -d" " -f2)
echo $pid

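But it only gives me 28811 32200 32252. As you can see, 2609 and 2758 are missing from the output, and I want to get those too. I also want to count the PIDs after extracting them. I used: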

npid=$(grep -o " " <<< $pid | grep -c .)

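For the result 28811 32200 32252 it returns 2, but I need it to return 3 as the process count. Finally, I want to process the PIDs one at a time, as in a loop, but the command returns all the results at once, so I cannot handle them one by one.

Thanks for your help.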

Answer 1

$ cat data 
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
$ awk '{print $2}' data 
2609
2758
28811
32200
32252

Or you can use tr to squeeze runs of spaces into a single space, and then use cut like this:

$ tr -s ' ' < data | cut -d ' ' -f 2
2609
2758
28811
32200
32252

Edit:

$ tr -s ' ' < data | cut -d ' ' -f 2 | while read -r line || [[ -n "$line" ]]; do
> echo "$line" #put your custom processing logic here
> done
2609
2758
28811
32200
32252
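
One caveat with the pipeline above: in bash, each part of a pipeline normally runs in a subshell, so variables set inside the while loop are gone once the loop ends. Here is a minimal sketch that keeps a counter alive by feeding the loop through process substitution instead. The temporary sample file and the variable names (data, count, last_pid) are assumptions made for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for the real log file.
data=$(mktemp)
cat > "$data" <<'EOF'
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

count=0
last_pid=""
# Process substitution keeps the loop in the current shell,
# so count and last_pid are still set after "done".
while read -r pid; do
    count=$((count + 1))
    last_pid=$pid
    echo "processing PID $pid"   # put custom per-PID logic here
done < <(awk '{print $2}' "$data")

echo "processed $count PIDs"
rm -f "$data"
```

This needs bash (or another shell with process substitution); a plain POSIX sh would need a temporary file or a here-document instead.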

Answer 2

You can use tr to squeeze the spaces and then cut to get the second space-delimited field:

tr -s ' ' <idle_log.txt | cut -d' ' -f2

Or awk:

awk '{ print $2 }' idle_log.txt

Or sed:

sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' idle_log.txt

Or grep:

grep -Po '^[^\s]+\s+\K[^\s]+' idle_log.txt

To use or count them later, store them in an array:

pids=( $(tr -s ' ' <idle_log.txt | cut -d' ' -f2) )

num_of_pids="${#pids[@]}"

$ printf '%s\n' "${pids[@]}" 
2609
2758
28811
32200
32252

Examples:

$ tr -s ' ' <file.txt | cut -d' ' -f2
2609
2758
28811
32200
32252

$ awk '{ print $2 }' file.txt
2609
2758
28811
32200
32252

$ sed -r 's/^[^[:blank:]]+[[:blank:]]+([^[:blank:]]+)(.*)/\1/' file.txt
2609
2758
28811
32200
32252

$ grep -Po '^[^\s]+\s+\K[^\s]+' file.txt
2609
2758
28811
32200
32252
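
The array assignment shown earlier in this answer relies on word splitting of an unquoted command substitution, which works for plain numeric PIDs. As a hedged alternative, assuming bash 4 or newer, mapfile reads one line per array element without word splitting or globbing. The temporary file and its contents below are sample data for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for idle_log.txt.
log=$(mktemp)
cat > "$log" <<'EOF'
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

# -t strips the trailing newline from each line read.
mapfile -t pids < <(awk '{ print $2 }' "$log")
num_of_pids=${#pids[@]}

echo "number of PIDs: $num_of_pids"
# Quoting "${pids[@]}" yields exactly one loop iteration per element.
for pid in "${pids[@]}"; do
    printf 'PID: %s\n' "$pid"
done
rm -f "$log"
```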

Answer 3

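cut uses the delimiter you pass it literally, so every single space separates two fields. That means that with the delimiter ' ', the first line is split as: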

postgres, <empty>, 2609

and the last one as:

postgres, 32252

You can simplify this by just running awk '{print $2}' idle_log.txt
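
To make the difference concrete, here is a small sketch that runs cut and awk on a shortened copy of the first line. The variable names are chosen for illustration.

```shell
# A shortened copy of the first line: two spaces follow "postgres".
line='postgres  2609 21030  0 12:49 ?'

# cut treats each single space as a separator, so field 2 is the empty
# string between the two spaces and the PID ends up in field 3.
field2=$(printf '%s\n' "$line" | cut -d' ' -f2)
field3=$(printf '%s\n' "$line" | cut -d' ' -f3)

# awk splits on runs of whitespace by default, so $2 is the PID.
awk_field2=$(printf '%s\n' "$line" | awk '{print $2}')

printf 'cut -f2: [%s]\n' "$field2"       # prints: cut -f2: []
printf 'cut -f3: [%s]\n' "$field3"       # prints: cut -f3: [2609]
printf 'awk $2:  [%s]\n' "$awk_field2"   # prints: awk $2:  [2609]
```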

Answer 4

Using grep with Perl regular expressions:

msg

Or,

grep -oP '^[\S]+\s+\K[\S]+' file
2609
2758
28811
32200
32252

Answer 5

I would go with the simplest solution:

pid=$(awk '{print $2}' idle_log.txt)
echo $pid

The regular expressions needed for sed and grep are much less readable in a script, and cut and tr sometimes produce unexpected results.
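
If you also need the process count, awk can produce it directly, so there is no need to count separators in the output. A minimal sketch, assuming each relevant line has at least two fields. The temporary file is sample data for illustration, and npid is the variable name used in the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for idle_log.txt.
log=$(mktemp)
cat > "$log" <<'EOF'
postgres  2609 21030  0 12:49 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres  2758 21030  0 12:51 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 28811 21030  0 09:26 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32200 21030  0 11:40 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
postgres 32252 21030  0 11:41 ?        00:00:00 postgres: postgres postgres [local] idle in transaction
EOF

pid=$(awk '{print $2}' "$log")
# Count lines that actually have a second field; n + 0 prints 0 for empty input.
npid=$(awk 'NF >= 2 {n++} END {print n + 0}' "$log")

echo "$pid"            # quoted, so each PID stays on its own line
echo "count: $npid"
rm -f "$log"
```

Quoting "$pid" in echo preserves the newlines between PIDs, whereas the unquoted form prints them all on one line.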

Answer 6

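As already pointed out, the reason you are not getting the expected result is that you are not actually extracting the second column.

Instead, you are using the command cut -d" " -f2, so you get the second space-delimited field of each line. You can see that the first two lines have an extra space, so for those lines you would have to use cut -d" " -f3. As discussed, though, this is not the right way to get the second column. Use awk '{print $2}' instead.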

Get a string from the lines of a file in bash

6 answers: