Question

I work in a Linux bash environment and have many files to edit, around 900. The file names are listed in a file filename.txt, one name per line. For example:

ab2.pdb.101
ab2.pdb.109
ab2.pdb.126
ab2.pdb.127
ab2.pdb.13
ab2.pdb.187
ab2.pdb.188

The first few lines of each of these files (245 lines in total) are:

REMARK   1                     PDB file generated by ptraj (set    33)
ATOM      1  N   ALA     1      11.304   3.018  20.878  0.1414  1.8240
ATOM      2  H1  ALA     1      11.574   3.686  21.593  0.1997  0.6000
ATOM      3  H2  ALA     1      11.901   3.162  20.074  0.1997  0.6000
ATOM      4  H3  ALA     1      10.342   3.207  20.625  0.1997  0.6000
ATOM      5  CA  ALA     1      11.449   1.637  21.381  0.0962  1.9080
ATOM      6  HA  ALA     1      12.509   1.464  21.561  0.0889  1.1000

I want to replace the last two columns of numbers, from the second line to the end of the file, with 0.0000 0.0000. That is, change

0.1414  1.8240
0.1997  0.6000
0.1997  0.6000
0.1997  0.6000
0.0962  1.9080
0.0889  1.1000

to

0.0000  0.0000
0.0000  0.0000
0.0000  0.0000
0.0000  0.0000
0.0000  0.0000
0.0000  0.0000

So I want to read the file names from a text file called "filenames.txt" and, in each file listed there, replace the last two columns of numbers with 0.0000.

Thanks for your help.

Answer 1

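This code uses head to get the first line, tail to get the rest, cut to keep only the leading columns, paste to add the other columns (these two assume that tabs separate the columns), and yes to generate the replacement columns.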

#! /bin/bash
while IFS= read -r file ; do {
        head -n1 "$file"
        tail -n+2 "$file" | \
            cut -f1-8 | \
            paste - <( yes 0.0000$'\t'0.0000 | \
            head -n $(( $( wc -l < "$file")-1 ))
        )
    }  > "$file".new
done < filenames.txt
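
To make the individual steps easier to follow, here is a minimal sketch that traces them on a tiny sample. The sample file, its contents, and the names workdir, sample, left.tsv and right.tsv are made up for illustration, and the sample is deliberately tab-separated because cut and paste rely on that. Temporary files stand in for the process substitution so the sketch also runs in a plain POSIX shell.

```shell
# Made-up, tab-separated sample; the real files may well use spaces instead.
workdir=$(mktemp -d)
sample="$workdir/sample.tsv"
{
    printf 'REMARK\tsample header\n'
    printf 'ATOM\t1\tN\tALA\t1\t11.304\t3.018\t20.878\t0.1414\t1.8240\n'
    printf 'ATOM\t2\tH1\tALA\t1\t11.574\t3.686\t21.593\t0.1997\t0.6000\n'
} > "$sample"

# One filler row is needed for every line after the header.
rows=$(( $(wc -l < "$sample") - 1 ))

# Step 1: keep only the first eight tab-separated fields of the body.
tail -n +2 "$sample" | cut -f1-8 > "$workdir/left.tsv"

# Step 2: yes repeats its argument forever; head stops it after $rows lines.
yes "$(printf '0.0000\t0.0000')" | head -n "$rows" > "$workdir/right.tsv"

# Step 3: header first, then the two halves glued together line by line.
{
    head -n 1 "$sample"
    paste "$workdir/left.tsv" "$workdir/right.tsv"
} > "$sample.new"

cat "$sample.new"
```

In the result the REMARK line is unchanged and each ATOM line keeps its first eight fields, followed by 0.0000 and 0.0000. If the real files are aligned with spaces rather than tabs, cut -f1-8 will not split them as intended, so this approach would not apply as written.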

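Update: If the structure of the files is more complex, I would use something more comfortable than bash. For example, this is how to do it in Perl: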

#!/usr/bin/perl
use warnings;
use strict;

open my $NAMES, '<', 'filenames.txt' or die $!;
for my $file (<$NAMES>) {
    chomp $file;
    open my $FILE, '<', $file or die $!;
    open my $NEW,  '>', "$file.new" or die $!;
    print {$NEW} scalar <$FILE>;               # print 1st line
    while (<$FILE>) {
        my @fields = split /(\s+)/;            # keep separators
        @fields[-4, -2] = ('0.0000') x 2;      # replace the last two non-whitespace columns
        print {$NEW} @fields;
    }
}

Answer 2

I'm sure there is a better way to specify the tabs between the columns, but it isn't coming to me:

#!/bin/bash

# create a list of the files to edit
declare -a FILES=(
    ab2.pdb.101
    ab2.pdb.109
    ab2.pdb.126
    ab2.pdb.127
    ab2.pdb.13
    ab2.pdb.187
    ab2.pdb.188
)

# iterate over the list
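for FILE in "${FILES[@]}"
do
    NEW="$FILE.new"
    # copy the header line unchanged
    head -n 1 "$FILE" > "$NEW"
    # overwrite the last two fields of every other line; output fields are joined with tabs
    awk 'BEGIN { OFS = "\t" } NR > 1 { if (NF >= 2) { $(NF-1) = "0.0000"; $NF = "0.0000" } print }' "$FILE" >> "$NEW"
done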

Hope this helps.

Actually... I'm not sure whether you also want to keep the first line of each file. If you do, let me know and I'll amend this.

EDITED

Updated to include each file's header line :)

Answer 3

Try this:

#!/bin/bash
for file in $(cat filename.txt);
do
    perl -pi -e 's/\d+(\.\d+)?(\s+)\d+(\.\d+)?$/0.0000${2}0.0000/g' $file
done

Explanation of the regex:

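$ matches the end of the line
\d+(\.\d+)? matches a number with an optional fractional part
(\s+) captures the whitespace so that it is preserved in the replacement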

I know it isn't "pure" Bash, but I hope a single call to Perl is acceptable.
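
To try the pattern on a single line without touching any file, the sketch below applies the same idea with sed. This is an adaptation, not the Perl command itself: sed -E does not understand \d or \s, so the character classes are spelled out, and zero_last_two is a hypothetical helper name. The sample lines are taken from the question.

```shell
# Hypothetical helper: zero the last two numeric columns of each input line.
# \2 is the captured whitespace, so the original spacing is kept.
zero_last_two() {
    sed -E 's/[0-9]+(\.[0-9]+)?([[:space:]]+)[0-9]+(\.[0-9]+)?$/0.0000\20.0000/'
}

atom='ATOM      1  N   ALA     1      11.304   3.018  20.878  0.1414  1.8240'
remark='REMARK   1                     PDB file generated by ptraj (set    33)'

atom_out=$(printf '%s\n' "$atom" | zero_last_two)
remark_out=$(printf '%s\n' "$remark" | zero_last_two)

printf '%s\n' "$atom_out"
printf '%s\n' "$remark_out"
```

The ATOM line comes back with its last two numbers replaced and its spacing intact, while the REMARK line is left alone because it does not end in two numbers. Like the original pattern, this sketch assumes unsigned values; a negative number in either column would keep its minus sign.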

Answer 4

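It's interesting that everyone has a solution in their own language of choice. Personally I would use Perl too, but to add something more to the mix, how about sed and bash?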

#!/bin/bash
function fixfile() {
  #skip the 'REMARK' line and any blank lines, replace other lines
  sed '/^REMARK.*/d' $1 | sed '/^ *$/d' | sed 's/^.*/0.0000  0.0000/' > $1$$
  mv $1$$ $1
}

for fname in `cat filelist`; do
  fixfile $fname
done

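You said "replace the last two column numbers with 0.0000", but your example shows the lines being replaced entirely by a fixed "0.0000 0.0000". Do you mean to keep the lines and replace only the last two columns, or do you really want to replace the whole line?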

Delete columns and insert numbers
