Question

我有多个文件，其数据排列方式如下（列之间用空格隔开（此处用“-”表示））：

数据
2500-OPQ--1000--UVA--XYZ
2501-LMN--1001--VNZ--OPQ
2502-OPQ --- 1002-USS--MNO
2503-LMN --- 1003-PQR--OGD
数据

在特定的行（例如2502）之后，我想移动第3列的内容，以使整个第3列完全对齐，并且文件的其余部分保持原样（第4个数据和第2个数据第五列的字符数可能有所不同）：

数据
2500-OPQ--1000--UVA--XYZ
2501-LMN--1001--VNZ--OPQ
2502-OPQ--1002--USS--MNO
2503-LMN--1003--PQR--OGD
数据

我想为此使用bash或python脚本。

For more clarity, attached please find a figure showing the files I am working on. Left: Original file. Right: Modified file. After line 2997 (see column 2), column 5 is moved to the left, while the rest stays the same.

Answer 1

这将满足您的需求：

public class MyDataGridControl : DataGrid
{
    public string SomeName
    {
        get { return (string)GetValue(SomeNameProperty); }
        set { SetValue(SomeNameProperty, value); }
    }

    public static readonly DependencyProperty SomeNameProperty = 
        DependencyProperty.Register(
            nameof(SomeName), typeof(string), typeof(MyDataGridControl),
            new PropertyMetadata(null));
}

输出：

$ awk -F'-*' '{ if ( $1 > 2501 && $1 ~ /[0-9]+/ ) { print $1"-"$2"--"$3"--"$4"--"$5 } else if($1 ~ /[0-9]+/) { print $0}  }' input.txt

为了完整起见，您有一个Python 3版本：

2500-OPQ--1000--UVA--XYZ
2501-LMN--1001--VNZ--OPQ
2502-OPQ--1002--USS--MNO
2503-LMN--1003--PQR--OGD

致谢！

Answer 2

请尝试：

awk '{printf "%4s%7d  %-3s %5s%4d %s\n", $1, $2, $3, $4, $5, substr($0, 28)}' input_file

其中input_file是从您的图片中提取出来的，看起来像这样：

ATOM   2996  H1  TIP3G 999     -14.190 -28.766  -0.221  0.00  0.00
ATOM   2997  H2  TIP3G 999     -14.390 -29.221  -1.040  0.00  0.00
ATOM   2998  OH2 TIP3G 1000     21.160  21.806  10.603  0.00  0.00
ATOM   2999  H1  TIP3G 1000     21.614  20.972  10.722  0.00  0.00

和输出：

ATOM   2996  H1  TIP3G 999     -14.190 -28.766  -0.221  0.00  0.00
ATOM   2997  H2  TIP3G 999     -14.390 -29.221  -1.040  0.00  0.00
ATOM   2998  OH2 TIP3G1000      21.160  21.806  10.603  0.00  0.00
ATOM   2999  H1  TIP3G1000      21.614  20.972  10.722  0.00  0.00

您可以通过修改printf中的格式字符串来调整列的间距和/或位置。

如果要指定要处理的行，例如all lines after 2502，您可以这样说：

awk 'NR<2502 {print; next} {printf "%4s%7d  %-3s %5s%4d %s\n", $1, $2, $3, $4, $5, substr($0, 28)}' input_file

它输出未触摸2502之前的行，然后重新格式化2502之后的行，尽管我不确定是否需要这样的切换。

Answer 3

awk '{sub(/---/,"--")sub(/1002-|1003-/,"&-")}1' file

data
2500-OPQ--1000--UVA--XYZ
2501-LMN--1001--VNZ--OPQ
2502-OPQ--1002--USS--MNO
2503-LMN--1003--PQR--OGD
data

使用Shell脚本复制和替换文件中特定行的列

3 个答案:

输出：