Question

I have a pipe-delimited log file with the following format:

<date>  <time> | <fruit> | <color> | <num_1> | <num_2> | <num_3>

For example:

2013-03-27  23:01:52 | apple | green | 55 | 120 | 29
2013-03-27  23:01:56 | plumb | purple | 28 | 1 | 394
2013-03-27  23:01:59 | apple | red | 553 | 21 | 7822

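I want to write a Perl script (Python or bash would also be acceptable) that greps out the <date> and <time> fields (column 1) and either <num_1>, <num_2>, or <num_3>, depending on the input you give the script. So running perl extract.pl 2 on the data above would give you <date>, <time>, and <num_2>:

2013-03-27  23:01:52 | 120
2013-03-27  23:01:56 | 1
2013-03-27  23:01:59 | 21

I tried the following, but it doesn't seem to work:

#!/usr/bin/perl
use warnings; use strict;
my $col = $1;
print `grep "myapplog.txt" "m/_(\d{4})(\d\d)(\d\d)/ | $col"`

Here, I set the $col variable to the first argument of the script, and then try to print the grep output that matches the date-time in the first column and the desired column. Any ideas? Thanks in advance.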

Answer 1

Try using perl in awk mode:

$ perl -F'\|' -lane 'print $F[0]," | ", $F[4]' input
2013-03-27  23:01:52  |  120 
2013-03-27  23:01:56  |  1 
2013-03-27  23:01:59  |  21

Pure awk:

awk -F"|" '{print $1, "|", $5}' input

Pure bash:

#!/bin/bash

# Set IFS only for read, so the rest of the script is unaffected
while IFS="|" read -r -a ARRAY
do
    echo "${ARRAY[0]}|${ARRAY[4]}"
done < input

Update

To pass an argument to, for example, the awk solution in order to choose which column to print, use:

$ awk -vcol="5" -F"|" '{print $1, "|", $col}' input

In bash, the first argument to a function or script resides in $1, so use that as the index into ARRAY.

Using Python, a bit more formal than a one-liner:

#!/usr/bin/env python3

import sys

col = input('which column to print? -> ')
try:
    col = int(col)
except ValueError:
    # Stop here: a non-integer cannot be used as a list index below
    sys.exit("That was no integer")

with open("input") as fd:
    for line in fd:
        tmp = line.strip().split('|')
        print(tmp[0], "|", tmp[col])
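
The question asks for an interface where 1, 2, or 3 selects <num_1>, <num_2>, or <num_3>, rather than a raw field index. A minimal sketch of that mapping follows, assuming the six-field layout shown in the question. The function name extract_column is hypothetical and not part of the original answer.

```python
def extract_column(lines, n):
    """Return '<date> <time> | <num_n>' strings for pipe-delimited lines.

    Assumption: each line has six fields, with <num_1>..<num_3> at
    indexes 3..5 after splitting, so <num_n> sits at index n + 2.
    """
    if n not in (1, 2, 3):
        raise ValueError("n must be 1, 2 or 3")
    rows = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        rows.append(fields[0] + " | " + fields[n + 2])
    return rows


# Sample data copied from the question
sample = [
    "2013-03-27  23:01:52 | apple | green | 55 | 120 | 29",
    "2013-03-27  23:01:56 | plumb | purple | 28 | 1 | 394",
    "2013-03-27  23:01:59 | apple | red | 553 | 21 | 7822",
]

for row in extract_column(sample, 2):
    print(row)
```

To drive this from the command line, one option is to pass int(sys.argv[1]) as n and an open file object as lines, since iterating over a file yields its lines.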

Answer 2

Try doing this, using the first argument as you wish (in Perl, use the @ARGV array, not $1):

#!/usr/bin/perl

use warnings; use strict;
use autodie; # No need to check open() errors

$\ = "\n";   # output record separator (no need to append \n to each print)

# file-handle
open my $fh, "<", "myapplog.txt";

chomp(my $col = $ARGV[0]);

die("Not an integer !\n") unless $col =~ /^\d+$/;

# using the famous and magical <diamond> operator:
while (<$fh>) {
    chomp;
    my @F = split /\|/; # splitting current line in @F array
    print join("|", @F[0,$col+2]); # join on an array slice; <num_N> is at index N+2
}

close $fh;

Perl script to configure grep output
