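How to cut every word in a file

Time: 2015-08-18 04:52:31

Tags: regex perl unix cut

Is there a command that converts a tab-separated file so that each word is cut to its first 4 letters?

E.g., turn this file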

Jackal Poorest Kingship Twinkle
Viscount George Lizard
Stone Goose Elephant Yvonne Chicken
Gecko Amoeba
Richard

into this file

Jack Poor King Twin
Visc Geor Liza
Ston Goos Elep Yvon Chic
Geck Amoe
Rich

Thanks

5 answers:

Answer 0 (score: 1)

Use substr to trim each word. Save the following as trim.pl:

#!/usr/bin/env perl

use strict;
use warnings;

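# Read each input line and keep the first four characters of every word.
while (<>) {
    chomp;
    my @words = split ' ';    # awk-style split: ignores leading whitespace
    my @trim;
    for my $word (@words) {
        push @trim, substr($word, 0, 4);
    }
    print join(' ', @trim), "\n";
}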

Run it as:

perl trim.pl names.txt

Which outputs:

Jack Poor King Twin
Visc Geor Liza
Ston Goos Elep Yvon Chic
Geck Amoe
Rich

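Answer 1 (score: 1)

A pattern match is enough to do this:

while (<DATA>)
{
    # Capture up to the first 4 non-space characters of each word;
    # \S* then skips whatever is left of that word.
    my @ar = /(\S{1,4})\S*/g;
    print "@ar\n";
}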
__DATA__
Jackal Poorest Kingship Twinkle
Viscount George Lizard
Stone Goose Elephant Yvonne Chicken
Gecko Amoeba
Richard

Output:
Jack Poor King Twin
Visc Geor Liza
Ston Goos Elep Yvon Chic
Geck Amoe
Rich
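
The same pattern-matching idea can also be sketched as a sed substitution, which has the side effect of leaving the original tabs between words untouched. This is only an illustration, not part of the answer above: the function name first4_sed is made up here, and it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do).

```shell
#!/bin/sh
# first4_sed is a hypothetical helper name, not taken from the answers.
# It truncates every whitespace-delimited word longer than four characters
# to its first four characters; separators (tabs, spaces) are not changed.
first4_sed() {
    # ([^[:space:]]{4}) captures the first four non-space characters,
    # [^[:space:]]* matches the remainder of the word, which is dropped.
    sed -E 's/([^[:space:]]{4})[^[:space:]]*/\1/g' "$@"
}
```

Assuming the input is in names.txt, it would be called as first4_sed names.txt, or used at the end of a pipeline. Words of four characters or fewer do not match the pattern and pass through unchanged.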

Answer 2 (score: 1)

perl -lane '$,=" "; print map substr($_,0,4),@F' input

Answer 3 (score: 1)

Perl from the command line:

perl -anE 'say join " ", map /(.{1,4})/, @F' file.txt

Or, as a script:

use feature 'say';

while (my $line = <>) {
  my @F = split ' ', $line;
  say join " ", map /(.{1,4})/, @F;
}

Answer 4 (score: 1)

A more readable awk version:

awk '{l=sep=""; for(i=1;i<=NF;i++){l = l sep substr($i,1,4); sep=FS}; print l}'
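
Since the question describes the file as tab-separated, a variant that reads and writes tabs explicitly may be closer to what is wanted. The following is a minimal sketch under that assumption; the function name first4_tsv is hypothetical. Note that with a tab as the field separator, a field containing spaces is treated as a single word.

```shell
#!/bin/sh
# first4_tsv is a hypothetical helper name, not taken from the answers.
# It treats the input as tab-separated and keeps the first four
# characters of every field, writing the fields back joined by tabs.
first4_tsv() {
    awk 'BEGIN { FS = OFS = "\t" }   # split on tabs, join with tabs
         {
             # Assigning to $i makes awk rebuild the line using OFS.
             for (i = 1; i <= NF; i++) $i = substr($i, 1, 4)
             print
         }' "$@"
}
```

Assuming the input is in names.txt, it would be run as first4_tsv names.txt.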