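Sorting module subroutines alphabetically

Date: 2014-10-22 05:54:33

Tags: perl

I would like to sort the subroutines in a module alphabetically (I have many subroutines, and I think the file would be easier to edit if they were in sorted order). For example, given A.pm: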

package A;
use warnings;
use strict;

sub subA {
  print "A\n";
}

sub subC {
  print "C\n";
}
sub subB {
  print "B\n";
}

1;

I would like to run sortSub A.pm and get:

package A;
use warnings;
use strict;

sub subA {
  print "A\n";
}
sub subB {
  print "B\n";
}
sub subC {
  print "C\n";
}
1;

Is there any CPAN resource that can help with this task?

2 answers:

Answer 0 (score: 4)

To parse and reformat Perl code, you should use PPI.

This is the tool that Perl::Critic uses to accomplish its feats (Perl::Tidy, by contrast, ships its own tokenizer).

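In this case, I studied the code of PPI::Dumper to learn how to navigate the document tree that PPI returns.

The following parses the source code and separates out the sections that contain subroutines and comments. It ties any comments, POD, and whitespace that precede a subroutine to that subroutine, and then sorts all adjacent subs by name.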

use strict;
use warnings;

use PPI;

my $src = do { local $/; <DATA> };

# Load a document
my $doc = PPI::Document->new( \$src );

# Save Sub locations for later sorting
my @group = ();
my @subs  = ();

for my $i ( 0 .. $#{ $doc->{children} } ) {
    my $child = $doc->{children}[$i];

    my ( $subtype, $subname )
        = $child->isa('PPI::Statement::Sub')
        ? grep { $_->isa('PPI::Token::Word') } @{ $child->{children} }
        : ( '', '' );

    # Look for grouped subs, whitespace and comments.  Sort each group separately.
    my $is_related = ($subtype eq 'sub') || grep { $child->isa("PPI::Token::$_") } qw(Whitespace Comment Pod);

    # State change or end of stream
    if ( my $range = $is_related .. ( !$is_related || ( $i == $#{ $doc->{children} } ) ) ) {
        if ($is_related) {
            push @group, $child;

            if ( $subtype ) {
                push @subs, { name => "$subname", children => [@group] };
                @group = ();
            }
        }

        if ( $range =~ /E/ ) {
            @group = ();

            if (@subs) {
                # Sort and Flatten
                my @sorted = map { @{ $_->{children} } } sort { $a->{name} cmp $b->{name} } @subs;

                # Assign back to document, and then reset group
                my $min_index = $i - $range + 1;
                @{ $doc->{children} }[ $min_index .. $min_index + $#sorted ] = @sorted;

                @subs = ();
            }
        }
    }
}

print $doc->serialize;

1;

__DATA__
package A;
use warnings;
use strict;

=comment
Pod describing subC
=cut
sub subC {
    print "C\n";
}

INIT {
    print "Hello World";
}

sub subB {
    print "B\n";
}

# Hello subA comment
sub subA {
    print "A\n";
}

1;

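Output (subC stays where it is because the INIT block separates it from the group containing subB and subA; only adjacent subs are sorted together):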

package A;
use warnings;
use strict;

=comment
Pod describing subC
=cut
sub subC {
    print "C\n";
}

INIT {
    print "Hello World";
}

# Hello subA comment
sub subA {
    print "A\n";
}

sub subB {
    print "B\n";
}

1;

Answer 1 (score: 1)

First, here is my solution:

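#!/bin/sh
TOKEN=sub

# Join the whole file into one line, marking each newline with a placeholder
gsed -e ':a;N;$!ba;s/\n/__newline__/g' "$1" > "$1.out"
# Start a new line before every token so that each block occupies one line
gsed -i "s/__newline__\\s*$TOKEN\\W/\\n$TOKEN /g" "$1.out"
# Sort the blocks, then restore the original newlines
sort -o "$1.out" "$1.out"
gsed -i 's/__newline__/\n/g' "$1.out"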

Usage: token_sort.sh myfile.pl

Here is how it works:

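  • Replace all newlines with a placeholder, __newline__
  • Break each $TOKEN (in this case, sub) out onto its own line
  • Sort the lines with Unix sort
  • Put all the newlines back
  • You should now have a sorted copy of your file in myfile.pl.out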
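The steps above can also be sketched with POSIX awk in place of gsed. This is a minimal sketch, not the answer's own script: the function name `token_sort` is hypothetical, `LC_ALL=C` is an assumption made so that sort compares plain bytes, and the limitations are the same as before (each block runs from one line starting with `sub` to the next such line, so anything after the last sub travels with it).

```shell
#!/bin/sh
# token_sort FILE: write a copy of FILE to FILE.out with its "sub" blocks sorted.
# Hypothetical helper illustrating the placeholder technique with POSIX awk.
#   1. Join the lines with the placeholder __newline__, but start a fresh
#      output line whenever an input line begins with "sub".
#   2. Sort: every block now occupies exactly one line.
#   3. Turn the placeholders back into real newlines.
token_sort() {
    awk '
        NR == 1            { printf "%s", $0; next }
        /^[ \t]*sub[ \t]/  { printf "\n%s", $0; next }
                           { printf "__newline__%s", $0 }
        END                { if (NR > 0) printf "\n" }
    ' "$1" |
    LC_ALL=C sort |
    awk '{ gsub(/__newline__/, "\n"); print }' > "$1.out"
}
```

After sourcing the function, `token_sort myfile.pl` writes myfile.pl.out next to the input, as the gsed version does. Unlike the gsed version, this sketch leaves the matched `sub` lines untouched, including any indentation.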

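A few caveats:

  • Add a comment such as "#Something" or "#!/usr/bin/env perl" at the top of the file; this ensures that the header block stays sorted at the top.
  • A sorted block runs from the current sub to the start of the next sub, so comments above a sub are sorted along with the previous sub.
  • You need GNU sed for this to work; on a Mac that means running "brew install gnu-sed".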
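Given these caveats, it may be worth checking a sorted copy before it replaces the original. The following is a minimal sketch of two such checks. The function names `check_sub_order` and `same_lines` are hypothetical, and `LC_ALL=C` is assumed so that comparisons use plain byte order.

```shell
#!/bin/sh
# check_sub_order FILE: succeed if the lines that start a sub appear in sorted order.
check_sub_order() {
    grep -E '^[[:space:]]*sub[[:space:]]' "$1" | LC_ALL=C sort -c 2>/dev/null
}

# same_lines FILE1 FILE2: succeed if both files hold the same lines, ignoring order,
# i.e. the sort moved lines around but did not add, drop, or alter any.
same_lines() {
    a=$(LC_ALL=C sort "$1" | cksum)
    b=$(LC_ALL=C sort "$2" | cksum)
    [ "$a" = "$b" ]
}
```

For example, `check_sub_order myfile.pl.out && same_lines myfile.pl myfile.pl.out && echo "looks OK"` prints the message only when both checks pass. Neither check detects a comment that ended up attached to the wrong sub, so a quick read of the diff is still advisable.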