Conditional replacement in a LOG file

Date: 2015-04-09 13:52:11

Tags: regex perl formatting

I'm working on automating some file formatting. My goal is to take a file like this:

12154389,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG,Y,,,
12154390,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG,,,,
12154391,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG,Y,,,
12154392,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG,,,,
12154393,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG,,,,
12154394,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG,Y,,,
12154395,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG,,,,
12154396,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG,,,,
12154397,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG,Y,,,
12154398,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG,Y,,,
12154399,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG,,,,
12154400,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG,,,,

and convert it into the following format:

C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG;

This is the code I'm currently using (I know there has to be a better way to go about this, but I'm struggling with it):

$condition0 = ".*/\\.*\\.*\\.*\\.*\\.*\\.*\\/";
open my $in,  '<',  "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\sample.opt"              or die "Can't read old file: $!";
open my $out, '>', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\working.opt"  or die "Can't write new file: $!";


while( <$in> )
    {

s/.JPG,,,,/.JPG;/g;         #replace if not Y break
s/.JPG;\r/.JPG/g if /$condition0/;  #remove extra new lines
s/JPG,Y,,,\n/JPG,Y,;/g;     #remove ,,,, from non-Y breaks
s/(.*),,C:/C:/g;            #gets rid of image key and ,,


print $out $_;
}

close $out;

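The end goal is to join every line that has no Y break onto the preceding Y-break line, and then replace every trailing ,,,, or ,Y,,, with a semicolon. Each line also starts with an image key followed by two commas, and those are removed as well.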

If you have any suggestions or pointers for me, please let me know. Any help is appreciated!

2 answers:

Answer 0: (score: 1)

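I'm not sure I completely follow your question, but I've put together the following code, which produces your desired output from your input. Let me know if I've missed anything and I'll try to update it.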

use strict;
use warnings;

while (<DATA>) {
        chomp(); #remove new lines
        print "\n" if /,Y,,,$/; #if we have the Y marker then we should be starting on new line
        s/^\d+,,(.*),Y?,,,$/$1;/; #now remove the image key and take just the path, and replace end commas with semi colon.
        print $_; #print the line
}
print "\n";


__DATA__
12154389,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG,Y,,,
12154390,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG,,,,
12154391,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG,Y,,,
12154392,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG,,,,
12154393,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG,,,,
12154394,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG,Y,,,
12154395,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG,,,,
12154396,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG,,,,
12154397,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG,Y,,,
12154398,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG,Y,,,
12154399,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG,,,,
12154400,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG,,,,

This produces the output:

C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG;
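
The same logic can also be wrapped in a small function, which makes it easier to try on other input. This is only a sketch: the name `group_by_y_break` and the shortened paths in the demonstration are made up for illustration and are not from the question. It differs from the loop above in two small ways that are my own assumptions about what is wanted: it does not emit an empty line before the first record, and it strips a trailing carriage return in case the file has Windows line endings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): takes the raw input
# lines and returns the reformatted text as a single string.
sub group_by_y_break {
    my @lines  = @_;
    my $result = '';
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;    # strip an LF or CRLF line ending
        # A Y marker starts a new output line (except for the very first record).
        $result .= "\n" if $result ne '' && $line =~ /,Y,,,$/;
        # Drop the image key, keep only the path, and end it with a semicolon.
        $line =~ s/^\d+,,(.*),Y?,,,$/$1;/;
        $result .= $line;
    }
    $result .= "\n" if $result ne '';
    return $result;
}

# Small demonstration with shortened, made-up paths.
print group_by_y_break(
    "1,,C:\\img\\1.JPG,Y,,,\n",
    "2,,C:\\img\\2.JPG,,,,\n",
    "3,,C:\\img\\3.JPG,Y,,,\n",
);
```

Because the function returns a string instead of printing directly, the caller decides where the text goes, for example `print {$fh} group_by_y_break(@lines);` with whatever filehandle it has opened.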

Answer 1: (score: 0)

@Chris

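I'm trying to modify this so that it reads its input from a file and prints to a separate output file. When I use the following, all of the lines are merged into a single line: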

use strict;
use warnings;

open my $in,  '<',  "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\sample.opt"    or die "Can't read old file: $!";
open my $out, '>', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\working.opt" or die "Can't write new file: $!";

while (<$in>) {
    chomp(); #remove new lines
    print "\n" if /,Y,,,$/; #if we have the Y marker then we should be starting on new line
    s/^\d+,,(.*),Y?,,,$/$1;/; #now remove the image key and take just the path, and replace end commas with semi colon.
    print $out $_; #print the line
}
print "\n";

Any ideas as to why this is happening?