I'm working on automating the formatting of a file. My goal is to take a file like this:
12154389,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG,Y,,,
12154390,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG,,,,
12154391,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG,Y,,,
12154392,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG,,,,
12154393,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG,,,,
12154394,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG,Y,,,
12154395,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG,,,,
12154396,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG,,,,
12154397,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG,Y,,,
12154398,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG,Y,,,
12154399,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG,,,,
12154400,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG,,,,
and convert it into the following format:
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG;
Here is the code I'm using right now (I know there must be a better way to approach this, but I'm struggling with it):
$condition0 = ".*/\\.*\\.*\\.*\\.*\\.*\\.*\\/";
open my $in, '<', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\sample.opt" or die "Can't read old file: $!";
open my $out, '>', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\working.opt" or die "Can't write new file: $!";
while( <$in> )
{
s/.JPG,,,,/.JPG;/g; #replace if not Y break
s/.JPG;\r/.JPG/g if /$condition0/; #remove extra new lines
s/JPG,Y,,,\n/JPG,Y,;/g; #remove ,,,, from non-Y breaks
s/(.*),,C:/C:/g; #gets rid of image key and ,,
print $out $_;
}
close $out;
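The end goal is to pull every line without a Y break up onto the line of the preceding Y break, and then replace the trailing ,,,, or ,Y,,, with a semicolon. Each line also starts with a key and two commas, which are removed as well.
If you have any suggestions or pointers, please let me know. Any help is appreciated!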
Answer 0 (score: 1)
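I'm not sure I follow your question completely, but I've put together the following code, which produces your desired output from your input. Let me know if I've missed anything and I'll try to update it.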
use strict;
use warnings;
while (<DATA>) {
chomp(); #remove new lines
print "\n" if /,Y,,,$/; #if we have the Y marker then we should be starting on new line
s/^\d+,,(.*),Y?,,,$/$1;/; #now remove the image key and take just the path, and replace end commas with semi colon.
print $_; #print the line
}
print "\n";
__DATA__
12154389,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG,Y,,,
12154390,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG,,,,
12154391,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG,Y,,,
12154392,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG,,,,
12154393,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG,,,,
12154394,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG,Y,,,
12154395,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG,,,,
12154396,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG,,,,
12154397,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG,Y,,,
12154398,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG,Y,,,
12154399,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG,,,,
12154400,,C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG,,,,
This produces the output:
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154389.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154390.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154391.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154392.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154393.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154394.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154395.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154396.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154397.JPG;
C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154398.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154399.JPG;C:\Users\bhn\Documents\Scripting\Perl\Sample_Images\12154400.JPG;
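The same idea can also be sketched as a function that returns the grouped lines instead of printing as it goes. This is only a sketch under a few assumptions: every record ends in ,,,, or ,Y,,, and a Y marker always starts a new output line. The name group_records and the short sample paths are hypothetical and are not part of the answer above. Collecting groups in an array also avoids the empty first line that printing "\n" before the first Y record would emit.

```perl
use strict;
use warnings;

# Hypothetical helper: turn raw records into one output line per Y-marked group.
sub group_records {
    my @records = @_;    # copy, so the caller's data is not modified
    my @groups;
    for my $line (@records) {
        $line =~ s/\r?\n\z//;                    # strip an LF or CRLF line ending
        my $starts_group = $line =~ /,Y,,,$/;    # a Y marker opens a new group
        # Keep only the path and end it with ';'; skip lines that are not records.
        next unless $line =~ s/^\d+,,(.*),Y?,,,$/$1;/;
        if ($starts_group || !@groups) {
            push @groups, $line;     # start a new output line
        }
        else {
            $groups[-1] .= $line;    # append to the current output line
        }
    }
    return @groups;
}

# Shortened, made-up sample records (assumption: same layout as the real file).
my @sample = (
    "1,,C:\\img\\1.JPG,Y,,,\n",
    "2,,C:\\img\\2.JPG,,,,\n",
    "3,,C:\\img\\3.JPG,Y,,,\n",
);
print "$_\n" for group_records(@sample);
```

With the three sample records, the first two paths end up on one line and the third starts a line of its own.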
Answer 1 (score: 0)
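@Chris
I'm trying to modify this so that the input is read from a file and the output is printed to a separate output file. When I use the following, all the lines are merged into a single line: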
use strict;
use warnings;
open my $in, '<', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\sample.opt" or die "Can't read old file: $!";
open my $out, '>', "C:\\Users\\bhn\\Documents\\Scripting\\Perl\\working.opt" or die "Can't write new file: $!";
while (<$in>) {
chomp(); #remove new lines
print "\n" if /,Y,,,$/; #if we have the Y marker then we should be starting on new line
s/^\d+,,(.*),Y?,,,$/$1;/; #now remove the image key and take just the path, and replace end commas with semi colon.
print $out $_; #print the line
}
print "\n";
Any ideas as to why this is happening?
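One likely cause: the two print "\n" statements name no filehandle, so they write to the currently selected handle (STDOUT by default) and the newlines never reach working.opt. Only print $out $_ writes to the file, and by then chomp has already removed each line's own newline. A minimal sketch of the fix follows, assuming that is the only problem. The helper name convert is hypothetical, and in-memory filehandles with made-up short paths stand in for sample.opt and working.opt so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from $in and write grouped paths to $out.
sub convert {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;                                  # remove the newline
        print {$out} "\n" if $line =~ /,Y,,,$/;       # new output line, sent to $out
        $line =~ s/^\d+,,(.*),Y?,,,$/$1;/;            # keep only the path, end with ';'
        print {$out} $line;                           # write the path to $out
    }
    print {$out} "\n";                                # final newline also goes to $out
}

# In-memory stand-ins for the real input and output files (assumption).
my $input  = "1,,C:\\img\\1.JPG,Y,,,\n2,,C:\\img\\2.JPG,,,,\n3,,C:\\img\\3.JPG,Y,,,\n";
my $output = '';
open my $in,  '<', \$input  or die "Can't read input: $!";
open my $out, '>', \$output or die "Can't write output: $!";
convert($in, $out);
close $out;
print $output;
```

With real files, the two open calls from the post can be kept unchanged and the handles passed to convert. As in the original answer, the output still begins with an empty line whenever the first record carries a Y marker.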