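The details of what the script is doing are not important, but I have commented the parts that matter. I only care about why blank lines are appearing in my output.
When I run the command
./script.pl temp temp.txt tempF `wc -l temp | awk '{print $1}'`
where the file temp contains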
1 27800000 120700000 4
1 27800000 124300000 4
1 154800000 247249719 3
3 32100000 71800000 9
3 32100000 87200000 2
3 54400000 74200000 15
4 76500000 155100000 20
4 76500000 182600000 3
4 76500000 88200000 77
4 88200000 124000000 2
5 58900000 180857866 8
5 58900000 76400000 2
5 58900000 97300000 4
5 76400000 143100000 14
5 97300000 147200000 6
6 7000000 29900000 2
6 63500000 70000000 73
6 63500000 92100000 4
6 70000000 113900000 70
6 70000000 139100000 57
6 92100000 113900000 3
I get output of the following form
hs1 27800000 124300000 4
hs3 32100000 87200000 2
hs3 54400000 74200000 15
hs4 76500000 182600000 3
hs4 76500000 88200000 77
hs4 88200000 124000000 2
hs5 58900000 76400000 2
hs5 58900000 97300000 4
hs5 76400000 143100000 14
hs5 97300000 147200000 6
hs6 63500000 92100000 4
hs6 70000000 139100000 57
hs6 92100000 113900000 3
on standard output. (About 8 lines are also printed to the temp.txt file, but those lines are formatted correctly.)
Here is the script:
#!/usr/bin/perl
# ARGV[0] is the name of the file which data will be read from(may have overlaps)
# ARGV[1] is the name of the file which will be produced that will have no overlaps
# ARGV[2] is the name of the folder which will hold all the data
# ARGV[3] is the number of lines that ARGV[0] will contain
use warnings;
my $file = "./$ARGV[0]";
my @lines = do {
    open my $fh, '<', $file or die "Can't open $file -- $!";
    <$fh>;
};
my $file2 = "./$ARGV[2]/$ARGV[1]";
open( my $files, ">", "$file2" ) or die "Can't open > $file2: $!";
my $i = 0;
while ( $i < $ARGV[3] - 1 ) {
    my @ref_fields = split( '\s+', $lines[$i] );
    print $files
        "$ref_fields[0]", "\t",
        $ref_fields[1], "\t",
        $ref_fields[2], "\t",
        $ref_fields[3], "\n";
    for my $j ( $i + 1 .. $ARGV[3] - 1 ) {
        $i = $j;
        # @curr_fields is initialized here
        my @curr_fields = split /\s+/, $lines[$j];
        if ( $ref_fields[0] eq $curr_fields[0] && $ref_fields[2] > $curr_fields[1] ) {
            if ( defined( $curr_fields[0] ) && $curr_fields[0] !~ /\s+/ ) {
                chomp $curr_fields[3];
                # the line below is the one that is printing to standard output
                print
                    $curr_fields[0], "\t",
                    $curr_fields[1], "\t",
                    $curr_fields[2], "\t",
                    $curr_fields[3], "\n";
            }
        }
        else {
            last;
        }
    }
    print "\n";
}
Edit:
I found an error when running the script from the posted answer. When I run the command
./script.pl temp1 temp10.txt folder
where temp1 contains
12 58100000 96200000 0.04348
3 74200000 87200000 0.04348
5 130600000 168500000 0.04348
6 61000000 114600000 0.04348
6 75900000 114600000 0.04348
6 88000000 114600000 0.04348
6 88000000 139000000 0.04348
6 93100000 161000000 0.04348
6 105500000 139000000 0.04348
6 130300000 139000000 0.04348
7 59900000 77500000 0.04348
7 98000000 132600000 0.04348
X 67800000 76000000 0.08696
Y 28800000 59373566 0.04348
I get
6 75900000 114600000 0.04348
6 88000000 114600000 0.04348
6 88000000 139000000 0.04348
6 93100000 161000000 0.04348
6 105500000 139000000 0.04348
and temp10.txt contains
12 58100000 96200000 0.04348
3 74200000 87200000 0.04348
5 130600000 168500000 0.04348
6 61000000 114600000 0.04348
6 130300000 139000000 0.04348
7 59900000 77500000 0.04348
7 98000000 132600000 0.04348
X 67800000 76000000 0.08696
The line
Y 28800000 59373566 0.04348
is neither in the output nor in temp10.txt. It seems to have disappeared, but it should have been printed to one of them.
Answer 0 (score: 2)
Clearly the blank lines are printed because you have the line
print "\n";
in your code.
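I can't help much more than that, because you say "the details of what the script is doing are not important", so we don't know what it is meant to be doing.
However, what you have written prints a line from the input file as long as its first column matches the first column of the reference line (the line most recently written to the output file) and its second column is less than the third column of that reference line. Every time you reach a line that doesn't fit this pattern, you print a blank line.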
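To make that rule concrete, here is a minimal sketch of the same loop with the file handling stripped out. The sub name overlap_report and the idea of returning the lines instead of printing them are assumptions made purely for illustration; they are not part of the original script. It also assumes the input contains no blank lines, so the defined check is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns what the loop
# would send to STDOUT, one element per printed line. An empty string
# stands for a blank line.
sub overlap_report {
    my @rows = map { [ split ' ', $_ ] } @_;    # split each line once
    my @out;
    my $i = 0;
    while ( $i < $#rows ) {
        my $ref = $rows[$i];    # reference line (the script writes it to the file)
        for my $j ( $i + 1 .. $#rows ) {
            $i = $j;
            my $cur = $rows[$j];
            # stop at the first line that does not overlap the reference line
            last unless $ref->[0] eq $cur->[0] && $ref->[2] > $cur->[1];
            push @out, join "\t", @$cur;
        }
        push @out, "";          # the unconditional print "\n" in the script
    }
    return @out;
}

# First four lines of the sample input from the question
my @sample = (
    "1 27800000 120700000 4",
    "1 27800000 124300000 4",
    "1 154800000 247249719 3",
    "3 32100000 71800000 9",
);
print "$_\n" for overlap_report(@sample);
```

With those four sample lines, this prints one data line followed by two blank lines: the separator is emitted once per reference line, whether or not any overlapping line was printed for it.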
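You may prefer the refactoring of your code below, which behaves identically but is, I think, more readable. It also has the advantage of splitting each line of the input file only once, and it doesn't need the fourth parameter, because the number of lines is just the size of the @lines array. Blank lines are removed as the file is read, so there is no longer any need to check whether the first field is defined.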