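I want to write a Perl script that combines columns from multiple files. I have to work within a fixed folder/file structure. I'll try to show what I have and what I want. I have two folders, each containing a bunch of files. The files in each folder have the same names.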
Folder1:File1,File2,File3,...
Folder2:File1,File2,File3,...
The content of Folder1:File1 looks like this (tab-separated):
aaaaa 233
bbbbb 34
ccccc 853
...
All the other files look the same, except that the numeric values differ. I want to create a single file (a report) that looks like this:
aaaaa value_Folder1:File1 value_Folder2:File1 value_Folder1:File2 value_Folder2:File2 ...
...
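It would be nice to put the file name above the column its values come from (just the file name; the folder doesn't matter).
I have some code in progress, but it doesn't do what I want yet! I tried to make it work with loops, but I have a feeling that may not be the right approach... Another problem is that I don't know how to add columns to my report file. In the code below, I just append the values to the end of the file. Even if it isn't great, here is my code: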
#!/usr/bin/perl -w
use strict;
use warnings;
my $outputfile = "/home/duceppemo/Desktop/count.all.txt";
my $queryDir = "/home/duceppemo/Desktop/query_count/";
my $hitDir = "/home/duceppemo/Desktop/hit_count/";
opendir (DIR, "$queryDir") or die "Error opening $queryDir: $!"; #Open the directory containing the files with sequences to look for
my @queryFileNames = readdir (DIR);
opendir (DIR, "$hitDir") or die "Error opening $hitDir: $!"; #Open the directory containing the files with sequences to look for
my @hitFileNames = readdir (DIR);
my $index = 0;
$index ++ until $queryFileNames[$index] eq ".";
splice(@queryFileNames, $index, 1);
$index = 0;
$index ++ until $queryFileNames[$index] eq "..";
splice(@queryFileNames, $index, 1);
$index = 0;
$index ++ until $hitFileNames[$index] eq ".";
splice(@hitFileNames, $index, 1);
$index = 0;
$index ++ until $hitFileNames[$index] eq "..";
splice(@hitFileNames, $index, 1);
#counter for query file number opened
my $i = 0;
foreach my $queryFile (@queryFileNames) #adjust the file name according to the subdirectory
{
$i += 1; #keep track of the file number opened
$queryFile = $queryDir . $queryFile;
open (QUERY, "$queryFile") or die "Error opening $queryFile: $!";
my @query = <QUERY>; #Put the query sequences from the count file into an array
close (QUERY);
my $line = 0;
open (RESULT, ">>$outputfile") or die "Error opening $outputfile: $!";
foreach my $lineQuery (@query) #look into the query file
{
my @columns = split(/\s+/, $lineQuery); #Split each line into a new array, when it meets a whitespace character (including tab)
if ($i == 1)
{
#open (RESULT, ">>$outputfile") or die "Error opening $outputfile: $!";
print RESULT "$columns[0]\t";
print RESULT "$columns[1]\n";
#close (RESULT);
$line += 1;
}
else
{
open (RESULT, ">>$outputfile") or die "Error opening $outputfile: $!";
print RESULT "$columns[1]\n";
close (RESULT);
$line += 1;
}
}
$line = 0;
}
close (RESULT);
closedir (DIR);
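P.S. Any other suggestions for optimizing the code would be much appreciated!
Answer 0 (score: 1)
The main problem is that you don't seem to understand what a FILEHANDLE is. You should read up on that.
A filehandle is a kind of reference to an open file and, since everything is a file, it can also refer to a command or a directory.
When you write opendir(DIR, ...), "DIR" is not a keyword but a handle that can have any name. That means your two opendir() calls use the same handle, which makes no sense.
It should look more like this: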
opendir(QDIR, $queryDir) or die "Error opening $queryDir: $!";
my @queryFileNames = readdir(QDIR);
opendir(HDIR, $hitDir) or die "Error opening $hitDir: $!";
my @hitFileNames = readdir(HDIR);
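As a variation on the same idea, a lexical handle (`my $dh`) sidesteps the name clash entirely, because each call gets its own handle. The sketch below is only one way to do it: `list_files` is a made-up helper name, and it assumes you only want plain files, which also drops the "." and ".." entries without the index/splice dance.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names of the plain files in $dir.
# "." and ".." are directories, so the -f test filters them out.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Error opening $dir: $!";
    my @names = sort grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @names;
}
```

With that in place, something like `my @queryFileNames = list_files($queryDir);` would replace the opendir/readdir/splice block for one directory.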
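Also, since you should always close every handle you open, call close() at the same nesting level as the matching open(), and make sure it actually gets called.
For example, opening the RESULT filehandle inside the loop and closing it after the loop makes no sense... how many times do you open it without closing it?
You probably want to open it once before the loop, rather than opening the same filehandle twice...
In general, avoid opening and closing files inside a loop. Just open before the loop and close after it.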
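To make the "open before, close after" pattern concrete, here is a minimal sketch. `concat_files` and its parameters are names invented for illustration, and it simply copies lines rather than building your report: the output handle is opened once, each input is opened and closed at the same level inside the loop, and the output is closed once at the end.

```perl
use strict;
use warnings;

# Hypothetical helper: copy every line of every input file into $out_path.
# The output handle is opened once before the loop and closed once after it.
sub concat_files {
    my ($out_path, @in_paths) = @_;
    open(my $out, '>', $out_path) or die "Error opening $out_path: $!";
    for my $in_path (@in_paths) {
        # Each input is opened and closed at the same nesting level.
        open(my $in, '<', $in_path) or die "Error opening $in_path: $!";
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close($in);
    }
    close($out) or die "Error closing $out_path: $!";
    return;
}
```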
Answer 1 (score: 0)
This code does what I want:
#!/usr/bin/perl
use strict;
use warnings;
#my $queryDir = $ARGV[0]; #alternative: take the directory from the command line
my $queryDir = "C:/Users/Marco/Desktop/query_count/";
opendir (DIR1, "$queryDir") or die "Error opening $queryDir: $!"; #Open the directory containing the files with sequences to look for
my @queryFileName = readdir (DIR1);
#my $hitDir = $ARGV[1]; #alternative: take the directory from the command line
my $hitDir = "C:/Users/Marco/Desktop/hit_count/";
opendir (DIR2, "$hitDir") or die "Error opening $hitDir: $!"; #Open the directory containing the files with sequences to look for
my @hitFileName = readdir (DIR2);
my $index = 0;
$index ++ until $queryFileName[$index] eq ".";
splice(@queryFileName, $index, 1);
$index = 0;
$index ++ until $queryFileName[$index] eq "..";
splice(@queryFileName, $index, 1);
$index = 0;
$index ++ until $hitFileName[$index] eq ".";
splice(@hitFileName, $index, 1);
$index = 0;
$index ++ until $hitFileName[$index] eq "..";
splice(@hitFileName, $index, 1);
foreach my $queryFile (@queryFileName) #adjust the queryFileName according to the subdirectory
{
$queryFile = "$queryDir" . $queryFile;
}
foreach my $hitFile (@hitFileName) #adjust the hitFileName according to the subdirectory
{
$hitFile = "$hitDir" . $hitFile;
}
my $outputfile = "C:/Users/Marco/Desktop/out.txt";
my %hash;
foreach my $queryFile (@queryFileName)
{
my $i = 0;
open (QUERY, "$queryFile") or die "Error opening $queryFile: $!";
while (<QUERY>)
{
chomp;
my $val = (split /\t/)[1];
$i++;
$hash{$i}{$queryFile} = $val;
}
close (QUERY);
}
foreach my $hitFile (@hitFileName)
{
my $i = 0;
open (HIT, "$hitFile") or die "Error opening $hitFile: $!";
while (<HIT>)
{
chomp;
my $val = (split /\t/)[1];
$i++;
$hash{$i}{$hitFile} = $val;
}
close (HIT);
}
open (RESULT, ">$outputfile") or die "Error opening $outputfile: $!"; #overwrite instead of append, so re-running does not stack a second report in the same file
foreach my $qfile (@queryFileName)
{
print RESULT "\t$qfile";
}
foreach my $hfile (@hitFileName)
{
print RESULT "\t$hfile";
}
print RESULT "\n";
foreach my $id (sort { $a <=> $b } keys %hash) #keys are line numbers: sort numerically, otherwise 10 comes before 2
{
print RESULT "$id\t";
print RESULT "$hash{$id}{$_}\t" foreach (@queryFileName, @hitFileName);
print RESULT "\n";
}
close (RESULT);
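Note that the code above labels each row with its line number, not with the first-column ID from the question (aaaaa, bbbbb, ...), and it lists all query columns before all hit columns. A possible sketch that keys the rows on the ID and interleaves the two folders per file name is shown below. `load_column` and `build_report` are invented names, and the sketch assumes every file is "ID<TAB>value", that both folders contain the same file names, and that a missing value should be shown as "NA".

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical helper: read "id<TAB>value" lines from $path into
# $table->{id}{column}, using $path itself as the column key.
sub load_column {
    my ($table, $path) = @_;
    open(my $in, '<', $path) or die "Error opening $path: $!";
    while (my $line = <$in>) {
        chomp $line;
        next if $line eq '';
        my ($id, $val) = split /\t/, $line;
        $table->{$id}{$path} = $val;
    }
    close($in);
    return;
}

# Hypothetical helper: return the report as a string. Columns alternate
# between the two folders for each file name; the header shows file names only.
sub build_report {
    my ($query_dir, $hit_dir, @names) = @_;
    my (%table, @cols);
    for my $name (@names) {
        for my $dir ($query_dir, $hit_dir) {
            my $path = "$dir/$name";
            push @cols, $path;
            load_column(\%table, $path);
        }
    }
    my $report = join("\t", 'id', map { basename($_) } @cols) . "\n";
    for my $id (sort keys %table) {
        # "NA" marks an id that is absent from one of the files (an assumption).
        $report .= join("\t", $id, map { $table{$id}{$_} // 'NA' } @cols) . "\n";
    }
    return $report;
}
```

The string returned by `build_report` can then be printed to a single output handle that is opened once, after all the input files have been read.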