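I have a directory containing several thousand .txt files. I need to run the same Perl script on each .txt file and give the result for each file a unique name. Please excuse the basic question; this script is my first attempt at learning Perl.
I have looked at other posts that address this problem: How can I loop through files in a directory in Perl? and, for running the loop from the terminal, Take all files in dir and for each file do the same perl procedure.
About my data: these are blastx subject sequence ID results.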
$ head file1.txt
GCN2_SCHPO
GCN2_YEAST
GCN20_YEAST
GCNK_GLUOX
$ head file2.txt
PDXA_RUEST
PDXA_SULSY
PDXA_SYNFM
PDXA_SYNY3
My Perl script uses UniProt's Retrieve/ID mapping service programmatically, instead of entering thousands of requests by hand:
use warnings;
use LWP::UserAgent;
@files = <*.txt>; # Files containing lists of UniProt IDs.
my $base = 'http://www.uniprot.org';
my $tool = 'uploadlists';
my $contact = ''; # Please set your email address here to help us debug in
                  # case of problems.
my $agent = LWP::UserAgent->new(agent => "libwww-perl $contact");
push @{$agent->requests_redirectable}, 'POST';
foreach $file (@files) {
    my $response = $agent->post("$base/$tool/",
                                [ 'file' => [@files],
                                  'format' => 'tab',
                                  'from' => 'ACC+ID',
                                  'to' => 'ACC',
                                  'columns' => 'id,database(ko)',
                                ],
                                'Content_Type' => 'form-data');
    while (my $wait = $response->header('Retry-After')) {
        print STDERR "Waiting ($wait)...\n";
        sleep $wait;
        $response = $agent->get($response->base);
    }
    $response->is_success ?
        print $response->content :
        die 'Failed, got ' . $response->status_line .
            ' for ' . $response->request->uri . "\n";
    print $file . "\n";
}
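Instead of looping over each .txt file, this script grabs only the first .txt file in my directory and runs the request on that file over and over. At the end of each pass, however, it prints the correct file name. Here is a sample of the output: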
Entry Cross-reference (ko) yourlist:M20170501A isomap:M201705
Q9HGN1 K16196; GCN2_SCHPO
P15442 K16196; GCN2_YEAST
P43535 K06158; GCN20_YEAST
Q5FQ97 K00851; GCNK_GLUOX
file1.txt
Entry Cross-reference (ko) yourlist:M20170501A isomap:M201705
Q9HGN1 K16196; GCN2_SCHPO
P15442 K16196; GCN2_YEAST
P43535 K06158; GCN20_YEAST
Q5FQ97 K00851; GCNK_GLUOX
file2.txt
I also tried the following loop from the terminal:
for i in *; do perl script.pl $i $i.txt; done
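I got the same result.
I am missing something very simple, and I would like to understand why the loop behaves this way. Second, is there a way to code this (in the script or from the terminal) so that the result for each .txt file gets a different name?
Thank you!
Answer 0 (score: 1)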
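Your for loop, foreach $file (@files) { ... }, executes the block that follows repeatedly, setting $file to each file name in turn. Inside the loop, however, you pass the parameter 'file' => [@files]. LWP treats that list as a file path, then a file name, then multiple header names and values, so the uploaded data always comes from the first element of @files.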
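The positional reading described above can be illustrated in plain Perl. This is only a sketch: describe_upload is a hypothetical helper that unpacks a list the way the paragraph describes, not LWP's own code, and the file names are made up.

```perl
use strict;
use warnings;

# Hypothetical file list standing in for the result of glob '*.txt'.
my @files = ('file1.txt', 'file2.txt', 'file3.txt', 'file4.txt');

# Illustration only (not LWP code): unpack an upload spec by position,
# as described above: path first, then file name, then header pairs.
sub describe_upload {
    my ($path, $filename, @headers) = @_;
    return {
        path     => $path,
        filename => $filename,
        headers  => [@headers],
    };
}

for my $file (@files) {
    my $buggy = describe_upload(@files);    # the same whole list on every pass
    my $fixed = describe_upload($file);     # only the current file

    printf "%s: [\@files] reads %s, [ \$file ] reads %s\n",
        $file, $buggy->{path}, $fixed->{path};
}
```

With the whole list, the path slot holds the first name on every iteration, which matches the repeated output shown in the question.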
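The quick fix is to replace that line with file => [ $file ], and then it should work. Your code has some other problems, though, so I wrote this refactoring.
I can't test it at the moment, but it does compile.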
use strict;
use warnings 'all';

use LWP::UserAgent;

my @files = glob '*.txt';    # Files containing lists of UniProt IDs.

my $base    = 'http://www.uniprot.org';
my $tool    = 'uploadlists';
my $contact = '';    # Please set your email address here
                     # to help us debug in case of problems.

my $agent = LWP::UserAgent->new(agent => "libwww-perl $contact");
push @{$agent->requests_redirectable}, 'POST';

for my $file ( @files ) {

    # Upload one file per request: [ $file ] names the file to send.
    my $response = $agent->post(
        "$base/$tool/",
        Content_Type => 'form-data',
        Content      => [
            file    => [ $file ],
            format  => 'tab',
            from    => 'ACC+ID',
            to      => 'ACC',
            columns => 'id,database(ko)',
        ],
    );

    # Keep polling for as long as the server sends a Retry-After header.
    while ( my $wait = $response->header('Retry-After') ) {
        print STDERR "Waiting ($wait)...\n";
        sleep $wait;
        $response = $agent->get($response->base);
    }

    if ( $response->is_success ) {
        print $response->content;
    }
    else {
        # Status line first, then the request URI.
        die sprintf "Failed. Got %s for %s\n",
            $response->status_line,
            $response->request->uri;
    }
}
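The question also asks how to give each result its own name. One possible approach, sketched here under assumptions: output_name_for and save_result are hypothetical helpers, and the .mapped.tsv suffix is an arbitrary choice, picked so that a later glob '*.txt' does not pick up the results as new input.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: map an input name such as file1.txt to a distinct
# output name such as file1.mapped.tsv in the same directory.
sub output_name_for {
    my ($input) = @_;
    my ($name, $dir) = fileparse($input, qr/\.txt\z/);
    return "${dir}${name}.mapped.tsv";
}

# Hypothetical helper: write one response body to its own file and
# return the name of the file that was written.
sub save_result {
    my ($input, $content) = @_;
    my $out = output_name_for($input);
    open my $fh, '>', $out or die "Cannot write $out: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $out: $!";
    return $out;
}
```

Inside the loop, print $response->content; would then become save_result($file, $response->content);. The shell loop in the question could not change the outcome because the script never reads @ARGV; it always builds its own file list with glob.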