I have a list of file names like this:
T0rain.Zfp691_0895.2_v2_deBruijn.txt
Train.Hbp1_2241.2_v2_deBruijn.txt
Train.Zfp740_0925.2_v2_deBruijn.txt
Train.Hbp1_2241.3_v1_deBruijn.txt
Train.Zfp740_0925.3_v1_deBruijn.txt
Train.Hic1_2816.2_v1_deBruijn.txt
Train.Zic1_0991.2_v1_deBruijn.txt
I want to extract all the names between "Train." and "_", like this:
Zfp691
Hbp1
Zfp740
Hbp1
zfp740
Hic1
Zic1
I also have another list of files:
Zfp691.pwm.txt
Hbp1.pwm.txt
Zfp740.pwm.txt
Hbp1.pwm.txt
zfp740.pwm.txt
Hic1.pwm.txt
Zic1.pwm.txt
Zic1.pwm.RC.txt
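I want to extract all the matching cases, for example:
Train.Zic1_0991.2_v1_deBruijn.txt matches Zic1.pwm.txt and Zic1.pwm.RC.txt
These matches are passed as arguments to my R script, so the sh
script should return arguments =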
$i (Train.Zic1_0991.2_v1_deBruijn.txt) + $j Zic1.pwm.txt
$i (Train.Zic1_0991.2_v1_deBruijn.txt) + $j Zic1.pwm.RC.txt
I don't know whether this is feasible. I started by trying this:
#!/bin/bash
for i in input/*/testtrain/Train*deBruijn.txt
do
$i
done
for j in input/All_PWMs/*/*.txt
do
$j
done
echo qsub script3.sh $i $j
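Here I try to give script3.sh its arguments, but this only returns 1 combination. Does anyone have a hint or tip? For example, how to match/grep these names, or a different way of passing the arguments.
Script3.sh is used to call R from the Linux command line, so the args are only passed through this file to launch an R job with one deBruijn and pwm combination.
The R script needs 1 debruijn.txt and 1 pwm.txt to compute the values I need. So for this example I would get 2 combinations: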
debruijn.txt and pwm1.txt -----> pass the args to R as combination 1
debruijn.txt and pwm2.txt -----> pass the args to R as combination 2
Answer 0 (score: 1)
Perl:
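#!/usr/bin/perl
use strict;
use warnings;

# Every file whose name has a "." followed later by a "_" (the deBruijn files)
my @files = glob("*.*_*");
foreach my $f (@files) {
    # Capture the name between the first "." and the next "_"
    next unless $f =~ /^[^.]+\.([^_]+)_/;
    my $pre = $1;
    # All files whose names start with that name (the matching PWM files)
    my @f2 = glob("$pre*");
    print "$f found files " . join(" ", @f2) . "\n";
    # Pass the deBruijn file and all of its matches to the script in one call
    system("./script.sh", $f, @f2) == 0 or die "script.sh failed: $?";
}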
This gives the output:
T0rain.Zfp691_0895.2_v2_deBruijn.txt found files Zfp691.pwm.txt
Train.Hbp1_2241.2_v2_deBruijn.txt found files Hbp1.pwm.txt
Train.Hbp1_2241.3_v1_deBruijn.txt found files Hbp1.pwm.txt
Train.Hic1_2816.2_v1_deBruijn.txt found files Hic1.pwm.txt
Train.Zfp740_0925.2_v2_deBruijn.txt found files Zfp740.pwm.txt
Train.Zfp740_0925.3_v1_deBruijn.txt found files Zfp740.pwm.txt
Train.Zic1_0991.2_v1_deBruijn.txt found files Zic1.pwm.RC.txt Zic1.pwm.txt
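If you would rather stay in bash, the name extraction done by the regular expression can be sketched with parameter expansion alone. This is only a minimal sketch; the function name extract_name is hypothetical and not part of the question or of the Perl script.

```bash
#!/bin/bash
# Hypothetical helper: print the part of a file name between the
# first "." and the next "_".
extract_name() {
    local base=${1##*/}            # strip any leading directory
    local rest=${base#*.}          # drop everything up to and including the first "."
    printf '%s\n' "${rest%%_*}"    # drop everything from the first "_" onward
}

extract_name "Train.Zic1_0991.2_v1_deBruijn.txt"   # prints Zic1
```

No external grep or sed call is needed, which keeps this cheap inside a loop over many files.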
My "script.sh" is:
#!/bin/sh
echo Script got $0 $1 $2 $3
Make sure to chmod 755 all the scripts, etc.
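Note that the Perl script hands all matching PWM files to script.sh in a single call, whereas the question asks for one deBruijn/pwm pair per job. A pure bash alternative with one call per pair could look roughly like the sketch below. The function name print_jobs and its two directory arguments are assumptions made for illustration, and the command is only echoed, as in the question, so nothing is submitted.

```bash
#!/bin/bash
# Hypothetical helper: print one "qsub script3.sh <deBruijn> <pwm>" line
# for every matching pair.
#   $1 = directory holding the Train*deBruijn.txt files (assumed layout)
#   $2 = directory holding the <name>.pwm*.txt files    (assumed layout)
print_jobs() {
    local train_dir=$1 pwm_dir=$2 i j name
    for i in "$train_dir"/Train*deBruijn.txt; do
        [ -e "$i" ] || continue      # the glob matched nothing
        name=${i##*/}                # strip the directory
        name=${name#*.}              # drop "Train."
        name=${name%%_*}             # keep the text before the first "_"
        # The inner loop must be nested, so that every pwm file is paired
        # with the current deBruijn file.
        for j in "$pwm_dir/$name".pwm*.txt; do
            [ -e "$j" ] || continue  # no pwm file for this name
            echo qsub script3.sh "$i" "$j"
        done
    done
}

# Default to the current directory when no arguments are given.
print_jobs "${1:-.}" "${2:-.}"
```

The attempt in the question printed only one combination because the echo sits after both loops have finished, so it sees only the last values of $i and $j. Matching here is case-sensitive, so a file called zfp740.pwm.txt would not be paired with Train.Zfp740_... The globs would also need adjusting for the nested input/*/testtrain and input/All_PWMs/*/ layout used in the question.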