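I have multiple files. My script searches each file for the sequence name and the sequence. If they are found, the format is changed from gb to fasta: only the sequence name and the sequence are kept and written back to the file. Sometimes, however, a file does not contain a sequence name. In that case I write nothing and the file ends up empty. Such a file should be deleted, because at the end of my script a multifasta is built from all of these files.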
    # Find all gb files
    $files = glob("*.gb");
    foreach ($files as $filename) {
        $newname = basename($filename, ".gb") . ".fasta";
        rename($filename, $newname);
        $condition = false;
        # Reset the header and the sequence buffer for every file
        $final = "";
        $out = "";
        $lines = file($newname);
        foreach ($lines as $line) {
            if (strstr($line, "ACCESSION")) {
                # Find the line containing the sequence name
                $head = trim(str_replace("ACCESSION", "", $line));
                $final = "> " . $head . "\n";
                # Check if $head contains text
                if ($head == "") {
                    $condition = true;
                }
            }
            $sequence = trim($line);
            # Find the sequence and check the condition
            if (preg_match('/^\d/', $sequence) && $condition == false) {
                $sequence = preg_replace('/[0-9]+/', '', $sequence);
                $sequence = preg_replace('/\s/', "", $sequence);
                # Store in string
                $out .= $sequence;
            }
        }
        # Build the output: header plus sequence, or nothing if no name was found
        $t = ($condition || $final == "") ? "" : $final . $out;
        # Write the result back to the file
        $f = fopen($newname, "w");
        fwrite($f, $t);
        fclose($f);
    }
    # Create multifasta
    exec('for f in *fasta; do cat "$f"; echo; done > db', $return);
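What is the best way to delete a file when it is empty, so that it is not included in the multifasta? I'm sure this is simple, but I don't know how to do it.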
Answer 0 (score: 1)
I think the simplest way is to use the filesize function:
    if (filesize($filename) === 0) {
        unlink($filename); // This will delete the file.
        continue; // Carry on with the next file
    }
If unlink cannot delete the file for some reason, it returns false and emits a warning. I don't know whether you need to check for that.
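If you do want to check the result, here is a minimal sketch of one way to wrap the test. The helper name `removeIfEmpty` is hypothetical and not part of the original script. It also calls `clearstatcache`, because PHP caches `filesize` results and the file has just been rewritten; whether that matters in your script depends on whether the size was queried earlier.

```php
<?php
// Hypothetical helper (not from the original post): delete $path if it is
// an empty regular file. Returns true only when the file was removed.
function removeIfEmpty(string $path): bool
{
    // filesize() results are cached per path, so drop any stale entry
    // before measuring a file that may just have been rewritten.
    clearstatcache(true, $path);

    if (!is_file($path) || filesize($path) !== 0) {
        return false; // missing, not a regular file, or not empty
    }

    // unlink() returns false on failure and raises an E_WARNING.
    if (!unlink($path)) {
        error_log("Could not delete empty file: $path");
        return false;
    }

    return true;
}
```

In the question's loop this would go right after `fclose($f);`, for example `if (removeIfEmpty($newname)) { continue; }`. Note that the script has already renamed the file at that point, so the path to test is `$newname` rather than `$filename`.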