Question

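I need to read the lines of input.txt, skip the lines that start with '>', take the line that follows each of them, and send it to a web tool to get the output in FASTA format. I have written some code, but so far it does not skip the '>' lines. I would also like a simpler way to name the output, as in the given example (output_1.fasta).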

 $i = 0 ; 
while read line:
if line: do curl -s -d "dna_sequence="$line"&output_format=fasta" https://web.expasy.org/cgi-bin/translate/dna2aa.cgi >> my_${line}.fasta; $i+1; done < 'input.txt'

input.txt
>A123
ATTGGGCCTTTT
>B1234
GGGCCCTTAAA

output_1.fasta
>A123
#entire output from the web server
GHHGGGSSSAAA

output_2.fasta
>B1234
HHJJKKLLLL

Answer 1

Bash solution:

#!/usr/bin/env bash
i=0
while IFS=  read -r -d $'\n'
do
  ((i++))
  curl -s -d "dna_sequence=${REPLY}&output_format=fasta" 'https://web.expasy.org/cgi-bin/translate/dna2aa.cgi' > "./output_${i}.fasta"
done < <( sed '/^>/d' "./input.txt" )
exit 0

Test:

$ cat ./input.txt
>A123
ATTGGGCCTTTT
>B1234
GGGCCCTTAAA
$ i=0
$ while IFS=  read -r -d $'\n'
> do
>   ((i++))
>   curl -s -d "dna_sequence=${REPLY}&output_format=fasta" 'https://web.expasy.org/cgi-bin/translate/dna2aa.cgi' > "./output_${i}.fasta"
> done < <( sed '/^>/d' "./input.txt" )
$ ls -1 ./output_*
./output_1.fasta
./output_2.fasta
$ cat ./output_1.fasta
> VIRT-65321:3'5' Frame 1
KRPN
> VIRT-65321:3'5' Frame 2
KGP
> VIRT-65321:3'5' Frame 3
KAQ
> VIRT-65321:5'3' Frame 1
IGPF
> VIRT-65321:5'3' Frame 2
LGL
> VIRT-65321:5'3' Frame 3
WAF
$ cat ./output_2.fasta
> VIRT-65327:3'5' Frame 1
FKG
> VIRT-65327:3'5' Frame 2
LRA
> VIRT-65327:3'5' Frame 3
-GP
> VIRT-65327:5'3' Frame 1
GPL
> VIRT-65327:5'3' Frame 2
GP-
> VIRT-65327:5'3' Frame 3
ALK
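
Note that the loop above deletes the '>' lines, so the names A123 and B1234 never reach the output files. As a minimal sketch (not part of the answer above), the header and the sequence can be read two lines at a time, so that the header decides the file name. This assumes input.txt strictly alternates one '>' header line with one sequence line and ends with a newline. The helper names list_records and translate_to_file are hypothetical; translate_to_file is only defined here, not called, because it needs network access.

```shell
#!/usr/bin/env bash
# Sketch: pair each ">name" header with the sequence line that follows it.
# Assumption: headers and one-line sequences strictly alternate.

# Hypothetical helper: print "<file name><TAB><sequence>" for every record.
list_records() (
  while IFS= read -r header && IFS= read -r seq; do
    case $header in
      '>'*) printf '%s.fasta\t%s\n' "${header#'>'}" "$seq" ;;
      *)    ;;  # not a header line: skip this malformed pair
    esac
  done < "$1"
)

# Hypothetical helper (defined only, needs network access):
# writes the original header followed by the server response to NAME.fasta.
translate_to_file() {  # usage: translate_to_file NAME SEQUENCE
  {
    printf '>%s\n' "$1"
    curl -s --data-urlencode "dna_sequence=$2" -d 'output_format=fasta' \
      'https://web.expasy.org/cgi-bin/translate/dna2aa.cgi'
  } > "$1.fasta"
}

# Offline demonstration with the sample data from the question.
sample=$(mktemp)
printf '%s\n' '>A123' 'ATTGGGCCTTTT' '>B1234' 'GGGCCCTTAAA' > "$sample"
list_records "$sample"
rm -f "$sample"
```

Each line printed by list_records could then be split on the tab and passed to translate_to_file, which would produce A123.fasta and B1234.fasta instead of numbered files.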

Answer 2

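You are approaching a level of complexity where bash is no longer a good fit, and you should consider porting this to a more suitable scripting language, imo. You are also not escaping $line properly: what happens if $line contains &foo=bar? curl will not treat it as part of dna_sequence; it will think it is a brand-new variable named foo with the value bar. Here is a port to PHP: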

#!/usr/bin/env php
<?php
$ch = curl_init();
curl_setopt_array($ch, array(
    CURLOPT_URL => 'https://web.expasy.org/cgi-bin/translate/dna2aa.cgi',
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_ENCODING => ''
));
foreach (file('input.txt', FILE_SKIP_EMPTY_LINES) as $line) {
    $line = trim($line);
    if (!strlen($line) || $line[0] === '>') {
        continue;
    }
    curl_setopt_array($ch, array(
        CURLOPT_POST => 1,
        CURLOPT_POSTFIELDS => http_build_query(array(
            'dna_sequence' => $line,
            'output_format' => 'fasta'
        ))
    ));
    file_put_contents("my_{$line}.fasta", curl_exec($ch));
}
curl_close($ch);
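
For anyone staying with the shell, the escaping problem described above can also be handled there. The following is only a sketch of percent-encoding a form value; the function name urlencode is hypothetical, and it assumes ASCII input, which holds for DNA sequences. curl can also do this encoding itself through its --data-urlencode option.

```shell
#!/usr/bin/env bash
# Sketch: percent-encode a value before placing it in a POST body.
# Assumption: the input is plain ASCII.

urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                     # unreserved: keep
      *) out=$out$(printf '%%%02X' "'$c") ;;             # others: %XX
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# '&' and '=' no longer split the value into extra form fields.
urlencode 'ATTG&foo=bar'   # prints ATTG%26foo%3Dbar
```

The encoded value could then be sent as "dna_sequence=$(urlencode "$line")&output_format=fasta", so the whole line stays inside the dna_sequence field.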

Answer 3

$ cat tst.sh
#!/usr/bin/env bash

i=0
while IFS= read -r line; do
    if [[ $line =~ ^\> ]]; then
        outfile="output_$((++i)).fasta"
        printf '%s\n' "$line" > "$outfile"
    else
        curl -s -d 'dna_sequence="'"$line"'"&output_format=fasta' 'https://web.expasy.org/cgi-bin/translate/dna2aa.cgi' >> "$outfile"
    fi
done < input.txt

$ ./tst.sh

$ cat output_1.fasta
>A123
> VIRT-92094:3'5' Frame 1
KRPN
> VIRT-92094:3'5' Frame 2
KGP
> VIRT-92094:3'5' Frame 3
KAQ
> VIRT-92094:5'3' Frame 1
IGPF
> VIRT-92094:5'3' Frame 2
LGL
> VIRT-92094:5'3' Frame 3
WAF

$ cat output_2.fasta
>B1234
> VIRT-92247:3'5' Frame 1
FKG
> VIRT-92247:3'5' Frame 2
LRA
> VIRT-92247:3'5' Frame 3
-GP
> VIRT-92247:5'3' Frame 1
GPL
> VIRT-92247:5'3' Frame 2
GP-
> VIRT-92247:5'3' Frame 3
ALK
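
One caveat with the loop above: it sends one request per non-header line, so a sequence wrapped over several lines would be translated piece by piece. The question's sample has one-line sequences, so this may not matter here. As a sketch under the assumption that wrapped input can occur, the lines of each record could be joined first; the helper name join_records is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: join the wrapped sequence lines of each FASTA record into one line.
# Assumption: a record is a ">name" line followed by one or more sequence lines.

# Hypothetical helper: print "<name><TAB><joined sequence>" for every record.
join_records() (
  name=
  seq=
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '>'*)
        # A new header closes the previous record, if there was one.
        if [ -n "$name" ]; then printf '%s\t%s\n' "$name" "$seq"; fi
        name=${line#'>'}
        seq=
        ;;
      '') ;;                 # ignore blank lines
      *) seq=$seq$line ;;    # append this piece of the sequence
    esac
  done < "$1"
  # Emit the last record, which no following header closes.
  if [ -n "$name" ]; then printf '%s\t%s\n' "$name" "$seq"; fi
)

# Offline demonstration: the first record is wrapped over two lines.
sample=$(mktemp)
printf '%s\n' '>A123' 'ATTGGG' 'CCTTTT' '>B1234' 'GGGCCCTTAAA' > "$sample"
join_records "$sample"
rm -f "$sample"
```

Each printed record could then be submitted with a single curl call, which keeps one output file per header regardless of how the sequence is wrapped.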

Use the terminal to read lines from a file, perform a web operation, and store the output in separate files
