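Question
I want to pass the "value" variable from a dictionary (built from a simple CSV file) to a sed call made through subprocess in Python. The problem is that I get this error:
sed: -e expression #1, char 1: unknown command: `''
when I run the following script: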
import sys
import subprocess

speciesdictfile = open("speciesfiletest.csv",'r')
file = sys.argv[1]
dict = {}

for line in speciesdictfile:
    fields = line.split(',')
    dict[fields[0]] = fields[1]

for line in file:
    for key, value in dict.items():
        if file == key:
            subprocess.call(["sed", "'s/>/>" + value + "_/g'", file])
When I try this instead:
subprocess.call(['sed', 's/>/>' + value + '_/g', file])
I get the following error:
sed: -e expression #1, char 30: unterminated `s' command
Example input
Dictionary CSV file:
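The first error can be reproduced without running sed at all. The sketch below is not part of the script above; it assumes a placeholder file name `input.faa` and uses `shlex.split` only to imitate how a POSIX shell would split the command. With a list and no shell, the single quotes are passed to sed literally, so the first character sed tries to parse as a command is `'`.

```python
import shlex

value = "Methanococcus voltae"

# With a list and no shell, each element reaches sed unchanged,
# so the literal single quotes become part of the sed script.
quoted = ["sed", "'s/>/>" + value + "_/g'", "input.faa"]  # input.faa is a placeholder
print(repr(quoted[1][0]))  # the first character sed would see

# A shell would remove the quotes before sed ever saw the expression.
shell_cmd = "sed 's/>/>" + value + "_/g' input.faa"
print(shlex.split(shell_cmd))
```

This is why the quotes are only needed when the command goes through a shell as a single string, and why the second attempt (no inner quotes) gets past character 1.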
file,Species
GCF_000006175.1_ASM617v2_genomic.faa,Methanococcus voltae
GCF_000006805.1_ASM680v1_genomic.faa,Halobacterium sp.
The file I want to search and replace in, e.g. the file named GCF_000006175.1_ASM617v2_genomic.faa:
>NZ_LT985082.1_1_1
EQVWKSIKKYMAYYLFDTIEFMEKLFEKEFYRIVNRDSYYKNWISKFIMIN*
>NZ_LT985082.1_2_1
MKFNISKLWNPTGFFISFFMSFLMPIMFAVPFGYIPIDIFLYQQLIRWPVAYFIVTLIVI
PISLYLAKSFFTFPPTDRFFNPVTFFISLQMSFIMPFLLGYGFGSMSLNILFLMWPMRWV
VAYFMVNFAIRPLSISLARIVFNVEPQHLIIKF*
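Desired output
A working sed command that, on every line containing '>', replaces '>' with '>' followed by the value variable, with no space after it, e.g.: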
>Methanococcus_voltae_NZ_LT985082.1_1_1
EQVWKSIKKYMAYYLFDTIEFMEKLFEKEFYRIVNRDSYYKNWISKFIMIN*
>Methanococcus_voltae_NZ_LT985082.1_2_1
MKFNISKLWNPTGFFISFFMSFLMPIMFAVPFGYIPIDIFLYQQLIRWPVAYFIVTLIVI
PISLYLAKSFFTFPPTDRFFNPVTFFISLQMSFIMPFLLGYGFGSMSLNILFLMWPMRWV
VAYFMVNFAIRPLSISLARIVFNVEPQHLIIKF*
Answer 0 (score: 0)
The problem was that the newline character was being picked up from the CSV file. I solved it with the following:
import sys
import subprocess

speciesdictfile = open("speciesfiletest.csv",'r')
file = sys.argv[1]
dict = {}

# Map each file name to its species; rstrip() drops the trailing newline
for line in speciesdictfile:
    fields = line.rstrip().split(',')
    dict[fields[0]] = fields[1]

for line in file:
    for key, value in dict.items():
        if file == key:
            # The command is a single string run through the shell, so the inner quotes are needed
            subprocess.call("sed -e 's/>/>" + value + "_/g' " + file, shell=True)
The line
fields = line.rstrip().split(',')
stops the newline characters from being stored in the dictionary, which makes the values usable in the subprocess.call sed command.
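To see the effect of the stray newline, here is a minimal sketch that only builds the sed arguments and does not run sed. The helper names `parse_species_line` and `build_sed_args` are made up for illustration, and the list form without `shell=True` is an alternative to the call above, not what I actually used.

```python
import subprocess


def parse_species_line(line):
    """Split one CSV line into (file name, species), dropping the line ending."""
    fields = line.rstrip("\r\n").split(",")
    return fields[0], fields[1]


def build_sed_args(value, path):
    """Build the sed argument list; no shell is involved, so no inner quotes."""
    return ["sed", "-e", "s/>/>" + value + "_/g", path]


raw_line = "GCF_000006175.1_ASM617v2_genomic.faa,Methanococcus voltae\n"

# Without rstrip, the species keeps its newline, which lands in the
# middle of the s command and leaves it unterminated.
unstripped = raw_line.split(",")[1]
print(repr(unstripped))

name, species = parse_species_line(raw_line)
args = build_sed_args(species, name)
print(args)

# To actually run it (assumes sed is installed and the file exists):
# subprocess.run(args, check=True)
```

Note that the desired output in the question shows `Methanococcus_voltae` with an underscore. Calling `species.replace(" ", "_")` before building the expression would be one way to get that, but the script above does not do it.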