Syntax error when using subprocess and Awk

Time: 2018-02-24 09:58:23

Tags: python-3.x awk subprocess

I get this error when I try to run a Python script that calls Awk through subprocess:

awk: cmd. line:1: {if ($6 == +) print $1, $2-1000, $2+1000, $4, $5, $6; else print $1, $3-1000, $3+1000, $4, $5, $6}
awk: cmd. line:1:             ^ syntax error
awk: cmd. line:1: {if ($6 == +) print $1, $2-1000, $2+1000, $4, $5, $6; else print $1, $3-1000, $3+1000, $4, $5, $6}
awk: cmd. line:1:

Here is the code:

import subprocess

# sort files using bedops and then look for ERa peaks that are not within 1kb of the TSS of a gene
with open("refGenes.sorted.bed", "wb") as genes_out:
    subprocess.Popen("awk -v OFS='\t' '{print $3, $5, $6, $13, 0, $4}' /gpfs/data01/heinzlab/home/cag104/reference_data/Homo_sapiens/UCSC/hg38/Annotation/Genes/refGene.txt | sort-bed -", stdout=genes_out, shell=True)

with open("refGenes.promoter1k.sorted.bed", "wb") as promoters_out:
    subprocess.Popen("awk -v OFS='\t' '{if ($6 == '+') print $1, $2-1000, $2+1000, $4, $5, $6; else print $1, $3-1000, $3+1000, $4, $5, $6}' /gpfs/data01/heinzlab/home/cag104/projects/repeats.rosenfeld/results/04_megatrans_enhancers/refGenes.sorted.bed | sort-bed -", stdout=promoters_out, shell=True)

with open("era_peaks.filtered.bed", "wb") as era_peaks_out:
    subprocess.Popen("sort-bed /gpfs/data01/heinzlab/home/cag104/projects/repeats.rosenfeld/results/03_peak_calls/mcf7_chip_era_e2_01_peaks.narrowPeak | bedops -n 1 - /gpfs/data01/heinzlab/home/cag104/projects/repeats.rosenfeld/results/04_megatrans_enhancers/refGenes.promoter1k.sorted.bed", stdout=era_peaks_out, shell=True)

Any ideas? I don't know what else to try at this point.

1 answer:

Answer 0 (score: 0)

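Your second awk script is broken because of inconsistent quoting. The single quote before the + closes the quoted script early, so the shell strips the quotes and awk sees a bare +: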

"awk -v OFS='\t' '{if ($6 == '+') print $1, $2-1000, $2+1000, $4, $5, $6; else print $1, $3-1000, $3+1000, $4, $5, $6}' /gpfs/data01/heinzlab/home/cag104/projects/repeats.rosenfeld/results/04_megatrans_enhancers/refGenes.sorted.bed | sort-bed -"
                 ^           ^
                 |           | 
    (start of awk script) ('breaking' point)
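
To see what awk actually receives, the command can be split the way a POSIX shell would split it. The following is only a sketch: the command is shortened, input.bed is a placeholder file name, and shlex.split approximates shell word splitting rather than invoking a real shell.

```python
import shlex

# Shortened version of the failing command; "input.bed" is a placeholder.
# As in the question, '\t' is a real tab because this is a normal Python string.
cmd = "awk -v OFS='\t' '{if ($6 == '+') print $1}' input.bed"

# shlex.split applies POSIX quoting rules, which approximates how the
# shell started by shell=True breaks the line into arguments.
args = shlex.split(cmd)

# args[3] is the awk program text after the shell has removed the quotes.
script = args[3]
print(repr(script))  # '{if ($6 == +) print $1}'
```

The quotes around + are gone, which matches the `{if ($6 == +) ...` text that awk prints in its error message.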

Pass the key value to the awk script as a variable, so the script itself no longer contains nested single quotes:

"awk -v OFS='\t' -v p='+' '{if ($6 == p) print $1, $2-1000, $2+1000, $4, $5, $6; else print $1, $3-1000, $3+1000, $4, $5, $6}' /gpfs/data01/heinzlab/home/cag104/projects/repeats.rosenfeld/results/04_megatrans_enhancers/refGenes.sorted.bed | sort-bed -"
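
Another way to avoid the quoting problem, shown here only as a sketch, is to skip the shell and pass awk its arguments as a list. Without a shell, the awk string literal can be written with the double quotes awk expects. The names AWK_SCRIPT, awk_promoter_args and write_promoters are hypothetical, and the `| sort-bed -` stage from the original command is left out, so it would have to be run as a separate step.

```python
import subprocess

# awk string literals use double quotes; with no shell in between,
# nothing strips or reinterprets them.
AWK_SCRIPT = (
    '{if ($6 == "+") print $1, $2-1000, $2+1000, $4, $5, $6; '
    'else print $1, $3-1000, $3+1000, $4, $5, $6}'
)


def awk_promoter_args(bed_path):
    """Build the awk argument list (hypothetical helper, no shell quoting needed)."""
    # "OFS=\t" passes a literal tab, as the original command does.
    return ["awk", "-v", "OFS=\t", AWK_SCRIPT, bed_path]


def write_promoters(bed_path, out_path):
    """Run awk on bed_path and write its unsorted output to out_path."""
    with open(out_path, "wb") as out:
        # subprocess.run waits for awk to finish; check=True raises
        # CalledProcessError if awk exits with a non-zero status.
        subprocess.run(awk_promoter_args(bed_path), stdout=out, check=True)
```

Unlike a bare subprocess.Popen call, subprocess.run returns only after the command has finished, so a later step that reads the output file sees complete data.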