How to join multiple lines separated by \

Time: 2015-08-28 12:36:17

Tags: python

I want to filter the lines that start with RUN and are continued onto the following lines with \, for example:

RUN install-repository \
    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"

RUN ...

At the moment I only get

RUN install-repository \
RUN install-repository \
...

using the following code:

import urllib2

def run():

    file_url = "https://raw.githubusercontent.com/bgruening/docker-recipes/master/galaxy-exom-seq/Dockerfile"

    data = urllib2.urlopen(file_url)

    for line in data:
        if line.startswith("RUN"):
            print line.rstrip()

if __name__ == '__main__':
    run()

What is the best way to do this?

3 answers:

Answer 0: (score: 1)

Try this:

>>> import StringIO
>>> commands = []
>>> a = StringIO.StringIO("RUN aaa\nRUNafas\nRUNaaa\n")
>>> for line in a:
...     if line.startswith("RUN"):
...         commands.append(line.rstrip())
... 
>>> print " \\\n".join(commands)
RUN aaa \
RUNafas \
RUNaaa

Answer 1: (score: 1)

You can build a list of joined lines with something like this:

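lines = []
s = ''  # accumulates the current logical line
for line in open('runtest.txt'):
    # A new RUN line starts the next command, so store the finished one.
    if s and line.startswith('RUN'):
        lines.append(s)
        s = ''
    # Strip the trailing backslash and newline before appending.
    s += line.rstrip('\\\n')
if s:
    lines.append(s)  # do not lose the last command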

e.g. with runtest.txt containing

RUN install-repository \
    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"

RUN install-repository \
    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"

RUN install-repository \
    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools" \
    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"

you get:

RUN install-repository     "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"
RUN install-repository     "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"
RUN install-repository     "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name suite_samtools_0_1_19 --panel-section-name SAMTools"     "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"
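
The same idea can also be written as a small generator that collapses the leftover indentation. This is only a sketch in Python 3: the name join_continued and the sample lines are made up for illustration (the --url lines are abridged from the question, and the FROM and echo lines are invented), and it assumes a backslash at the end of a line always means a continuation.

```python
def join_continued(lines):
    """Yield logical lines, merging physical lines that end with a backslash."""
    parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.rstrip().endswith("\\"):
            # Continuation: drop the trailing backslash and keep collecting.
            parts.append(line.rstrip()[:-1].strip())
        else:
            parts.append(line.strip())
            yield " ".join(p for p in parts if p)
            parts = []
    if parts:
        # The input ended on a continuation line; emit what was collected.
        yield " ".join(p for p in parts if p)


# Illustrative input only, not the real Dockerfile.
sample = [
    "FROM ubuntu\n",
    "RUN install-repository \\\n",
    '    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2" \\\n',
    '    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes"\n',
    "\n",
    "RUN echo done\n",
]

# Keep only the joined logical lines that are RUN instructions.
run_commands = [l for l in join_continued(sample) if l.startswith("RUN")]
for command in run_commands:
    print(command)
```

Because the filter is applied after joining, the indented continuation lines are kept as part of their RUN command rather than being dropped.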

Answer 2: (score: 1)

I don't see an accepted answer yet, so I'll give it a try.

from urllib import urlopen
import re

def run():
    re1 = '^RUN'
    re2 = '^    "--url'
    file_url = "https://raw.githubusercontent.com/bgruening/docker-recipes/master/galaxy-exom-seq/Dockerfile"
    data = urlopen(file_url)
    for line in data:
        if re.search(re1, line) or re.search(re2, line):
            print(line.rstrip('\\\n'))

if __name__ == '__main__':
    run()

I think that should do it.
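
If the whole file fits in memory, a regular expression can also do the joining in one pass, so that each RUN instruction ends up on a single line. This is a rough sketch in Python 3, not a full Dockerfile parser: extract_run_commands is a hypothetical helper name, and dockerfile_text is a shortened stand-in for the real file, with the echo line invented for the example.

```python
import re


def extract_run_commands(text):
    """Return each RUN instruction in `text` as a single line."""
    # Replace backslash + newline (and the whitespace around it) by one space.
    joined = re.sub(r"[ \t]*\\\r?\n[ \t]*", " ", text)
    return [line.strip() for line in joined.splitlines() if line.startswith("RUN")]


# Stand-in for the downloaded Dockerfile contents (illustrative only).
dockerfile_text = (
    "RUN install-repository \\\n"
    '    "--url http://toolshed.g2.bx.psu.edu/ -o iuc --name gatk2 --panel-section-name GATK2" \\\n'
    '    "--url http://toolshed.g2.bx.psu.edu/ -o devteam --name freebayes --panel-section-name Freebayes"\n'
    "\n"
    "RUN echo done\n"
)

commands = extract_run_commands(dockerfile_text)
for command in commands:
    print(command)
```

To run it against the real file in Python 3, the text could be read with urllib.request.urlopen(file_url).read().decode("utf-8"), assuming the file is UTF-8 encoded.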