Question

I have a file like this:

NA|polymerase|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo
NA|VP24|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo
NA|VP30|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo

I am trying to print these characters from each line:

polymerase|KC545393
VP24|KC545393
VP30|KC545393

How can I do this? I tried this code:

for character in line:
    if character=="|":
        print line[1:i.index(j)]

Answer 1

Use str.split() to split each line on the '|' character; you can limit the split, because you only need the first 3 columns:

elems = line.split('|', 3)
print('|'.join(elems[1:3]))

The print line then takes the elements at indexes 1 and 2 and joins them together again with the '|' character to produce the desired output.

Demo:

>>> lines = '''\
... NA|polymerase|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo
... NA|VP24|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo
... NA|VP30|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo
... '''.splitlines(True)
>>> for line in lines:
...     elems = line.split('|', 3)
...     print('|'.join(elems[1:3]))
... 
polymerase|KC545393
VP24|KC545393
VP30|KC545393
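
Since the question mentions a file, the same split-and-join step could be wrapped in a small function and applied line by line. The following is a minimal sketch rather than part of the original answer; the function names `middle_fields` and `print_middle_fields` are hypothetical, as is any file name you pass in.

```python
def middle_fields(line, sep='|'):
    """Return the second and third sep-separated fields of line, rejoined with sep."""
    # Strip the trailing newline first so it cannot leak into the result
    # when a line has only three fields.
    elems = line.rstrip('\n').split(sep, 3)
    return sep.join(elems[1:3])


def print_middle_fields(path):
    """Print the second and third fields of every line in the file at path."""
    with open(path) as handle:
        for line in handle:
            print(middle_fields(line))


sample = 'NA|VP24|KC545393|Bundibugyo_ebolavirus|EboBund_112_2012|NA|2012|Human|Democratic_Republic_of_the_Congo\n'
print(middle_fields(sample))  # VP24|KC545393
```

Calling `print_middle_fields` with the path to the data file would then print one `field|field` pair per line. A line with fewer than two fields yields an empty string instead of raising an error, because slicing past the end of a list is allowed.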

Answer 2

Assuming you know that each line has at least two separators, you can use:

>>> s = 'this|is|a|string'
>>> s
'this|is|a|string'
>>> s[:s.find('|',s.find('|')+1)]
'this|is'

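This finds the first | starting from the character position just past the first | (that is, it finds the second |), then gives you the substring up to, but not including, that point.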
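
The nested call can be easier to follow when split into two steps. This is a minimal sketch using the same example string; the variable names `first`, `second`, and `prefix` are only illustrative.

```python
s = 'this|is|a|string'

first = s.find('|')              # index of the first '|' (4 here)
second = s.find('|', first + 1)  # search resumes just past it (7 here)
prefix = s[:second]              # everything before the second '|'

print(prefix)  # this|is
```

Note that `str.find` returns -1 when nothing is found, so with fewer than two separators the slice would become `s[:-1]` and silently drop the last character.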

If it does not have two separators, you just need to be a little more careful:

s = 'blah blah'
result = s
if s.find('|') >= 0:
    if s.find('|',s.find('|')+1) >= 0:
        result = s[:s.find('|',s.find('|')+1)]

If that is the case, you probably want to put it in a more generic function, something like:

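def substringUpToNthChar(s, n, ch):
    # Return s up to, but not including, the n-th occurrence of ch.
    if n < 1:
        return ""
    pos = -1
    while n > 0:
        pos = s.find(ch, pos + 1)
        if pos < 0:
            return s  # fewer than n occurrences: return the whole string
        n -= 1
    return s[:pos]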

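This correctly handles the case where there are fewer separators than you expect, and also (relatively elegantly) handles the case where you want more than just the first two fields.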
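
A few calls illustrate both cases. This is a usage sketch only; the function is repeated here, with the string parameter named `s`, so that the snippet runs on its own.

```python
def substringUpToNthChar(s, n, ch):
    # Same logic as above: cut s just before the n-th occurrence of ch.
    if n < 1:
        return ""
    pos = -1
    while n > 0:
        pos = s.find(ch, pos + 1)
        if pos < 0:
            return s
        n -= 1
    return s[:pos]


print(substringUpToNthChar('this|is|a|string', 2, '|'))  # this|is
print(substringUpToNthChar('this|is|a|string', 3, '|'))  # this|is|a
print(substringUpToNthChar('blah blah', 2, '|'))         # blah blah
```

For the lines in the question, calling it with n=3 would return `NA|polymerase|KC545393`, so the leading `NA|` field would still need to be dropped separately.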

How do I print a string up to a specific character?
