Question

I'm trying to find all the items in a list that end with ".torrent", and I'm using a regular expression to do it. When I write this:

for k in link_torrent2:
    m=re.findall(r'\S+\.torrent',link_torrent2[k])
    if m:
        link2.append(m)

I get an error:

list indices must be integers, not str

Full code:

import re
link2=[]

link_torrent1=[u'#', u'/torrent_download/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.torrent',
    u'/category/581/', u'/torrent/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.html',
    u'/torrent_download/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.torrent', u'/category/581/',
    u'/torrent/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.html', u'/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent']
link_torrent2=[str(x) for x in link_torrent1]

print link_torrent1

for k in link_torrent2:
    m=re.findall(r'\S+\.torrent',link_torrent2[k])  ##here shows error
    if m:
        link2.append(m)
print m

Answer 1

k is not an integer. It is an element of the link_torrent2 list. Use it directly:

for k in link_torrent2:
    m=re.findall(r'\S+\.torrent', k)

That's because a Python for loop is really a foreach loop: on each iteration, the next element of the input iterable (link_torrent2) is assigned to the target you chose, in this case k.
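To make the difference concrete, here is a small sketch. The list `paths` is made-up sample data, not the asker's list. It shows that the loop target receives elements rather than positions, that `enumerate()` is one way to get an index alongside each element if you really need one, and that indexing a list with a string is what triggers the error from the question:

```python
# Hypothetical sample data, shortened for illustration.
paths = ['#', '/a/one.torrent', '/b/two.html']

# Iterating yields the elements themselves, not their positions.
elements = [k for k in paths]

# If an index is genuinely needed, enumerate() yields (index, element) pairs.
indexed = [(i, k) for i, k in enumerate(paths)]

# Indexing the list with an element (a str) raises the TypeError from the question.
try:
    paths[paths[0]]
    error = None
except TypeError as exc:
    error = str(exc)

print(elements)
print(indexed)
print(error)
```

The exact wording of the error message differs slightly between Python versions, but in both cases it is a `TypeError` about list indices.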

Instead of a regular expression, you could also use the str.endswith() method:

for k in link_torrent2:
    if k.endswith('.torrent'):
        link2.append(k)  # append the matching string itself

Or, more compactly, as a list comprehension:

link2 = [k for k in link_torrent2 if k.endswith('.torrent')]
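As a self-contained sketch of the same idea, the snippet below filters a shortened subset of the paths from the question. The variable `either` is only an illustration added here: `str.endswith()` also accepts a tuple of suffixes, which can be handy if more than one extension is of interest.

```python
# A shortened subset of the paths from the question.
link_torrent2 = [
    '#',
    '/category/581/',
    '/torrent/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.html',
    '/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent',
]

# Keep only the strings that end with ".torrent".
link2 = [k for k in link_torrent2 if k.endswith('.torrent')]

# endswith() also accepts a tuple of suffixes (illustrative extra, not in the question).
either = [k for k in link_torrent2 if k.endswith(('.torrent', '.html'))]

print(link2)
print(either)
```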

Answer 2

You don't need re.findall here; re.search is enough.

>>> link_torrent2=[str(x) for x in link_torrent1]
>>> [i for i in link_torrent2 if re.search(r'.*\.torrent$', i)]
['/torrent_download/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.torrent', '/torrent_download/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.torrent', '/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent']
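One practical difference is worth spelling out. `re.findall` returns a list, so appending its result (as the question's loop does) builds a list of lists, whereas using `re.search` purely as a test keeps the original strings in a flat list. The sketch below uses a shortened two-item sample and a pattern anchored with `$`; both are illustrative choices rather than the asker's exact code.

```python
import re

# Shortened sample taken from the paths in the question.
links = [
    '/category/581/',
    '/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent',
]
pattern = re.compile(r'\S+\.torrent$')

# findall returns a list of matches, so appending it nests one list inside another.
nested = []
for k in links:
    m = pattern.findall(k)
    if m:
        nested.append(m)

# search returns a match object or None, so it works directly as a filter condition.
flat = [k for k in links if pattern.search(k)]

print(nested)
print(flat)
```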

list indices must be integers, not str - regex
