I am trying to find all items in a list that end with ".torrent", and I am using a regular expression to do it. When I write this:
for k in link_torrent2:
    m=re.findall(r'\S+\.torrent',link_torrent2[k])
    if m:
        link2.append(m)
I get an error:
list indices must be integers, not str
Full code:
import re
link2=[]
link_torrent1=[u'#', u'/torrent_download/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.torrent',
u'/category/581/', u'/torrent/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.html',
u'/torrent_download/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.torrent', u'/category/581/',
u'/torrent/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.html', u'/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent']
link_torrent2=[str(x) for x in link_torrent1]
print link_torrent1
for k in link_torrent2:
    m=re.findall(r'\S+\.torrent',link_torrent2[k]) ##here shows error
    if m:
        link2.append(m)
        print m
Answer 0 (score: 2)
k is not an integer. It is an element of the link_torrent2 list, so use it directly instead of indexing with it:
for k in link_torrent2:
    m=re.findall(r'\S+\.torrent', k)
That is because Python for loops are really foreach loops: on each iteration, the next element of the input iterable (link_torrent2) is assigned to the target you chose, in this case k.
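To make the foreach behaviour concrete, here is a minimal sketch. The list `links` is hypothetical sample data standing in for link_torrent2, and `enumerate()` is shown only for the case where the position is genuinely needed alongside the element:

```python
# Hypothetical sample data standing in for link_torrent2.
links = ['/a/one.torrent', '/b/two.html', '/c/three.torrent']

# Iterating over a list yields its elements, not integer positions,
# so k is a string here and links[k] would raise a TypeError.
for k in links:
    assert isinstance(k, str)

# enumerate() pairs each element with its index when both are needed.
indexed = [(i, k) for i, k in enumerate(links)]
```

With `enumerate()`, `links[i]` and `k` refer to the same element, so indexing is rarely necessary.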
Since you only care about the suffix, you could use the str.endswith() method instead of a regular expression:
for k in link_torrent2:
    if k.endswith('.torrent'):
        link2.append(k)
Or, more compactly, as a list comprehension:
link2 = [k for k in link_torrent2 if k.endswith('.torrent')]
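As a self-contained sketch of the suffix filter, the helper name `torrent_links` and the shortened `sample` list below are assumptions made for illustration; only `str.endswith()` comes from the answer itself:

```python
def torrent_links(links, suffix='.torrent'):
    """Return the items of links that end with suffix (hypothetical helper)."""
    return [k for k in links if k.endswith(suffix)]

# A shortened, illustrative subset of the question's data.
sample = [
    '#',
    '/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent',
    '/category/581/',
    '/torrent/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.html',
]
result = torrent_links(sample)
```

Unlike appending the result of re.findall, which would put a list inside link2 for every match, this produces a flat list of strings.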
Answer 1 (score: 0)
You don't need re.findall here; re.search is enough.
>>> link_torrent2=[str(x) for x in link_torrent1]
>>> [i for i in link_torrent2 if re.search(r'.*\.torrent$', i)]
['/torrent_download/3797378/THE+BLACKLIST+%282014%29+S02E02+x264+1080p%28WEB-DL%29+eng+NLsubs+TBS.torrent', '/torrent_download/3795431/The+Blacklist+S02E02+720p+HDTV+x264+AAC+-+Ozlem.torrent', '/torrent_download/3795314/The.Blacklist.S02E02.HDTV.x264-ChameE.torrent']
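The `$` anchor in that pattern matters. The sketch below uses made-up sample strings (not taken from the question) to show how the unanchored pattern from the question can also match a string that merely contains ".torrent" somewhere in the middle:

```python
import re

# Anchored at the end of the string, so only a trailing ".torrent" matches.
pattern = re.compile(r'\.torrent$')

# Hypothetical sample strings; the second one only contains ".torrent".
sample = ['/x/a.torrent', '/x/a.torrent.html', '/category/581/']
matches = [s for s in sample if pattern.search(s)]

# The unanchored pattern from the question still finds a match inside
# a string that actually ends in ".html".
found = re.findall(r'\S+\.torrent', '/x/a.torrent.html')
```

Here `matches` keeps only the string that really ends in ".torrent", while `found` is non-empty even though its input ends in ".html".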