Python 3 - Read HTML line by line and find a specific word

Time: 2015-07-24 23:04:09

Tags: python

import urllib.request
url = "site.com"  # placeholder; urlopen needs a full URL including the scheme (http:// or https://)
request = urllib.request.Request(url)
my = urllib.request.urlopen(request)
print(my.read().decode('utf-8'))

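I use the code above to fetch the page source. From that, I want to take lines 55 through 70 and then use an if statement to find a specific word within that section.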

2 Answers:

Answer 0 (score: 2)

To get lines 55 through 70:

lines = my.read().decode('utf-8').splitlines()[54:70]  # line 55 sits at index 54
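
Because list indices start at 0 and the end of a slice is exclusive, the slice for 1-based lines 55 through 70 is [54:70]. A minimal sketch of that conversion follows; the helper name `line_range` and the generated sample text are only illustrations, not part of the original code.

```python
def line_range(text, first, last):
    """Return lines first..last (1-based, inclusive) of text."""
    # Line N lives at index N - 1; the slice end is exclusive,
    # so `last` needs no adjustment.
    return text.splitlines()[first - 1:last]


# Made-up sample: 100 lines reading "line 1" ... "line 100".
sample = "\n".join("line {}".format(n) for n in range(1, 101))

section = line_range(sample, 55, 70)
print(section[0], "...", section[-1])  # line 55 ... line 70
print(len(section))                    # 16
```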

To find something:

for line in lines:
    index = line.find(something)  # something is the word you are looking for
    if index > -1:
        pass  # ...

What you found is then line[index:index + len(something)]
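
Put together, the search could look roughly like this. It is only a sketch: `find_in_lines` and the sample lines are hypothetical, and like `str.find` it reports only the first occurrence on each line.

```python
def find_in_lines(lines, word, first_line_number=1):
    """Yield (line_number, column, match) for each line containing word."""
    for offset, line in enumerate(lines):
        index = line.find(word)  # -1 when the word is absent
        if index > -1:
            # The matched text is the slice starting at index.
            yield first_line_number + offset, index, line[index:index + len(word)]


# Made-up input standing in for the sliced page source.
lines = ["<div>", "  <p>hello world</p>", "</div>"]

hits = list(find_in_lines(lines, "world", first_line_number=55))
print(hits)  # [(56, 11, 'world')]
```

Passing `first_line_number=55` makes the reported line numbers match the original page rather than the position inside the slice.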

Answer 1 (score: 0)

finding = '<sometag>'
lines = my.read().decode('utf-8').splitlines()[54:70]  # Include Line 55
# splitlines() returns a list, which has no find(), so search each line
for line in lines:
    pos = line.find(finding)
    if pos != -1:
        pass  # Do what you need to
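
If you would rather not read the whole page into memory, something along these lines may also work. It is a sketch under the assumption that the response object can be iterated line by line like any other binary file object; `scan_lines` is a hypothetical helper, and an `io.BytesIO` stands in for the HTTP response so the example runs without a network connection.

```python
import io
from itertools import islice


def scan_lines(response, word, first=55, last=70, encoding="utf-8"):
    """Return (line_number, text) pairs for lines first..last containing word."""
    matches = []
    # islice skips the first `first - 1` lines and stops after line `last`,
    # so nothing beyond line `last` is consumed by this loop.
    for number, raw in enumerate(islice(response, first - 1, last), start=first):
        line = raw.decode(encoding).rstrip("\r\n")
        if word in line:
            matches.append((number, line))
    return matches


# Stand-in for the response: 100 lines reading "row 1" ... "row 100".
fake = io.BytesIO(b"".join(b"row %d\n" % n for n in range(1, 101)))

found = scan_lines(fake, "row 6")
print(found[0], len(found))  # (60, 'row 60') 10
```

With a real page you would pass the object returned by `urllib.request.urlopen` instead of `fake`.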