Question

I am writing code that scrapes selected parts of the visible text from a large number of web pages. This is part of it:

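                # Collect every <div id="articleBody"> on the page
                divTags = soup.find_all("div", {'id': 'articleBody'})
                for div in divTags:
                    pTags = div.find_all("p")
                    for p in pTags:
                        # Python 2 syntax: write the paragraph text to the open file f
                        print >>f, p.text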

How can I check whether Python found and wrote the targeted text, and set the link aside (in a list) if the scrape was unsuccessful?

I couldn't find an answer here, and I don't know what to look for in the documentation.

Answer 1

Here is an alternative way to know whether Python found the text you are looking for:

import requests
from bs4 import BeautifulSoup

urls = ['https://www.google.com']
for url in urls:
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'lxml')
    # Look through every <p> element for the expected text
    items = soup.find_all('p')
    for item in items:
        if "2016 - Privacidad - Condiciones" in item.text:
            print("Python has found the targeted text")

If Python does not find the text, you need to use the remove() method.
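
To set the unsuccessful links aside in a list, as the question asks, one possible approach is to collect the paragraph text first and treat an empty result as a failed scrape. This is only a minimal sketch in Python 3, not the method from the answer above. The function names `extract_paragraphs` and `scrape`, the `failed` list, and the example URLs are hypothetical. It also assumes the built-in `html.parser` rather than `lxml`, and it takes (url, html) pairs so it can run without network access; in a real run the HTML would come from something like `requests.get(url).text`.

```python
import io

from bs4 import BeautifulSoup


def extract_paragraphs(html):
    """Return the non-empty <p> texts inside every <div id="articleBody">."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for div in soup.find_all("div", {"id": "articleBody"}):
        for p in div.find_all("p"):
            text = p.get_text(strip=True)
            if text:
                texts.append(text)
    return texts


def scrape(pages, out):
    """Write each page's article text to `out`; return the URLs that gave nothing.

    `pages` is an iterable of (url, html) pairs (a hypothetical input shape).
    """
    failed = []
    for url, html in pages:
        texts = extract_paragraphs(html)
        if texts:
            # Something was found, so it is written to the output
            out.write("\n".join(texts) + "\n")
        else:
            # Nothing was found: set the link aside for later inspection
            failed.append(url)
    return failed


if __name__ == "__main__":
    sample_pages = [
        ("http://a.example", '<div id="articleBody"><p>Hello</p></div>'),
        ("http://b.example", '<div id="other"><p>No article here</p></div>'),
    ]
    buffer = io.StringIO()
    print(scrape(sample_pages, buffer))  # ['http://b.example']
```

Because `scrape` only writes when `texts` is non-empty, every URL ends up either with text in the output or in the returned list, so the list is exactly the set of links to revisit.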

Check if Python has written the targeted text
