Question

I created an array containing every possible chain of HTML tags which, at run time, either navigates to the target string or returns nothing.

group = ['div','span','a','link','dl','dt','dd','b','p','meta','']
comb = []

for g1 in group:
    if g1 != '':
        for g2 in group:
            if g2 != '':
                for g3 in group:
                    if g3 != '':
                        res = "tag."+g1+"."+g2+"."+g3+".string"
                        comb.append(res)
                    else:
                        res = "tag."+g1+"."+g2+".string"
                        comb.append(res)
            else:
                res = "tag."+g1+".string"
                comb.append(res)

I want to run every entry in the array to see what it returns for a given website.

def get_web_price(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.content, "lxml")
    tag = soup.find(class_=re.compile("price"))

    for c in comb:
        exec(c, globals())

Is it possible to run the strings in a list, as with exec()? I am using BeautifulSoup, Requests, Googlesearch, and re on Python 3.

Answer 1

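You don't need exec() or eval() for dynamic attribute access; use getattr(), or, in the case of BeautifulSoup, use the find() method to get the first descendant that matches the given criteria: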

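import re
from itertools import chain, product

import requests
from bs4 import BeautifulSoup

group = ['div','span','a','link','dl','dt','dd','b','p','meta']
# Produce a list of tuples of 1 to 3 element names, e.g. ('div',), ('div', 'span')
comb = list(chain.from_iterable(product(group, repeat=n) for n in range(1, 4)))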

def get_web_price(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.content, "lxml")
    tag = soup.find(class_=re.compile("price"))
    if tag is None:
        # No element with a class matching "price" was found
        return

    for c in comb:
        t = tag
        for a in c:
            t = t.find(a)
            if not t:
                break

        if not t:
            continue

        # Do something with t.string
        t.string
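
To make concrete what comb holds in the code above, here is a minimal sketch that builds the tuples with product(group, repeat=n), which should be equivalent to the construction in the answer, and prints a few of them. It only inspects the list and does no scraping.

```python
from itertools import chain, product

group = ['div', 'span', 'a', 'link', 'dl', 'dt', 'dd', 'b', 'p', 'meta']

# Every tuple of length 1, 2 and 3 drawn from group (repeats allowed)
comb = list(chain.from_iterable(product(group, repeat=n) for n in range(1, 4)))

print(len(comb))   # 10 + 100 + 1000 entries
print(comb[0])     # first 1-tuple
print(comb[10])    # first 2-tuple
print(comb[-1])    # last 3-tuple
```

Each tuple is one path to try, so the loop over comb makes up to three find() calls per entry.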

I think you could also use select() with limit to achieve the same effect:

def get_web_price(url):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}

    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.content, "lxml")
    tag = soup.find(class_=re.compile("price"))
    if tag is None:
        # No element with a class matching "price" was found
        return

    for c in comb:
        selector = ' '.join(c)
        r = tag.select(selector, limit=1)
        if r:
            r = r[0]

        else:
            continue

        r.string
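
As for the getattr() route mentioned at the top of this answer, here is a minimal sketch of walking a chain of attribute names without exec(). It is not BeautifulSoup code: SimpleNamespace objects stand in for parsed tags so the example runs on its own, and follow is a hypothetical helper name. A real Tag returns None for a child tag that does not exist, which the default argument to getattr() mimics here.

```python
from types import SimpleNamespace


def follow(node, path):
    """Follow the attribute names in path; return None once one is missing."""
    for name in path:
        node = getattr(node, name, None)
        if node is None:
            return None
    return node


# Stand-in for a parsed <div><span>19.99</span></div> (assumption, not a real Tag)
tag = SimpleNamespace(div=SimpleNamespace(span=SimpleNamespace(string="19.99")))

found = follow(tag, ("div", "span"))
missing = follow(tag, ("div", "p"))

print(found.string if found is not None else None)
print(missing)
```

With a real soup, the same loop should work on the tuples in comb, since tag.div is shorthand for tag.find("div").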

I take no position on whether scraping Google search results is a good idea.
