Question

While reviewing a solution for building a CSV from an HTML table, I came across this line of code:

col = map(cell_text, row.find_all(re.compile('t[dh]')))

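What exactly is going on in the bold part, re.compile('t[dh]')? As far as I know, find_all is called with HTML element and tag names. How does the bold part work here?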

Context

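#!/usr/bin/python
from bs4 import BeautifulSoup
import sys
import re
import csv

def cell_text(cell):
    # Join all text fragments inside the cell, stripped of surrounding whitespace.
    return " ".join(cell.stripped_strings)

# Read HTML from stdin; naming the parser explicitly avoids bs4's "no parser specified" warning.
soup = BeautifulSoup(sys.stdin.read(), "html.parser")
output = csv.writer(sys.stdout)

for table in soup.find_all('table'):
    for row in table.find_all('tr'):
        # One CSV row per <tr>: the text of every <td> and <th> cell.
        col = map(cell_text, row.find_all(re.compile('t[dh]')))
        output.writerow(col)
    # Blank row between tables.
    output.writerow([])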

Answer 1

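It finds every tag whose name contains a t followed by a d or an h, which for a table row means the td and th cells. re.compile simply returns a "compiled" regular expression object, and find_all consumes that object as its filter.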
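To make the matching concrete, here is a minimal sketch using only the standard re module. It assumes, as the BeautifulSoup documentation describes, that a regular expression filter is applied to each tag name with search(). The list of tag names and the anchored strict_pattern variant are made up for illustration, and the list comprehension only mimics the filtering; it is not BeautifulSoup's actual implementation.

```python
import re

# The compiled pattern object that would be handed to find_all.
cell_pattern = re.compile('t[dh]')

# Hypothetical tag names, chosen for illustration only.
tag_names = ['table', 'tr', 'td', 'th', 'thead', 'b']

# Keep each name in which the pattern is found anywhere (search, not a full match).
matching = [name for name in tag_names if cell_pattern.search(name)]
print(matching)

# An anchored variant that accepts only the exact names "td" and "th".
strict_pattern = re.compile(r'^t[dh]$')
strict = [name for name in tag_names if strict_pattern.search(name)]
print(strict)
```

Because the unanchored pattern also matches a name such as thead, the anchored form may be the safer choice when find_all is run on a whole table rather than on a single row.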

See the documentation for re.compile. BeautifulSoup's find_all accepts a regular expression as well; here is a sample from the documentation:

for tag in soup.find_all(re.compile("^b")):
    print(tag.name)

As you can see, the usage is very similar.
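Applied to the pattern from the question, the same idea can be sketched end to end on a single row. The HTML string below is hypothetical sample data, and "html.parser" is just one of the parsers bs4 supports; the rest uses the same find_all and stripped_strings calls as the question's script.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for this illustration.
html = "<table><tr><th>Name</th><td>Ada <b>Lovelace</b></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# find_all keeps the tags whose name matches the compiled pattern.
cells = row.find_all(re.compile("t[dh]"))
cell_names = [cell.name for cell in cells]
# Same joining logic as cell_text in the question.
cell_texts = [" ".join(cell.stripped_strings) for cell in cells]
print(cell_names)
print(cell_texts)
```

The nested b tag is not returned as a cell of its own because its name does not match the pattern, but its text still ends up in the td cell's joined text.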
