Question

I tried the following:

import requests

c = requests.get('https://www.uniberg.com/referenzen.html').text
c.count('Programmierung')

But the output reports 2 occurrences, while in fact the word does not appear on the page.

I also tried:

a=requests.get('https://www.uniberg.com/index.html').text.count('Mitarbeiter')

but this also counts words I do not want, such as Mitarbeiterphilosophie. Can anyone find a way to improve this, or suggest another approach?

Answer 1

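As of today, https://www.uniberg.com/referenzen.html does contain 2 occurrences of Programmierung.

I think you need to check the HTML source code rather than the page as rendered in the browser.

The word Programmierung sits in a part of the HTML that is hidden by this CSS rule: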

section .detail {
    display: none;
}

On your second point:

Try this (using a regex):

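import re
import requests

# \b matches a word boundary, so Mitarbeiterphilosophie is not counted,
# and a match at the very start or end of the text is not missed (as it would be with \W).
len(re.findall(r'\bMitarbeiter\b', requests.get('https://www.uniberg.com/index.html').text))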
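To see the difference between a substring count and a whole-word count without hitting the network, here is a minimal sketch. The helper name count_whole_word and the sample sentence are made up for illustration; only the re module calls are standard.

```python
import re

def count_whole_word(text: str, word: str) -> int:
    """Count occurrences of `word` that are not part of a longer word."""
    # re.escape guards against regex metacharacters in `word`;
    # \b anchors the match at word boundaries on both sides.
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, text))

# Hypothetical sample text, not taken from the real page.
sample = "Mitarbeiter, Mitarbeiterphilosophie und unsere Mitarbeiter"

print(sample.count("Mitarbeiter"))              # substring count, includes the compound word
print(count_whole_word(sample, "Mitarbeiter"))  # whole-word count only
```

The same helper could be applied to requests.get(...).text in place of the sample string.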


Answer 2

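requests.get(URL) returns the entire page source (you can view it with Ctrl+U in Google Chrome, or simply download the page with wget), not just what the web browser renders. That is why the count comes out as 2.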
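If the goal is to count only in the page's text rather than in the raw source, one possible approach is to strip the markup first. The sketch below uses only the standard library; the class TextExtractor, the helper text_of, and the sample HTML are hypothetical and not taken from the real page. Note that it only drops tags, attributes, and script/style contents. It knows nothing about CSS, so text inside an element hidden with display: none would still be counted.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def text_of(html: str) -> str:
    """Return the text content of `html` with markup removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical page: the word appears once in an attribute and once in the body.
sample_html = (
    '<html><head><style>section .detail { display: none; }</style>'
    '<meta name="keywords" content="Programmierung"></head>'
    '<body><p>Programmierung</p></body></html>'
)

print(sample_html.count("Programmierung"))           # counts in the raw source
print(text_of(sample_html).count("Programmierung"))  # counts in text nodes only
```

For a real page, text_of(requests.get(URL).text) would replace the sample string.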