Question

I am iterating over all the img tags in my request.POST content to check whether they are HTTPS (I am using Beautiful Soup to help).

Here is my code:

content = request.POST['content']
print(content) #prints:
<p>test test test</p><br><p><img src="https://www.treefrogfarm.com/store/images/source/IFE_A-K/ClarySage2.jpg" alt=""></p><br><p>2nd 2nd</p><br><p><img src="https://www.treefrogfarm.com/store/images/source/IFE_A-K/ClarySage2.jpg" alt=""></p>

soup = BeautifulSoup(content, 'html.parser')
for image in soup.find_all('img'):
    print('Source:', image.get('src')[:8]) #prints Source: https://
    if image.get('src')[:7] == "https://":
        print('HTTPS')
    else:
        print('Not HTTPS')

Even though I expect image.get('src')[:7] == "https://" to be true, the code still prints Not HTTPS.

Any idea why?

Answer 1

For starters, 'https://' is 8 characters long, so a 7-character slice can never match it.
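To make the off-by-one concrete, here is a small sketch that uses the URL from the question. The variable names (`prefix`, `url`, `seven`, `eight`) are only illustrative and are not part of the original code.

```python
prefix = "https://"
url = "https://www.treefrogfarm.com/store/images/source/IFE_A-K/ClarySage2.jpg"

prefix_length = len(prefix)  # number of characters in "https://"
seven = url[:7]              # first 7 characters: stops one short of the prefix
eight = url[:8]              # first 8 characters: the whole prefix

print(prefix_length)         # 8
print(repr(seven))           # 'https:/'
print(seven == prefix)       # False
print(eight == prefix)       # True
```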

Also, please make your question title describe the problem you actually have, rather than an irrelevant accusation against Python operators.

Answer 2

To match the https:// string, the slice needs to be :8 rather than :7

if image.get('src')[:8] == "https://":
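As an alternative to counting characters, `str.startswith` compares against the prefix directly, so there is no slice length to get wrong. The sketch below is one possible way to wrap that check; the helper name `is_https` is hypothetical and does not appear in the question.

```python
def is_https(src):
    """Return True if src is a string that starts with "https://"."""
    # Tag.get('src') in Beautiful Soup returns None when the attribute is
    # missing, so guard against non-string values before calling startswith.
    return isinstance(src, str) and src.startswith("https://")


print(is_https("https://www.treefrogfarm.com/store/images/source/IFE_A-K/ClarySage2.jpg"))  # True
print(is_https("http://example.com/a.jpg"))  # False
print(is_https(None))                        # False
```

Inside the loop from the question, the condition would then read `if is_https(image.get('src')):`.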

Finding HTTPS images with Python using BeautifulSoup
