Question

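I'm new to Python and I'm trying to get the alt text and image source from a website, but I'm running into problems with the quote characters ' and ".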

import requests,urllib,urllib2,re

rule = re.compile(r'^[^*$<,>?!\']*$')

r = requests.get('http://www.hotstar.com/channels/star-plus')
match = re.compile('<img alt="(.*?)" ng-mouseleave="mouseLeaveCard()" ng-mouseenter="mouseEnterCard()" ng-click="mouseEnterCard(true)" ng-class="{\'dull-img\': isThumbnailTitleVisible || isRegionalLanguageVisible}" class="show-card imgtag card-minheight-hc ng-scope ng-isolate-scope" placeholder-img="{\'realUrl\' :  \'(.*?)\', \'placeholderUrl\' : \'./img/placeholder/hs.jpg\'}" ng-if="record.urlPictures" src="(.*?)" style="display: block;">',re.DOTALL).findall(r.content)
for name,img,image in match:
    print(name, img, image)

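I can only use the standard Python libraries.

I read about defining a rule, so I wrote the one above based on: Regex Apostrophe how to match?

Honestly, I don't know how to use it.

Thanks in advance.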

Answer 1

Use a parser instead:

import requests
from bs4 import BeautifulSoup
r = requests.get('http://www.hotstar.com/channels/star-plus')
soup = BeautifulSoup(r.text, "lxml")
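imgs = soup.find_all('img')
for img in imgs:
    # .get() returns None instead of raising KeyError when an attribute is missing
    print(img.get("alt"), img.get("src"))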
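
Since the question says only the standard library is available, here is a rough sketch of the same parser idea using the built-in html.parser module instead of BeautifulSoup. It assumes Python 3 (in Python 2 the module is named HTMLParser). The class name ImgCollector and the sample markup are made up for illustration and are not taken from the real page.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collects (alt, src) pairs from every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; quotes are already stripped
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append((attr_map.get("alt"), attr_map.get("src")))


# Hypothetical sample markup mixing both quote styles
sample_html = """
<img alt="Show One" ng-class="{'dull-img': flag}" src="http://example.com/one.jpg">
<img alt='Show Two' src='http://example.com/two.jpg'>
"""

collector = ImgCollector()
collector.feed(sample_html)
for alt, src in collector.images:
    print(alt, src)
```

The parser deals with the quoting itself, so it does not matter whether an attribute uses ' or ". For a live page, the HTML string could come from urllib.request.urlopen(url).read() decoded to text, rather than from requests.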

Answer 2

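I took a quick look at this problem and found a few different approaches through the links below. It looks like other people have run into the same thing, so these may help. Try looking at the following pages:

Possibly similar posts:

You can also check Python's Regular Expression Documentation.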
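
If a regex really is required, one possible way around the quote problem is to capture the opening quote and reuse it as a backreference, so the same pattern works for both ' and ". The sketch below is only an illustration: the function name find_images and the sample markup are invented, and a regex like this stays fragile (for example, it breaks if an attribute value contains >). Note also that in a pattern copied straight from the HTML, characters such as (, ) and | are regex metacharacters and would need escaping, for instance with re.escape.

```python
import re

# One whole <img ...> tag (assumes no ">" inside attribute values)
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

# alt= or src=, then an opening quote of either kind (group 2),
# then the value up to the SAME quote via the backreference \2
ATTR = re.compile(r"""(?<![\w-])(alt|src)\s*=\s*(["'])(.*?)\2""",
                  re.IGNORECASE | re.DOTALL)


def find_images(html):
    """Return a list of (alt, src) tuples, one per <img> tag."""
    results = []
    for tag in IMG_TAG.findall(html):
        attrs = {name.lower(): value for name, _quote, value in ATTR.findall(tag)}
        results.append((attrs.get("alt"), attrs.get("src")))
    return results


# Hypothetical sample markup: an apostrophe inside a double-quoted value,
# and a second tag that uses single quotes throughout
sample_html = """
<img alt="Bob's Show" placeholder-img="{'realUrl' : 'x.jpg'}" src="http://example.com/bob.jpg">
<img src='http://example.com/two.jpg' alt='Show Two'>
"""

for alt, src in find_images(sample_html):
    print(alt, src)
```

Writing the pattern as a raw triple-quoted string means neither ' nor " has to be escaped inside it, which sidesteps the quoting trouble in the Python source as well.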

Python regex to find image source
