urllib2 has trouble reading attribute values that contain special characters

Time: 2017-11-30 00:15:00

Tags: python urllib2

I am reading a page that contains the following:

      <li class="[ c-product-detail__box ]  [ js-box ] "
                data-code="001024358-4"
                data-size="40"
                data-manufacturerSize="40"
                data-price="€ 64,90 "
                data-suggestedPrice="€ 110,00 "
                data-discount="-41%"
                data-discountGrade="DISCOUNT_GRADE_2"
                data-stockLevel="1">
                <a class="[ c-product-detail__box-link ]" >
                    <span>40</span>

                </a>
            </li>

I have tried several different approaches:

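import urllib2
import requests

# Fetch the page with urllib2 and decode the raw bytes as UTF-8
UrlOpen = urllib2.urlopen(url)
UrlRead = UrlOpen.read()
ReadSource = UrlRead.decode('utf-8')
print ReadSource
print UrlRead
# Fetch the same page with requests
print requests.get(url).text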

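from bs4 import BeautifulSoup  # assumed package; the original omits the import

# Parse with BeautifulSoup and select the <li> elements by their class
soup = BeautifulSoup(urllib2.urlopen(url))
tags = soup.findAll("li", { "class" : "[ c-product-detail__box ]  [ js-box ] "})
for tag in tags:  # loop details are elided in the original
    print tag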
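
One unconfirmed possibility is that the server fills in the price fields only for certain requests (for example depending on headers or cookies), or that a script fills them in inside the browser. The question establishes neither. As a rough sketch of how the first idea could be tested, the request can be sent with browser-like headers. This uses Python 3's `urllib.request` (the successor of `urllib2`); `build_request`, `fetch`, the header values and the URL are all hypothetical placeholders.

```python
import urllib.request

# Hypothetical header values; a real browser sends its own.
BROWSER_LIKE_HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.8",
}


def build_request(url):
    """Build a request carrying browser-like headers (does not send it)."""
    return urllib.request.Request(url, headers=BROWSER_LIKE_HEADERS)


def fetch(url):
    """Send the request and decode the response body as UTF-8."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read().decode("utf-8")


# Placeholder URL; the question does not give the real one.
req = build_request("http://example.com/product")
print(req.header_items())
```

If the prices are still empty with such headers, the values are probably not in the HTML the server sends at all.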

Every time, I get this:

<li class="[ c-product-detail__box ]  [ js-box ] "
                        data-code="001024358-4"
                        data-size="40"
                        data-manufacturerSize="40"
                        data-price=""
                        data-suggestedPrice=""
                        data-discount=""
                        data-discountGrade=""
                        data-stockLevel="1">
                        <a class="[ c-product-detail__box-link ]" >
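
To pin down exactly which attributes come back empty, the raw response text can be inspected without an HTML parser. The following is a minimal sketch in Python 3 (the question itself uses Python 2). `data_attributes`, `empty_data_attributes` and `fetched` are hypothetical names, and the sample string is copied from the output above rather than downloaded.

```python
import re

# Matches data-* attributes with double-quoted values. Case is preserved
# because this works on the raw text rather than on a parsed tree.
DATA_ATTR = re.compile(r'(data-[A-Za-z0-9_-]+)="([^"]*)"')


def data_attributes(html):
    """Return (name, value) pairs for every data-* attribute in html."""
    return DATA_ATTR.findall(html)


def empty_data_attributes(html):
    """Return the names of data-* attributes whose value is blank."""
    return [name for name, value in data_attributes(html) if value.strip() == ""]


# Sample copied from the output shown in the question (not fetched live).
fetched = '''<li class="[ c-product-detail__box ]  [ js-box ] "
    data-code="001024358-4"
    data-size="40"
    data-manufacturerSize="40"
    data-price=""
    data-suggestedPrice=""
    data-discount=""
    data-discountGrade=""
    data-stockLevel="1">'''

print(empty_data_attributes(fetched))
# ['data-price', 'data-suggestedPrice', 'data-discount', 'data-discountGrade']
```

If the same attributes are already blank in the raw text, the values were missing in what the server sent, so the parser is not the one dropping them.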

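Some of the values are missing, and the uppercase letters in the attribute names are converted to lowercase. Can you help me?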
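
The lowercasing can be reproduced on its own. HTML attribute names are case-insensitive, and Python's standard-library HTML parser lowercases them while leaving the values untouched. BeautifulSoup's HTML parsers appear to behave the same way, though that is not checked here. The sketch below is Python 3, with a hypothetical `AttrCollector` class and a shortened snippet based on the page from the question.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects the attributes of every <li> start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as (name, value) pairs with names already lowercased
        if tag == "li":
            self.attrs.update(attrs)


# Shortened snippet based on the page in the question.
snippet = '<li data-suggestedPrice="€ 110,00 " data-discountGrade="DISCOUNT_GRADE_2">'

collector = AttrCollector()
collector.feed(snippet)
print(collector.attrs)
# {'data-suggestedprice': '€ 110,00 ', 'data-discountgrade': 'DISCOUNT_GRADE_2'}
```

The euro sign survives parsing here, which suggests that the special character by itself does not empty the value once the text has been decoded correctly.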

0 answers:

No answers yet.