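Looping through containers in Python is not working

Date: 2018-03-24 09:31:00

Tags: python python-3.x

I wrote a web scraper based on a YouTube video. It only gives me one container out of all 48.

Why doesn't my code automatically loop through all the containers? What am I missing here?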

    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup

    my_url = 'https://www.tradera.com/search?itemStatus=Ended&q=iphone+6+-6s+64gb+-plus'

    # Open the connection and download the page
    uClient = uReq(my_url)
    page_html = uClient.read()
    uClient.close()

    #html parsing
    page_soup = soup(page_html, "html.parser")


    #Container 
    containers = page_soup.findAll("div",{"class":"item-card-details"})

    filename = "ip6.csv"
    f = open(filename, "w")

    headers = "title, link, price, bids\n"

    f.write(headers)


    for container in containers:
        title = container.div.div.h3["title"]
        link = container.div.div.h3.a["href"]

        price_container = container.findAll("span",{"class":"item-card-details-price-amount"})
        price = price_container[0].text

        bid_container = container.findAll("span",{"class":"item-card-details-bids"})
        bids = bid_container[0].text

    print("title: " + title)
    print("link: " + link)
    print("price: " + price)
    print("bids: " + bids)

    f.write(title + "," + link + "," + price + "," + bids + "\n")

    f.close

3 Answers:

Answer 0 (score: 0)

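Because the loop is effectively "empty" as far as output is concerned: the print and write calls are not indented, so they run only once, after the loop has finished, with the values from the last container. In Python you must indent the block of code that should run inside the loop, for example: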

for i in loop:
    # do something
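
To make the difference concrete, here is a minimal sketch that does not touch the network. The function name `collect` and the sample list are made up for illustration; the point is only where the statements sit relative to the `for` line.

```python
def collect(items):
    """Show which statements run per item and which run once after the loop."""
    inside = []
    outside = []
    for item in items:
        last = item.upper()
        inside.append(last)   # indented: runs once for every item
    outside.append(last)      # not indented: runs once, after the loop ends
    return inside, outside


inside, outside = collect(["a", "b", "c"])
print(inside)   # ['A', 'B', 'C']
print(outside)  # ['C']
```

The unindented statement still sees the loop variable, but only its final value, which is why the scraper printed exactly one container.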

In your code:

for container in containers:
    title = container.div.div.h3["title"]
    link = container.div.div.h3.a["href"]

    price_container = container.findAll("span",{"class":"item-card-details-price-amount"})
    price = price_container[0].text

    bid_container = container.findAll("span",{"class":"item-card-details-bids"})
    bids = bid_container[0].text

    print("title: " + title)
    print("link: " + link)
    print("price: " + price)
    print("bids: " + bids)

    f.write(title + "," + link + "," + price + "," + bids + "\n")

f.close()  # call close() once, after the loop has finished
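
As a side note, joining fields with `","` by hand breaks as soon as a title itself contains a comma. The following is a rough sketch of one alternative using the standard library `csv` module. The helper name `write_rows` and the sample rows are hypothetical and not taken from the page being scraped; an in-memory buffer stands in for the real file so the example runs anywhere.

```python
import csv
import io


def write_rows(out, rows):
    """Write a header plus one CSV line per row to any file-like object."""
    writer = csv.writer(out)
    writer.writerow(["title", "link", "price", "bids"])
    for row in rows:
        writer.writerow(row)  # inside the loop: one line per row


# Hypothetical sample data standing in for the scraped containers.
sample_rows = [
    ("iPhone 6, 64GB", "/item/1", "1 200 kr", "12"),
    ("iPhone 6 64GB silver", "/item/2", "950 kr", "7"),
]

buffer = io.StringIO()
write_rows(buffer, sample_rows)
print(buffer.getvalue())
```

With a real file, something like `with open("ip6.csv", "w", newline="") as f: write_rows(f, rows)` would also close the file automatically, so a forgotten `f.close()` cannot happen.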

Answer 1 (score: 0)

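You asked me what is going on and why I get a correct result. The script below is adjusted for Python 3.5, because it looked like something was going wrong at the print lines. I had accidentally fixed part of your script when I edited your question.

As Ilja pointed out, there were indentation errors, and he is right to mention the "empty" loop. What I missed in my accidental partial fix was moving the print statements into the for loop, so I got a single result. Looking at the web page, you want to collect all the mobile phone products.

The script below fixes all of this by including the print statements in the for loop. Your PyCharm standard output should now show many printed product blocks, and fixing the file-write line should give a similar result in the CSV file.

Python 3.5+ is a bit picky about printing (`print('title' + title)`). IMO the Python 2.x style should have been kept, since it gives more flexibility and reduces RSI through less typing. Anyway, iterating over this mobile phone page should now work like a charm.

Regarding the repr comment: no, you did not use repr at all and you do not need it. For print syntax examples, though, check the linked examples and the official Python documentation.

I also added some formatting code for the output file. It should now be in columns and readable. Enjoy!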

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup


my_url = 'https://www.tradera.com/search?itemStatus=Ended&q=iphone+6+-6s+64gb+-plus'

# Open the connection and download the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

#html parsing
page_soup = soup(page_html, "html.parser")


#Container 
containers = page_soup.findAll("div",{"class":"item-card-details"})

filename = "ip6.csv"
f = open(filename, "w")

headers = "title, price, bids, link\n"  # same order as the columns written below

f.write(headers)

l1 = 0
l2 = 0
l3 = 0

# get longest entry per item for string/column-formatting
for container in containers:
    title = container.div.div.h3["title"]
    t = len(title)
    if t > l1:
        l1 = t
    link = container.div.div.h3.a["href"]


    price_container = container.findAll("span",{"class":"item-card-details-price-amount"})
    price = price_container[0].text

    p = len(price)
    if p > l2:
        l2 = p

    bid_container = container.findAll("span",{"class":"item-card-details-bids"})
    bids = bid_container[0].text

    b = len(bids)
    if b > l3:
        l3 = b

for container in containers:
    title = container.div.div.h3["title"]

    link = container.div.div.h3.a["href"]

    price_container = container.findAll("span",{"class":"item-card-details-price-amount"})
    price = price_container[0].text

    bid_container = container.findAll("span",{"class":"item-card-details-bids"})
    bids = bid_container[0].text

    # calculate the padding between columns
    d1 = l1-len(title) + 0
    d2 = l2-len(price) + 1
    d3 = l3-len(bids)  + 1
    d4 = 2

    print("title : %s-%s %s." % (l1, d1, title))
    print("price : %s-%s %s." % (l2, d2, price))
    print("bids  : %s-%s %s." % (l3, d3, bids))
    print("link  : %s." % link)

    f.write('%s%s, %s%s, %s%s, %s%s\n' % (title, d1* ' ', d2* ' ', price, d3 * ' ', bids,  d4 * ' ', link))

f.close()  # the parentheses are required, otherwise the file is never closed explicitly
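
The two-pass idea above (first find the longest entry per column, then pad) can also be sketched with `str.ljust` instead of multiplying spaces by hand. This is only an illustration under assumptions: `format_columns` and the sample rows are invented names and data, and every column is left-aligned here, whereas the script above pads price and bids on the left.

```python
def format_columns(rows):
    """Pad every cell to the width of the longest entry in its column."""
    # First pass: longest entry per column.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = []
    # Second pass: pad each cell and join the cells with ", ".
    for row in rows:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(", ".join(cells).rstrip())
    return lines


# Hypothetical sample data standing in for title, price and bids.
sample_rows = [
    ("iPhone 6 64GB", "1 200 kr", "12 bids"),
    ("iPhone 6", "950 kr", "7 bids"),
]

for line in format_columns(sample_rows):
    print(line)
```

Padded output like this is readable for humans, but a spreadsheet import will keep the extra spaces, so it may be worth writing a plain CSV file and formatting only what is printed.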

Answer 2 (score: 0)

Thank you all for helping me solve this. It was the indentation of the print lines. You are the best!