Craigslist Web Scraper - clicking the "next page" button... only returns an empty list []

Time: 2018-03-15 19:24:30

Tags: python web-scraping craigslist

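Thanks for stopping by, I really appreciate any help. I'm trying to scrape a simple set of Craigslist listings, but this code doesn't work: it returns an empty list []. The code is below: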

Import packages

from robobrowser import RoboBrowser
import sys, codecs, locale
import pandas as pd

browser = RoboBrowser(history=True, parser='html.parser')

Define the function that scrapes data from the trip listings

def getTrips(website) :
    browser.open(website)

    trips = browser.find_all(class_='result-info')

    data = []
    for trip in trips: 
        title = get_title(trip)
        url = get_url(trip)
        data.append({
            "title": title,
            "url": url,
            "website": website
        })
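    next_page = browser.get_link('next >')
    if next_page:
        # Merge the next page's rows into this page's list; without extend()
        # the recursive call's return value would be discarded.
        data.extend(getTrips(browser._build_url(next_page.get('href'))))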
    return data

def get_title(trip):
    if trip.find(class_='result-title hdrlnk'):
        return trip.find(class_='result-title hdrlnk').text
    else:
        return "Title not found"

def get_url(trip):
    # trip is already the 'result-info' element, so look for the link inside it
    link = trip.find('a')
    if link:
        return link.get('href')
    else:
        return "URL not found"
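
For what it's worth, the pagination step in getTrips only works if each later page's results are merged into the list that gets returned. Below is a minimal sketch of that pattern, not the actual scraper: FAKE_PAGES and collect_rows are hypothetical names, and in-memory dicts stand in for the Craigslist pages, so nothing here touches RoboBrowser or the network.

```python
# Hypothetical stand-in for the site: each "page" holds its rows and
# the id of the next page (None on the last page).
FAKE_PAGES = {
    "page1": {"rows": ["trip A", "trip B"], "next": "page2"},
    "page2": {"rows": ["trip C"], "next": "page3"},
    "page3": {"rows": ["trip D"], "next": None},
}


def collect_rows(page_id):
    """Return the rows from page_id and from every page after it."""
    page = FAKE_PAGES[page_id]
    data = list(page["rows"])
    if page["next"]:
        # Merge the later pages' rows into this page's list. Calling
        # collect_rows() without using its return value would drop them.
        data.extend(collect_rows(page["next"]))
    return data


all_rows = collect_rows("page1")
print(all_rows)
```

The same idea carries over to a loop instead of recursion; the point is just that every page's rows end up in one list before it is returned.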

Build the aggregate list (* final output = a list of dictionaries named "total")

total = []
total.extend(getTrips('https://newyork.craigslist.org/search/bbb?query=photographer&sort=rel'))

Print (for testing purposes)

print(total)

Export the data to a CSV file

df = pd.DataFrame(total)
df.to_csv('photographer_data.csv', index=False)

0 answers:

No answers