I'm new to Python, Flask and BeautifulSoup. So here's the deal: I'm using BeautifulSoup to scrape some data from the web.
from bs4 import BeautifulSoup
import requests
# PageURL's configure
mainpage = 'http://www.myauto.ge/'
pageurl = 'http://www.myauto.ge/?action=search&page='
pagenum = 0
# Looping over pages. Seems wrong, but is it doing its job?
for x in range(0, 2):
    pagenum += 1
    r = requests.get(pageurl + str(pagenum))
    soup = BeautifulSoup(r.content, 'html.parser')
    for cars in soup.find_all('div', {'class': 'car-info-wrapper'}):
        cname = cars.find("div", {"class": "car-name-wrapper"}).find('a').get_text()
        cyear = cars.find("p", {"class": "cr-levy car-year"}).get_text()
        ceng = cars.find("div", {"class": "cr-det-in cr-engine"}).p.get_text()
        cengroad = cars.find("div", {"class": "cr-det-in cr-road"}).p.get_text()
        # clink = cars.find('a').get('href')
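When I print cname, cyear, ceng and cengroad, it works exactly the way I want. But now I want to do this in Flask. Instead of creating a database in sqlite3, I want it to simply scrape the data and pass it to index.html.
Here is my app.py Flask code.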
# Import
from flask import Flask, render_template
import requests
from bs4 import BeautifulSoup
app = Flask(__name__)
# mainpage = 'http://www.myauto.ge/'
pageurl = 'http://www.myauto.ge/?action=search&page='
# pagenum = 0
# Our index
@app.route('/')
@app.route('/index')
def index():
    # for x in range(0, 2):
    #     pagenum += 1
    r = requests.get(pageurl)
    soup = BeautifulSoup(r.content, 'html.parser')
    data = []
    for cars in soup.find_all('div', {'class': 'car-info-wrapper'}):
        cname = cars.find("div", {"class": "car-name-wrapper"}).find('a').get_text()
        data.append(cname)
    datayear = []
    for cars in soup.find_all('div', {'class': 'car-info-wrapper'}):
        cyear = cars.find("p", {"class": "cr-levy car-year"}).get_text()
        datayear.append(cyear)
    return render_template("index.html", data=data, datayear=datayear)


if __name__ == '__main__':
    app.run(debug=True)
Here is my index.html
{% extends "base.html" %}
{% block body %}
<table class="table">
<thead>
<tr>
<th>Car</th>
<th>Year</th>
<th>Engine</th>
<th>Road so far</th>
</tr>
</thead>
<tbody>
<tr>
<td> {{ data }} </td>
<td> {{ datayear }} </td>
</tr>
</tbody>
</table>
{% endblock %}
If I try
<tr>
{% for x in data %}
<td> {{ x }} </td>
<td> </td>
</tr>
I get what I want, but only for the car name.
So how do I do the same for the car year?
<tr>
{% for x in data %}
<td> {{ carname }} </td>
<td> {{ caryear }} </td>
</tr>
Or should I do something like this and then split the list?
data = []
for cars in soup.find_all('div', {'class': 'car-info-wrapper'}):
    cname = cars.find("div", {"class": "car-name-wrapper"}).find('a').get_text()
    cyear = cars.find("p", {"class": "cr-levy car-year"}).get_text()
    data.append(cname)
    data.append(cyear)
Or should I try it without lists and use dictionaries instead? I just don't want to use a db.
Thanks for reading.
Answer 0 (score: 0)
You're almost there, but instead of looping twice in your view code, loop just once, build a dictionary of each car's data, and append it to a list, like this:
data = []
for cars in soup.find_all('div', {'class': 'car-info-wrapper'}):
    car_info = {}  # Start with an empty dictionary for each car.
    car_info['name'] = cars.find("div", {"class": "car-name-wrapper"}).find('a').get_text()
    car_info['year'] = cars.find("p", {"class": "cr-levy car-year"}).get_text()
    car_info['engine'] = cars.find("div", {"class": "cr-det-in cr-engine"}).p.get_text()
    car_info['mileage'] = cars.find("div", {"class": "cr-det-in cr-road"}).p.get_text()
    data.append(car_info)
return render_template("index.html", data=data)
Then in your template:
{% extends "base.html" %}
{% block body %}
<table class="table">
<thead>
<tr>
<th>Car</th>
<th>Year</th>
<th>Engine</th>
<th>Road so far</th>
</tr>
</thead>
<tbody>
{% for car_info in data %}
<tr>
<td> {{ car_info['name'] }} </td>
<td> {{ car_info['year'] }} </td>
<td> {{ car_info['engine'] }} </td>
<td> {{ car_info['mileage'] }} </td>
</tr>
{% endfor %}
</tbody>
</table>
{% endblock %}
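If the view already holds the two parallel lists from the question (data and datayear), the same list-of-dictionaries shape can probably be produced without scraping again. The following is only a minimal sketch: pair_cars is a hypothetical helper name, and the sample values are made up to stand in for scraped text.

```python
def pair_cars(names, years):
    """Combine two parallel lists into a list of per-car dictionaries."""
    # zip() stops at the shorter list, so a car without a year is dropped
    # rather than raising an error.
    return [{'name': name, 'year': year} for name, year in zip(names, years)]


# Hypothetical sample values standing in for the scraped text.
data = ['BMW 320', 'Toyota Prius']
datayear = ['2008', '2012']

cars = pair_cars(data, datayear)
print(cars)
```

The resulting list can be passed to render_template as data=cars and looped over with the template above, although only the 'name' and 'year' keys would be filled in.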
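Answer 1 (score: 0)
The second approach should work, but you need to group the data together somehow.
For example, try appending them as a single tuple instead of on two separate lines.
data.append((cname, cyear))
Then in the template you can pull those values back out.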
{% for x in data %}
<tr>
<td> {{ x[0] }} </td>
<td> {{ x[1] }} </td>
</tr>
{% endfor %}
Using a dictionary or a class for the cars would be a better, more explicit approach, but this should work for this simple example.
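One possible middle ground between a plain tuple and a full class is collections.namedtuple from the standard library. This is a sketch under assumptions rather than part of the answer itself: Car and build_rows are hypothetical names, and the sample values are made up.

```python
from collections import namedtuple

# A lightweight record type: still a tuple, but with named fields.
Car = namedtuple('Car', ['name', 'year'])


def build_rows(pairs):
    """Turn (name, year) pairs into Car records."""
    return [Car(name, year) for name, year in pairs]


# Hypothetical sample values standing in for the scraped text.
rows = build_rows([('BMW 320', '2008'), ('Toyota Prius', '2012')])
first = rows[0]
print(first.name, first.year)  # access by field name
print(first[0], first[1])      # positional access still works
```

Because each record is still a tuple, the x[0] and x[1] lookups in the template keep working, and Jinja's dot notation should also allow x.name and x.year.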