我抓取了一个网页,以抓取某些信息,如价格,标题等。
现在我的目标是将信息插入数据库。我已经设置了数据库,其中包含所需的相应字段。
那是我的代码:
def trade_spider(max_pages):
Language = "Japanese"
partner = La
location = Tokyo
already_printed = set()
for reg in Region:
count = 0
count1 = 0
page = -1
while page <= max_pages:
page += 1
response = urllib.request.urlopen("http://www.jsox.de/s/search.json?q=" + str(reg) +"&page=" + str(page))
jsondata = json.loads(response.read().decode("utf-8"))
format = (jsondata['activities'])
g_data = format.strip("'<>()[]\"` ").replace('\'', '\"')
soup = BeautifulSoup(g_data)
articles = soup.find_all("article", {"class": "activity-card activity-card-horizontal "})
try:
connection = mysql.connector.connect\
(host = "localhost", user = "root", passwd ="", db = "crawl")
except:
print("No connection to Server")
sys.exit(0)
cursor = connection.cursor()
cursor.execute("DELETE from prices_crawled where Location=" + str(location) + " and Partner=" + str(partner))
connection.commit()
for article in articles:
headers = article.find_all("h3", {"class": "activity"})
for header in headers:
header_initial = header.text.strip()
if header_initial not in already_printed:
already_printed.add(header_initial)
header_final = header_initial
prices = article.find_all("span", {"class": "price"})
for price in prices:
price_end = price.text.strip().replace(",","")[2:]
count1 += 1
if count1 > count:
pass
else:
price_final = price_end
deeplinks = article.find_all("a", {"class": "activity-card"})
for t in set(t.get("href") for t in deeplinks):
deeplink_initial = t
if deeplink_initial not in already_printed:
already_printed.add(deeplink_initial)
deeplink_final = deeplink_initial
cursor.execute('''INSERT INTO prices_crawled (price_id, Header, Price, Deeplink, Partner, Location, Language) \
VALUES(%s, %s, %s, %s, %s, %s, %s)''', ['None'] + [header_final] + [price_final] + [deeplink_final] + [partner] + [location] + [Language])
connection.commit()
cursor.close()
connection.close()
trade_spider(int(Spider))
问题是信息无法进入数据库。此外,我没有收到任何错误消息。因此,我不知道我做错了什么。
你可以帮帮我吗?任何反馈都表示赞赏答案 0 :(得分:0)
删除声明是否有效? 我认为问题在于你传递变量的方式
更改语法如下:
sql_insert_tx =&#34; INSERT INTO euro_currencies(pk,货币,费率,日期)值(null,&#39; USD&#39;,&#39;%s&#39;,&#39;%s& #39)&#34; %(美国,日期)
cursor.execute(sql_insert_tx)