I have rewritten my "death row" program to add a few improvements and make it compatible with Python 3.
This is what it looks like now:
import sqlite3
import re
import urllib.request
from bs4 import BeautifulSoup
import requests
import string
URL=[]
conn = sqlite3.connect('prison.sqlite')
conn.text_factory = str
cur = conn.cursor()
cur.execute("DROP TABLE IF EXISTS prison")
cur.execute("CREATE TABLE Prison (Execution text,Statement text, LastName text, Firstname text, TDCJNumber text, Age integer, Date text, Race text, County text)")
conn.commit()
url='http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html'
lines = urllib.request.urlopen(url)
prisondata = lines.read()
lines.close()
soup = BeautifulSoup(prisondata,"html.parser")
rows = soup.find_all('tr')
url2 = url[:38]
for row in rows:
    td = row.find_all('td')
    try:
        Execution = str(td[0].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Execution) VALUES(?);", (str(Execution),))
        lastname= str(td[3].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Execution) VALUES(?);", (str(Execution),))
        firstname= str(td[4].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Firstname) VALUES(?);", (str(firstname),))
        tdcj= str(td[5].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (TDCJNumber) VALUES(?);", (str(tdcj),))
        Age= str(td[6].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Age) VALUES(?);", (str(Age),))
        Date= str(td[7].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Date) VALUES(?);", (str(Date),))
        Race= str(td[8].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Race) VALUES(?);", (str(Race),))
        County= str(td[9].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (County) VALUES(?);", (str(County),))
        links = row.find_all("a")
        link = links[1].get("href")
        LastStatementLink = url2 + link
        lines2 = urllib.request.urlopen(LastStatementLink)
        URL.append(LastStatementLink)
    except Exception:
        print ("An error has occured")
        continue
for U in URL:
    try:
        r = requests.get(U)
        r.raise_for_status()
        print ("URL OK"), U
        document = urllib.request.urlopen(U)
        html = document.read()
        soup = BeautifulSoup(html,"html.parser")
        pattern = re.compile("Last Statement:")
        Statement = soup.find(text=pattern).findNext('p').contents[0]
        print (Statement.encode("utf-8"))
        cur.execute("INSERT OR IGNORE INTO prison (Statement) VALUES(?);", (str(Statement),))
        continue
    except requests.exceptions.HTTPError as err:
        print (err)
conn.commit()
conn.close()
I have been trying for hours to get the correct data inserted into the SQLite database, but I cannot seem to get it right. For example:
for row in rows:
    td = row.find_all('td')
    try:
        Execution = str(td[0].get_text())
        cur.execute("INSERT OR IGNORE INTO prison (Execution) VALUES(?);", (str(Execution),))
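The program has to find every execution number in the table on the given website, located at index [0] of each row, and add it to the "Execution" column of the database. It has to do the same for every other piece of data.
The problem is that Python seems to create a new row every time it has to add a piece of data to the database. The program runs, but I end up with 4000 rows in my SQLite database instead of 540.
I have tried a lot of things, but I cannot get the data into the right columns or end up with the right number of rows.
I have been considering a different approach: collecting the data in lists first and then adding them to the database. That would give me 8 lists of data (one each for execution number, first name, last name, ...), but how do I add the data from the lists to SQLite?
Thanks!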
Answer 0 (score: 0)
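You wrote: "The problem is that Python seems to create a new row every time it has to add a piece of data to the database. The program runs, but I end up with 4000 rows in my SQLite database instead of 540."
Ahem, that is not Python's doing, it is yours. You have 8 INSERT statements per table row, and every INSERT creates a new database row. You should have one: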
Execution = str(td[0].get_text())
cur.execute("INSERT OR IGNORE INTO prison (Execution,Firstname, ..., Race) VALUES(?,?,?,?,?,?,?);", (str(Execution) , str(Firstname), ...))
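That should do it. Take care to count the ? placeholders: make sure every column name is listed and that you supply exactly one value for each column.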
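To make the difference concrete, here is a minimal sketch that contrasts one INSERT per column with one INSERT per table row. It makes several assumptions: it uses an in-memory database instead of prison.sqlite, the two entries in sample_rows are invented stand-ins for the text of the scraped td cells, and the helper names (insert_per_column, insert_per_row, count_rows) are hypothetical. The column names and cell positions are taken from the question.

```python
import sqlite3

# Columns filled from the table, and the cell positions the question uses for them.
COLUMNS = ("Execution", "LastName", "Firstname", "TDCJNumber",
           "Age", "Date", "Race", "County")
CELL_INDEXES = (0, 3, 4, 5, 6, 7, 8, 9)

# Assumption: an in-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE prison (Execution text, Statement text, LastName text, "
    "Firstname text, TDCJNumber text, Age integer, Date text, Race text, "
    "County text)"
)

# Invented sample data: each inner list stands for the cell texts of one <tr>.
sample_rows = [
    ["2", "info", "statement", "Doe", "John", "000001", "40",
     "01/02/2003", "race-a", "Harris"],
    ["1", "info", "statement", "Roe", "Richard", "000002", "35",
     "04/05/2002", "race-b", "Dallas"],
]


def insert_per_column(cursor, cells):
    """The pattern from the question: every column gets its own INSERT."""
    for column, index in zip(COLUMNS, CELL_INDEXES):
        cursor.execute(
            f"INSERT INTO prison ({column}) VALUES (?)", (cells[index],)
        )


def insert_per_row(cursor, cells):
    """One INSERT that names every column and passes every value."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = f"INSERT INTO prison ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    cursor.execute(sql, tuple(cells[i] for i in CELL_INDEXES))


def count_rows(cursor):
    return cursor.execute("SELECT COUNT(*) FROM prison").fetchone()[0]


for cells in sample_rows:
    insert_per_column(cur, cells)
rows_per_column = count_rows(cur)  # 8 INSERTs for each of the 2 table rows

cur.execute("DELETE FROM prison")
for cells in sample_rows:
    insert_per_row(cur, cells)
rows_per_row = count_rows(cur)  # 1 INSERT for each of the 2 table rows

stored = cur.execute(
    "SELECT Execution, LastName, County FROM prison ORDER BY Execution"
).fetchall()

print(rows_per_column, rows_per_row)  # prints: 16 2
conn.close()
```

Note that OR IGNORE only skips a row when the insert would violate a constraint such as UNIQUE or PRIMARY KEY. The table in the question declares no constraints, so nothing is ever ignored there, which is why the sketch uses a plain INSERT.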
Also, note that variable names should start with a lowercase letter.
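As for the idea in the question of collecting the data in lists first: one possible way, sketched here under assumptions, is to zip the per-column lists into one tuple per record and hand them to cursor.executemany. The table is reduced to three columns to keep the example short, and the list names and their contents are invented for illustration.

```python
import sqlite3

# Assumption: an in-memory database and a reduced three-column table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE prison (Execution text, LastName text, Firstname text)")

# Hypothetical per-column lists, one entry per scraped table row.
executions = ["3", "2", "1"]
last_names = ["Doe", "Roe", "Poe"]
first_names = ["John", "Richard", "Edgar"]

# zip() pairs up the n-th entry of every list, giving one tuple per record.
records = list(zip(executions, last_names, first_names))

# executemany runs the same INSERT once for each tuple in records.
cur.executemany(
    "INSERT INTO prison (Execution, LastName, Firstname) VALUES (?, ?, ?)",
    records,
)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM prison").fetchone()[0]
first_execution = cur.execute(
    "SELECT LastName, Firstname FROM prison WHERE Execution = ?", ("1",)
).fetchone()

print(row_count)  # prints: 3
conn.close()
```

This only works if all the lists have the same length and the same order, because zip stops at the shortest list. Building one tuple per table row while scraping avoids that risk.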