I have a Python web scraping script that runs fine as long as I don't insert any results into the database, i.e. when I comment out this code:
""" Connect to the database and insert the data """
db = MySQLdb.connect("localhost","XXX","XXX","hmm_Raw_Data")
cursor = db.cursor()
# checking phase to stop scraping
sql = """SELECT Short_link FROM RentalWanted WHERE Short_link=%s"""
rows = cursor.execute(sql,(link_result))
if rows>=1:
    duplicate_count+=1
    print duplicate_count
    # if duplicate_count>=15:
    #     print "The program has started getting duplicates now- The program is terminating"
    #     sys.exit()
else:
    query="""INSERT INTO RentalWanted
(Sale_Rent,
Type,
Area,
Nearby,
Title,
Price,
PricePerSqrFt,
Bedroom,
Agency_Fee,
Bathroom,
Size,
ZonedFor,
Freehold,
Prop_ref,
Furnished_status,
Rent_payment,
Building_info,
Amenities,
Trade_name,
Licence,
RERA_ID,
Phone_info,
Short_link)
values(
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s,
%s)"""
    cursor.execute(query,(
        Sale_Rent_result,
        Type_result,
        area_result,
        nearby_result,
        title_result,
        price_result,
        Pricepersq_result,
        bedroom_result,
        agencyfee_result,
        bathroom_result,
        size_result,
        Zoned_for_result,
        Freehold_result,
        propertyref_result,
        furnished_result,
        rent_is_paid_result,
        building_result,
        Amenities_result,
        tradename_result,
        licencenum_result,
        reraid_result,
        phone_result,
        link_result))
    db.commit()
cursor.close()
db.close()
The error I get when the above code is enabled is:
Traceback (most recent call last):
  File "RentalWanted.py", line 461, in <module>
    getting_urls_of_all_pages()
  File "RentalWanted.py", line 45, in getting_urls_of_all_pages
    every_property_in_a_page_data_extraction(a['href'])
  File "RentalWanted.py", line 365, in every_property_in_a_page_data_extraction
    rows = cursor.execute(sql,(link_result))
  File "/usr/lib/python2.6/site-packages/MySQL_python-1.2.5-py2.6-linux-x86_64.egg/MySQLdb/cursors.py", line 187, in execute
    query = query % tuple([db.literal(item) for item in args])
TypeError: not all arguments converted during string formatting
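I think there is something wrong with the query I am running.
Can anyone help me figure out which part needs fixing? I have spent a few hours on this but cannot see where I went wrong.
Thanks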
Answer 0 (score: 0)
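Do you really have 23 separate variables? It is better to put all the values into one dictionary, so that it is clearer what belongs together and you don't need that many variables. The error is that execute expects a sequence (a tuple or list) as its last argument. (link_result) is just a parenthesized string, not a tuple, and link_result
is probably a string with more than one character, so it is treated like a list with more than one element. Pass a one-element tuple, (link_result,), instead: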
result = {
    "Sale_Rent": Sale_Rent_result,
    "Type": Type_result,
    "Area": area_result,
    "Nearby": nearby_result,
    "Title": title_result,
    "Price": price_result,
    "PricePerSqrFt": Pricepersq_result,
    "Bedroom": bedroom_result,
    "Agency_Fee": agencyfee_result,
    "Bathroom": bathroom_result,
    "Size": size_result,
    "ZonedFor": Zoned_for_result,
    "Freehold": Freehold_result,
    "Prop_ref": propertyref_result,
    "Furnished_status": furnished_result,
    "Rent_payment": rent_is_paid_result,
    "Building_info": building_result,
    "Amenities": Amenities_result,
    "Trade_name": tradename_result,
    "Licence": licencenum_result,
    "RERA_ID": reraid_result,
    "Phone_info": phone_result,
    "Short_link": link_result,
}
db = MySQLdb.connect("localhost","XXX","XXX","hmm_Raw_Data")
cursor = db.cursor()
# checking phase to stop scraping
sql = """SELECT Short_link FROM RentalWanted WHERE Short_link=%s"""
# note the trailing comma: the parameters are a one-element tuple
rows = cursor.execute(sql,(result["Short_link"],))
if rows>=1:
    duplicate_count+=1
    print duplicate_count
    # if duplicate_count>=15:
    #     print "The program has started getting duplicates now- The program is terminating"
    #     sys.exit()
else:
    # column names and %s placeholders are both generated from the dictionary
    query = """INSERT INTO RentalWanted ({fields}) VALUES ({values})"""
    query = query.format(fields=','.join(result), values=','.join(['%s']*len(result)))
    cursor.execute(query, result.values())
    db.commit()
cursor.close()
db.close()
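The query-building step above can be checked on its own, without a database connection. The following is only a sketch: build_insert and the three-field sample dictionary are hypothetical names used for illustration, not part of the original script. It fixes the column order once and derives both the column list and the parameter tuple from it, so the two cannot drift apart.

```python
def build_insert(table, row):
    """Return (query, params) for a parameterized INSERT built from a dict."""
    columns = list(row)  # fix the column order once
    query = "INSERT INTO {table} ({fields}) VALUES ({values})".format(
        table=table,
        fields=", ".join(columns),
        values=", ".join(["%s"] * len(columns)),  # one placeholder per column
    )
    params = tuple(row[c] for c in columns)  # values in the same order
    return query, params


# Hypothetical sample data standing in for the scraped values
sample = {
    "Title": "2BR flat",
    "Price": "5000",
    "Short_link": "http://example.com/1",
}
query, params = build_insert("RentalWanted", sample)
print(query)
print(params)
```

The resulting pair would be passed as cursor.execute(query, params). Only the values go through placeholders; the column names are formatted into the string directly, so they should come from your own code and never from scraped input.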
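It is even better to make the column Short_link
unique and catch the error raised when you try to insert another row with the same link, instead of checking the constraint by hand: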
db = MySQLdb.connect("localhost","XXX","XXX","hmm_Raw_Data")
cursor = db.cursor()
try:
    query = """INSERT INTO RentalWanted ({fields}) VALUES ({values})"""
    query = query.format(fields=','.join(result), values=','.join(['%s']*len(result)))
    cursor.execute(query, result.values())
# the connection is opened with MySQLdb, so catch MySQLdb's exception class
except MySQLdb.IntegrityError:
    duplicate_count+=1
    print duplicate_count
else:
    db.commit()
cursor.close()
db.close()
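The pattern of letting a unique constraint reject duplicates can be tried out without a MySQL server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module instead of MySQLdb, a two-column table instead of the real 23-column one, and a hypothetical helper insert_listing. Note that sqlite3 uses ? as its placeholder where MySQLdb uses %s.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE RentalWanted (Title TEXT, Short_link TEXT UNIQUE)")


def insert_listing(db, title, link):
    """Insert one row; return False if the link is already stored."""
    try:
        db.execute(
            "INSERT INTO RentalWanted (Title, Short_link) VALUES (?, ?)",
            (title, link),
        )
    except sqlite3.IntegrityError:
        # the UNIQUE constraint on Short_link rejected the duplicate
        return False
    db.commit()
    return True


first = insert_listing(db, "Flat A", "http://example.com/1")
second = insert_listing(db, "Flat A again", "http://example.com/1")
row_count = db.execute("SELECT COUNT(*) FROM RentalWanted").fetchone()[0]
print(first, second, row_count)
db.close()
```

In MySQL, the constraint on an existing table could be added with something like ALTER TABLE RentalWanted ADD UNIQUE (Short_link), provided the table contains no duplicate links yet.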
Answer 1 (score: 0)
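Apparently there is a backward-compatibility issue in MySQL-python 1.2.5: when calling execute, it expects a tuple or list of parameters rather than a bare string.
Try this: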
rows = cursor.execute(sql,( [link_result] ))
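Why the bare string fails can be reproduced in plain Python. In the sketch below, format_like_execute is a hypothetical, simplified imitation of the line shown in the traceback (query % tuple([db.literal(item) for item in args])), with repr standing in for db.literal. It is not the library's actual code.

```python
link_result = "http://example.com/1"  # hypothetical sample value

not_a_tuple = (link_result)   # parentheses alone do not make a tuple
one_tuple = (link_result,)    # the trailing comma does

sql = "SELECT Short_link FROM RentalWanted WHERE Short_link=%s"


def format_like_execute(query, args):
    # Simplified imitation of the traceback line; repr replaces db.literal
    return query % tuple([repr(item) for item in args])


try:
    # Iterating over a string yields one item per character,
    # so a single %s receives far too many arguments.
    format_like_execute(sql, not_a_tuple)
    error = None
except TypeError as exc:
    error = str(exc)

ok = format_like_execute(sql, one_tuple)
print(type(not_a_tuple).__name__, type(one_tuple).__name__)
print(error)
print(ok)
```

Either (link_result,) or [link_result] gives execute a one-element sequence, which matches the single %s in the query.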