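I'm writing a Python script that uses python-mysql-connector to search for duplicate entries in a table of a MySQL database. I want the function to output the duplicate entries in the customer information table. I don't know how to store the duplicates and keep track of the indexes of the items in the table. Should they be stored in a list or a set?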
import mysql.connector

dbconnect = mysql.connector.connect(host='localhost', user='root', password='wordpass', db='contacts')
cur = dbconnect.cursor(buffered=True)

def find_duplicates():
    # One row per group of customers sharing name, address1, city and postal_code.
    # id is left out of the SELECT because it is not part of the GROUP BY.
    cur.execute(
        "SELECT name, address1, city, postal_code, COUNT(*) FROM customer "
        "GROUP BY name, address1, city, postal_code HAVING COUNT(*) > 1"
    )
    # fetchall() must come after execute(); it returns a list of tuples.
    return cur.fetchall()
Answer 0 (score: 1)
I think you can change the query so that it returns the complete set of duplicate rows. Something like this should work:
SELECT t.* FROM customer AS t
INNER JOIN (
SELECT name, address1, city, postal_code
FROM customer GROUP BY name, address1, city, postal_code
HAVING COUNT(*) > 1) AS td
ON t.name = td.name AND t.address1 = td.address1
AND t.city = td.city AND t.postal_code = td.postal_code;
Once you have all the dupes along with their IDs, you should be able to group them easily in Python, I think.
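On the list-versus-set part of the question, one option is a dict keyed by the duplicated fields, with a list of ids per key. Here is a minimal sketch, assuming each row comes back as an (id, name, address1, city, postal_code) tuple; the `group_duplicates` helper and the sample rows are hypothetical and only stand in for what `cur.fetchall()` would return.

```python
from collections import defaultdict

def group_duplicates(rows):
    """Map each duplicated (name, address1, city, postal_code) key to its ids.

    Assumes every row is a tuple laid out as
    (id, name, address1, city, postal_code); adjust the unpacking otherwise.
    """
    groups = defaultdict(list)
    for row_id, name, address1, city, postal_code in rows:
        groups[(name, address1, city, postal_code)].append(row_id)
    # Keep only keys seen more than once, so unfiltered rows also work.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Made-up sample rows standing in for cur.fetchall().
sample_rows = [
    (1, "Ann Lee", "1 Main St", "Springfield", "12345"),
    (4, "Ann Lee", "1 Main St", "Springfield", "12345"),
    (7, "Bob Ray", "9 Oak Ave", "Shelbyville", "54321"),
]
duplicates = group_duplicates(sample_rows)
print(duplicates)
```

A list per key keeps every id and preserves the order the rows arrived in; a set would only help if the same id could show up more than once in the result.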