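OK, so I'm comparing a number of SQL queries between a source and a target database, and I need to store the differences for each query on its own worksheet. That is the part I'm struggling with. I'm using the following method: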
def running_queries_from_list(self):
    source_getting_list = self.create_list(1)
    getting_list = self.create_list(2)
    source_db = self.connection_String(1)
    target_db = self.connection_String(2)
    for i in range(0, len(self.queries), 2):
        source_results = pd.read_sql(self.queries[i], source_db)
        target_results = pd.read_sql(self.queries[i], target_db)
        if source_results.equals(target_results):
            print("Results are same on " + "' " + self.queries[i] + "'" + " Query")
        else:
            global difference
            difference = pd.concat([source_results, target_results]).drop_duplicates(keep=False)
            print("Data is different at " + "' " + self.queries[i] + "'" + " Query")
            difference_Sheet = os.path.abspath(os.getcwd() + '/Docs/Difference_sheet.xlsx')
            writer = pd.ExcelWriter(difference_Sheet)
            for d in difference:
                difference.to_excel(writer, sheet_name=str(difference(d)+1))
            writer.save()
            print(difference)
    if not difference.empty:
        raise AssertionError("Data is Different on Multiple Tables")
My DataFrame output looks like this:
Data is different at ' SELECT * FROM country' Query
country_id ... last_update
0 1 ... 2006-02-15 04:44:00
1 2 ... 2006-02-15 04:44:00
2 3 ... 2006-02-15 04:44:00
3 4 ... 2006-02-15 04:44:00
4 5 ... 2006-02-15 04:44:00
14 15 ... 2006-02-15 04:44:00
15 16 ... 2006-02-15 04:44:00
16 17 ... 2006-02-15 04:44:00
17 18 ... 2006-02-15 04:44:00
18 19 ... 2006-02-15 04:44:00
19 20 ... 2006-02-15 04:44:00
20 21 ... 2006-02-15 04:44:00
21 22 ... 2006-02-15 04:44:00
22 23 ... 2006-02-15 04:44:00
23 24 ... 2006-02-15 04:44:00
24 25 ... 2006-02-15 04:44:00
25 26 ... 2006-02-15 04:44:00
26 27 ... 2006-02-15 04:44:00
27 28 ... 2006-02-15 04:44:00
28 29 ... 2006-02-15 04:44:00
29 30 ... 2006-02-15 04:44:00
.. ... ... ...
79 80 ... 2006-02-14 23:44:00
80 81 ... 2006-02-14 23:44:00
81 82 ... 2006-02-14 23:44:00
104 105 ... 2006-02-14 23:44:00
105 106 ... 2006-02-14 23:44:00
106 107 ... 2006-02-14 23:44:00
107 108 ... 2006-02-14 23:44:00
108 109 ... 2006-02-14 23:44:00
[218 rows x 3 columns]
Data is different at ' SELECT * FROM film' Query
film_id ... last_update
0 1 ... 2006-02-15 05:03:42
1 2 ... 2006-02-15 05:03:42
2 3 ... 2006-02-15 05:03:42
3 4 ... 2006-02-15 05:03:42
4 5 ... 2006-02-15 05:03:42
5 6 ... 2006-02-15 05:03:42
6 7 ... 2006-02-15 05:03:42
7 8 ... 2006-02-15 05:03:42
8 9 ... 2006-02-15 05:03:42
9 10 ... 2006-02-15 05:03:42
10 11 ... 2006-02-15 05:03:42
27 28 ... 2006-02-15 05:03:42
28 29 ... 2006-02-15 05:03:42
29 30 ... 2006-02-15 05:03:42
.. ... ... ...
970 971 ... 2006-02-15 00:03:42
998 999 ... 2006-02-15 00:03:42
999 1000 ... 2006-02-15 00:03:42
[2000 rows x 13 columns]
Data is different at ' SELECT * FROM film_category' Query
Now I want to write the differences for each query to a separate sheet of the Excel workbook.
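For what it's worth, here is a minimal sketch of one way this might be done. It assumes the differences are first collected in a dict keyed by query text, and that a single `pd.ExcelWriter` is opened once for the whole workbook rather than once per query. The helper names `diff_frames`, `make_sheet_name` and `write_differences` are hypothetical and not part of the method above. Note that `writer.save()` was removed in pandas 2.0, so the sketch relies on the context manager to save the file instead.

```python
import re

import pandas as pd


def diff_frames(source: pd.DataFrame, target: pd.DataFrame) -> pd.DataFrame:
    """Rows present in only one frame (same concat + drop_duplicates idea as above)."""
    return pd.concat([source, target]).drop_duplicates(keep=False)


def make_sheet_name(index: int, query: str) -> str:
    """Build a worksheet name from a running index and the query text."""
    # Excel rejects the characters [ ] : * ? / \ in sheet names
    # and limits names to 31 characters.
    cleaned = re.sub(r"[\[\]:*?/\\]", "_", query)
    return f"{index}_{cleaned}"[:31]


def write_differences(differences: dict, path: str) -> None:
    """Write every difference frame to its own sheet of one workbook."""
    if not differences:
        # A workbook needs at least one sheet, so skip writing entirely.
        return
    # Open the writer once, outside the loop; leaving the with-block saves the file.
    with pd.ExcelWriter(path) as writer:
        for index, (query, frame) in enumerate(differences.items(), start=1):
            frame.to_excel(writer, sheet_name=make_sheet_name(index, query), index=False)
```

Inside the loop, the `else` branch could then just do `differences[self.queries[i]] = diff_frames(source_results, target_results)` on a dict created before the loop, and `write_differences(differences, difference_Sheet)` would be called once after the loop. The final check would become `if differences: raise AssertionError(...)`, which also avoids the `global difference` variable.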