Question

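Hi, I have written a program that scrapes URLs from a page by extracting the "href" text content and appending it to the base URL. The URLs are then written to cells in a Google Sheet via gspread.

The problem I'm having is that every time I run the program, it starts again from cell 1. So I want to find the topmost empty cell and run the program from there.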

import time

# gsr, soup and base_url are defined earlier in the program
entire_wks = gsr.fetchEnitreSheet()

numrows = len(entire_wks.col_values(1))

for x in range(1, numrows + 1):
    print(x)
    chem = entire_wks.cell(x, 1).value  # text to look for, taken from column 1
    for item in soup.find_all('a'):
        if chem in str(item):
            url = base_url + item.get('href')  # pulls the href from the web page
            print("updating cell, row=", x, "with url=", url)
            entire_wks.update_cell(x, 2, url)
            time.sleep(1)  # just to stop the Sheets API getting bombarded with too frequent requests

So I think I need something like this:

numrows = len(entire_wks.col_values(1))
last_cell = entire_wks.col(1).get_highest_row()  ### I MADE THIS UP ###

for x in range(last_cell, numrows + 1):
    # then the rest of the code to insert the new URLs into the blank cells
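
For illustration only, here is a minimal sketch of one way the idea above might be done. It relies on `col_values()` and `update_cell()`, which the code above already uses. The helper names `rows_missing_url` and `fill_missing_urls`, the `links` list and the `delay` parameter are hypothetical and not part of gspread. Instead of looking for a single "highest empty row", it compares column 1 with column 2 and visits only the rows that have no URL yet, so it also copes with gaps in the middle of the sheet.

```python
import time


def rows_missing_url(names, urls):
    """Return 1-based row numbers that have a value in column 1 but none in column 2."""
    missing = []
    for index, name in enumerate(names):
        # The URL list may be shorter than the name list (for example if
        # trailing empty cells are not returned), so treat a missing entry as empty.
        url = urls[index] if index < len(urls) else ""
        if name and not url:
            missing.append(index + 1)  # sheet rows are 1-based
    return missing


def fill_missing_urls(wks, links, base_url, delay=1.0):
    """Write a URL into column 2 for every row that does not have one yet.

    `wks` is assumed to be a gspread worksheet (anything with col_values and
    update_cell works). `links` is assumed to be a list of (link_text, href)
    pairs, e.g. [(str(a), a.get('href')) for a in soup.find_all('a')].
    """
    names = wks.col_values(1)
    urls = wks.col_values(2)
    for row in rows_missing_url(names, urls):
        chem = names[row - 1]
        for text, href in links:
            if href and chem in text:
                wks.update_cell(row, 2, base_url + href)
                time.sleep(delay)  # avoid sending requests too frequently
                break  # stop at the first matching link for this row
```

With the variables from the program above, a call might look like `fill_missing_urls(entire_wks, [(str(a), a.get('href')) for a in soup.find_all('a')], base_url)`. Reading both columns once up front also avoids a separate `cell()` request for every row.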

[Screenshot of the Google Spreadsheet]

Can anyone enlighten me on how to solve this?

Attempted via gspread

0 answers: