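Goal: I want to scrape the table on each HTML page specified by each link stored in my CSV file, and then save/print the tables' data to a CSV file. However, I have two problems with my code.
I want to add columns so that each page's table sits to the right of the previous page's table rather than below it. For example, I want: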
column1 column2 column3 column4
row1 xxpage1 valuexx xxpage2 valuexx
row2 xypage1 valuexy xypage2 valuexy
What I am getting:
column1
row1 xxpage1 valuexx
xypage1 valuexy
row2 xxpage2 valuexx
xypage2 valuexy
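For reference, the two layouts above correspond to concatenating along different axes in pandas. The following is only a minimal sketch, assuming each page's table is already available as a DataFrame; `page1` and `page2` are made-up stand-ins, not data from the real site.

```python
import pandas as pd

# Hypothetical stand-ins for the tables scraped from two pages.
page1 = pd.DataFrame({"name": ["xx", "xy"], "value": ["valuexx", "valuexy"]})
page2 = pd.DataFrame({"name": ["xx", "xy"], "value": ["valuexx", "valuexy"]})

# axis=0 (the default) stacks the second table below the first.
stacked = pd.concat([page1, page2], axis=0, ignore_index=True)

# axis=1 places the second table to the right of the first.
# `keys` labels each block of columns with the page it came from.
side_by_side = pd.concat([page1, page2], axis=1, keys=["page1", "page2"])

print(stacked.shape)
print(side_by_side.shape)
```

With `axis=1`, rows are aligned on the index, so this assumes every page's table has the same row order.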
Also, what if I want to transpose? When I call df.T, df.transpose(), or numpy.transpose, I get an error saying that the 'list' type cannot be converted.
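One likely cause of the transpose error is that `pd.read_html` returns a list of DataFrames rather than a single DataFrame, so the object being transposed is a plain Python list. The sketch below illustrates this with a hand-built list standing in for the `read_html` result (an assumption, so that it runs without a browser or an HTML parser).

```python
import pandas as pd

# Stand-in for the return value of pd.read_html(html): a list of DataFrames.
tables = [pd.DataFrame({"name": ["xx", "xy"], "value": ["valuexx", "valuexy"]})]

# A Python list has no .T attribute, so it cannot be transposed directly.
list_has_T = hasattr(tables, "T")

# Picking one DataFrame out of the list makes .T (or .transpose()) available.
df = tables[0]
transposed = df.T

print(list_has_T)
print(transposed)
```

Here the original column names become the row index of `transposed`.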
Here is my code:
import csv
from time import sleep

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

listofrows = []
df_links = pd.read_csv("links.csv")
links = df_links['#/itemDetail?itemId=BWKHURACAN40&uom=EA']
print(links)
numoflinks = len(links) + 1
print(numoflinks)
for i in range(0, 5):
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--headless')
    # Disable image loading to speed up page loads
    prefs = {'profile.managed_default_content_settings.images': 2}
    chrome_options.add_experimental_option("prefs", prefs)
    driver = webdriver.Chrome(chrome_options=chrome_options)
    print(i)
    url = "http://biggestbook.com/ui/catalog.html" + links[i]
    driver.get(url)
    # Wait for the expand icons, click the second one, then wait for table cells
    expandSigns = WebDriverWait(driver, 30).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".glyphicon-plus")))
    expandSigns[1].click()
    WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "td")))
    table = driver.find_element_by_css_selector('table')
    html = table.get_attribute('outerHTML')
    print(html)
    df = pd.read_html(html)  # note: this is a list of DataFrames
    listofrows.append(df)
    df[0].to_csv("output.csv")
    print(listofrows)
    for rows in listofrows:
        with open('listofData.csv', 'w') as listofData:
            for rows in listofrows:
                rowlistwriter = csv.writer(listofData)
                rowlistwriter.writerow(rows)
    driver.quit()
    sleep(5)
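The loop above reopens listofData.csv in 'w' mode on every pass and hands whole DataFrame lists to csv.writer. One alternative, sketched here under assumptions, is to collect a single DataFrame per page and write the combined result once after the loop. The helper `save_side_by_side` and the sample frames are hypothetical, and the scraping step is left out so the snippet runs on its own.

```python
import io

import pandas as pd


def save_side_by_side(frames, target):
    """Join per-page tables column-wise and write them out in one call."""
    keys = [f"page{n}" for n in range(1, len(frames) + 1)]
    combined = pd.concat(frames, axis=1, keys=keys)
    combined.to_csv(target)
    return combined


# Hypothetical stand-ins for the df[0] collected on each loop iteration.
frames = [
    pd.DataFrame({"name": ["xx", "xy"], "value": ["a1", "b1"]}),
    pd.DataFrame({"name": ["xx", "xy"], "value": ["a2", "b2"]}),
]

buffer = io.StringIO()  # a file path such as "listofData.csv" also works
combined = save_side_by_side(frames, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

In the scraping loop this would mean appending `df[0]` (not `df`) to the list and calling the helper once after the loop ends.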
Please help me.