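I'm new to Python scripting and have run into a small problem. How can I iterate over the rows so that the search keyword equals each name in column 1, and how can I write the output back to the same Excel sheet?
Thanks.
Excel sheet: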
Col 1
Name1
Name2
Name3
I couldn't get the old code to work, so here is the new code.
New url_scraper.py script:
import requests
from bs4 import BeautifulSoup
import xlrd
import xlwt
import pandas as pd
import xlsxwriter
book = xlrd.open_workbook("test.xlsx")
sh = book.sheet_by_index(0)
aa = sh.cell_value(rowx=0, colx=0)
df5 = pd.read_excel("test.xlsx")
writer = pd.ExcelWriter('test1.xlsx', engine='xlsxwriter')
df5.to_excel(writer, sheet_name='Sheet1', index=False, startcol=0)
print (df5)
#df = pd.read_excel("test.xlsx")
df3=df5['aa'] = "http://www.example.com/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords="+ df5.aa.astype(str)
df3.to_excel(writer, sheet_name='Sheet1', index=False, startcol=2, header=False, startrow=1)
print (df3)
book = xlrd.open_workbook("test1.xlsx")
sh = book.sheet_by_index(0)
row = 1
col = 2
aa1 = sh.cell_value(rowx=row, colx=col)
row += 1
url = aa1
response = requests.get(url)
page = str(BeautifulSoup(response.content))
start_quote = page.find("http://ecx.")
end_quote = page.find(".jpg", start_quote + 1)
url1 = page[start_quote + 0: end_quote + 4]
print (url1)
ds = pd.Series(data = url1)
df = pd.DataFrame(data = ds)
df.to_excel(writer, sheet_name='Sheet1', index=False, startcol=1, header=False, startrow=1)
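The new code produces the output I need for the first row, but I can't get it to loop over the remaining rows.
Output of the new code: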
col1 col2
name1 URL beginning with "http://ecx" for name1
name2 (no URL printed here)
name3 (no URL printed here)
Please help.
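
For reference, here is a minimal sketch of what the loop being asked about might look like. It is one possible structure under stated assumptions, not a definitive fix. The helper names (`extract_image_url`, `image_urls_for`, `fetch_with_requests`, `process_workbook`) and the `image_url` column are hypothetical and do not appear in the script above. It also assumes the names sit in the first column under a header row, that `openpyxl` is installed so pandas can read and write .xlsx files, and it URL-encodes each name with `quote_plus`, which the original script does not do.

```python
from urllib.parse import quote_plus

# Search URL prefix taken from the script above.
SEARCH_URL = ("http://www.example.com/s/ref=nb_sb_noss"
              "?url=search-alias%3Daps&field-keywords=")


def extract_image_url(page):
    """Return the first 'http://ecx. ... .jpg' substring, or '' if absent."""
    start = page.find("http://ecx.")
    if start == -1:
        return ""
    end = page.find(".jpg", start)
    if end == -1:
        return ""
    return page[start:end + len(".jpg")]


def image_urls_for(names, fetch):
    """Build one search URL per name and extract one image URL per page.

    `fetch` is any callable that maps a URL to the page text.
    """
    results = []
    for name in names:
        page = fetch(SEARCH_URL + quote_plus(str(name)))
        results.append(extract_image_url(page))
    return results


def fetch_with_requests(url):
    """Download a page with requests (hypothetical helper)."""
    import requests  # imported lazily so the helpers above work without it
    return requests.get(url, timeout=10).text


def process_workbook(in_path, out_path, fetch=fetch_with_requests):
    """Read names from the first column and write them back with URLs."""
    import pandas as pd  # assumes openpyxl is available for .xlsx files
    df = pd.read_excel(in_path)
    names = df.iloc[:, 0]  # first column, whatever its header is
    df["image_url"] = image_urls_for(names, fetch)  # assumed column name
    df.to_excel(out_path, index=False)
```

Calling `process_workbook("test.xlsx", "test1.xlsx")` would process every row. Passing the same path for both arguments overwrites the original workbook, which is the simplest way to land the results in the "same" sheet, although pandas rewrites the whole file and any existing cell formatting is lost. Taking `fetch` as a parameter keeps the loop testable without network access.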