Question

I have a list of web pages named html.

From each html(i) element I extracted the email addresses and put them in a list named emails.

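I want to generate an Excel file that records all the email addresses I found.

Since each html(i) page may contain a different number of email addresses, I would like to write code that automatically handles a different number of emails per page.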

My idea is something like this:

 import re
 import urllib
 from xlwt import Workbook

 # set the standard url to generate the full list of urls to be analyzed
 url = ["url1", "url2", "url3", "url-n"]

 # get all the url pages' html codes
 html = [urllib.urlopen(url[i]).read() for i in range(0, len(url))]

 # find all the emails in each html page
 for i in range(0, len(url)):
     emails = re.findall(r'[\w\.-]+@[\w\.-]+', html[i])

 # create an excel file
 wb = Workbook()
 sheet1 = wb.add_sheet("Sheet 1")

 # set the excel file
 for i in range(0, len(html)):
     for j in range(0, len(emails)):
         sheet1.write(i, j, emails[j])

 wb.save('emails contact2.xls')

Of course it doesn't work. It only writes the email addresses found in the last element of the html list. Any suggestions?

Answer 1

I don't know xlwt, but given that each html has its own list of emails, would something like this work?

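 import re
 import xlwt

 wb = xlwt.Workbook()
 sheet1 = wb.add_sheet("Sheet 1")

 # url and html are the lists from the question
 for html_index, (address, page) in enumerate(zip(url, html)):
     sheet1.write(html_index, 0, address)  # page address in the first column
     # extract the emails of this page only, inside the loop
     emails_for_html = re.findall(r'[\w\.-]+@[\w\.-]+', page)
     for email_index, email in enumerate(emails_for_html):
         sheet1.write(html_index, email_index + 1, email)

 wb.save('email contacts.xls')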

Note that I don't know the xlwt-specific commands; I just tried to mimic yours.
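The reason only the last page shows up is that emails is reassigned on every pass of the extraction loop, so the earlier results are gone by the time the sheet is written. A minimal sketch of keeping one list per page instead, assuming Python 3 and using made-up page contents in place of the downloaded html strings (the names EMAIL_PATTERN, emails_by_page and pages are illustrative, not from the question):

```python
import re

# Same pattern as in the question; it is deliberately loose and can over-match.
EMAIL_PATTERN = re.compile(r'[\w\.-]+@[\w\.-]+')


def emails_by_page(pages):
    """Return one list of addresses per page, in the same order as pages."""
    return [EMAIL_PATTERN.findall(page) for page in pages]


# Hypothetical page contents standing in for the downloaded html strings.
pages = [
    "write to alice@example.com or bob@example.org today",
    "no address on this page",
    "<a href='mailto:carol@example.net'>mail</a>",
]

found = emails_by_page(pages)
for row, emails in enumerate(found):
    print(row, emails)
```

Each inner list can have a different length, including zero, which is what lets the later write loop use a different number of columns per row.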

Answer 2


Assuming you extract the list for each html separately:

 import xlwt

 wb = xlwt.Workbook()
 sheet1 = wb.add_sheet("Sheet 1")
 htmls = generate_htmls()  # Imaginary function to pretend it's initialized.
 for i in xrange(len(htmls)):
     sheet1.write(i, 0, htmls[i])
     emails = extract_emails(htmls[i])  # Imaginary function to pretend it's extracted
     for j in xrange(len(emails)):
         sheet1.write(i, j + 1, emails[j])

This code puts the html in the first (index 0) column and then puts all of its emails in the following columns, so that the first column is not overwritten.
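The row and column layout described here can be checked without creating a spreadsheet. The sketch below assumes Python 3 and uses a hypothetical RecordingSheet class that only remembers what was written; the function write_email_rows and the sample data are likewise made up for illustration. The thread's sheet1.write(row, col, value) calls have the same shape, so a sheet returned by add_sheet could presumably be passed in its place.

```python
class RecordingSheet:
    """Hypothetical stand-in for a worksheet: remembers every cell written."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def write_email_rows(sheet, urls, emails_per_page):
    """Write one row per page: the url in column 0, then its emails."""
    for row, (url, emails) in enumerate(zip(urls, emails_per_page)):
        sheet.write(row, 0, url)
        for offset, email in enumerate(emails):
            # offset + 1 keeps column 0 reserved for the url
            sheet.write(row, offset + 1, email)


# Made-up sample data: three pages with 2, 0 and 1 addresses.
urls = ["url1", "url2", "url3"]
emails_per_page = [["a@example.com", "b@example.com"], [], ["c@example.com"]]

sheet = RecordingSheet()
write_email_rows(sheet, urls, emails_per_page)
```

Passing the sheet in as a parameter keeps the layout logic separate from the library that produces the file, so the same loop works for rows of any length.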

Writing recursive data from a Python list into Excel
