Using Python

Time: 2016-12-24 19:23:33

Tags: python excel csv beautifulsoup python-requests

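I am trying to pull data from a table at a URL, using a column of different numbers. I created a list of the numbers, then append each number to the end of the URL, pull the data from each unique link, and put that data into a list. I then write the list to an Excel (CSV) file, but when I do, all of the data is written into a single row, whereas I need one row for each unique link.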

import xlrd
from bs4 import BeautifulSoup
import requests
import csv
import urllib

sheet = xlrd.open_workbook('/Users/stevenschwab/Downloads/2016 Preliminary Assessments.xlsx')
sh = sheet.sheet_by_index(0)
numbers = sh.col_values(0)
data = []

for i in range(3,len(numbers)):
    data.append(int(numbers[i]))

for j in range(0,5):
    print(data[j])


for key in data:
    url = 'http://algonquin.northwoodsoft.com/display/PropertySearch.asp?cmd=DisplayDetails&ky= + key +'
    response = requests.get(url)
    html = response.content
    soup = BeautifulSoup(html,"html5lib")
    table = soup.find('center', attrs={'xmlns:dt':'urn:schemas-microsoft-com:datatypes'})
    rows = []

    for row in table.findAll('tr')[1:]:
        cells = []

        for cell in row.findAll('td'):
            cells.append(cell.text)
    rows.append(cells)

outfile = open('./property.csv', 'w')
writer = csv.writer(outfile)
writer.writerows([rows])

1 Answer:

Answer 0: (score: 0)

You are clearing the rows variable on every iteration of the loop. Move rows = [] outside your loop.
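
A minimal sketch of that fix, under some assumptions: `build_url`, `collect_rows`, `write_rows`, and `fake_fetch` are hypothetical names introduced here for illustration, and `fake_fetch` stands in for the requests + BeautifulSoup step so the example runs without network access. Besides moving `rows = []` out of the loop, the sketch also appends the key outside the string literal and passes `rows` (not `[rows]`) to `writerows`, which appear to be two further reasons the output would end up on a single line.

```python
import csv
import io

BASE_URL = ("http://algonquin.northwoodsoft.com/display/"
            "PropertySearch.asp?cmd=DisplayDetails&ky=")


def build_url(key):
    # Concatenate outside the string literal so the key is really appended.
    return BASE_URL + str(key)


def collect_rows(keys, fetch_cells):
    """Return one list of cell texts per key.

    fetch_cells is a hypothetical callable standing in for the
    requests + BeautifulSoup step: it takes a URL and returns a
    flat list of cell strings for that page.
    """
    rows = []  # created once, before the loop, so earlier results survive
    for key in keys:
        rows.append(fetch_cells(build_url(key)))
    return rows


def write_rows(rows, outfile):
    # writerows(rows) writes one CSV line per inner list;
    # writerows([rows]) would wrap everything into a single line.
    csv.writer(outfile).writerows(rows)


def fake_fetch(url):
    # Stand-in for the real page fetch: echoes the key and a dummy cell.
    key = url.rsplit("=", 1)[1]
    return [key, "cell for " + key]


buffer = io.StringIO()
write_rows(collect_rows([101, 102], fake_fetch), buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, the same `write_rows` call can be given a file opened with `open('./property.csv', 'w', newline='')` inside a `with` block (Python 3), so the file is closed and no blank lines appear between rows on Windows.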