Python 2.7
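I am trying to scrape the company names from this page and save them to a CSV file.
The first part of my code works, but there is whitespace between each returned item (company name).
I also ran into a problem when writing the results to a CSV file. I suspect the whitespace is the cause, because the error says the data is not iterable.
Could someone help me fix the code? Many thanks!
My code (part one):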
import urllib2

response = urllib2.urlopen('http://app.core-apps.com/weftec2014/exhibitors/list/A')
page = response.read()
page = page[4632:]

def get_next_target(page):
    start_link = page.find("<a href='/weftec2014/exhibitors/")
    if start_link == -1:
        return None, 0
    else:
        start_place = start_link+73  # to get company names after the first <div>
        end_place = page.find("</div>", start_place)
        item = page[start_place:end_place]
        return item, end_place

def print_all_com(page):  # return company names
    while True:
        item, end_place = get_next_target(page)
        if item:
            print item
            page = page[end_place:]
        else:
            break

data = print_all_com(page)
Part two (CSV writer):
import csv

with open('weftec_list.csv','w') as f:
    writer = csv.writer(f)
    writer.writerows(data)
Error message:
Traceback (most recent call last):
  File "/Users/yumiyang/Documents/MCComponenet_crawler.py", line 32, in <module>
    writer.writerows(data)
TypeError: writerows() argument must be iterable
Answer 0 (score: 2)
I can't test it, but it should probably be:
def print_all_com(page):  # return company names
    results = []
    while True:
        item, end_place = get_next_target(page)
        if item:
            results.append( [ item.strip() ] )
            #print item
            page = page[end_place:]
        else:
            break
    return results
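
For context, here is a minimal sketch of why this change should help. The original print_all_com only prints and never returns anything, so data ends up as None, which is what writerows() rejects. writerows() expects an iterable of rows, and each row is itself a sequence, so every name is wrapped in a one-element list. The to_rows helper and the sample names below are hypothetical stand-ins for the scraped values, and the file is written to a temporary directory rather than the real output path.

```python
import csv
import os
import tempfile


def to_rows(names):
    """Strip each name and wrap it in a list so csv writes one name per row.

    Hypothetical helper for illustration; it is not part of the original code.
    """
    rows = []
    for name in names:
        cleaned = name.strip()
        if cleaned:  # skip entries that are only whitespace
            rows.append([cleaned])
    return rows


# Made-up sample values standing in for the scraped company names.
sample = ["  Acme Water  ", "\n Beta Pumps\n", "   "]
rows = to_rows(sample)

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "weftec_list.csv")
with open(path, "w") as f:
    writer = csv.writer(f)
    writer.writerows(rows)  # works because rows is a list of lists
```

Passing a plain list of strings instead would also run, but csv would treat each string as a row and split it into one column per character, which is why each name is wrapped in its own list. On Python 2.7 under Windows, opening the file with 'wb' avoids blank lines between rows; on Python 3 the equivalent is open(path, 'w', newline='').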