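I have two problems, but I think solving one will solve the other. My goal is to visit several different web pages and find the line containing the node name, and I have managed to write a for loop for this that works fine. The only problem is that each time the for loop runs again, it removes the last node name entry from the list and puts the new one in its place, so only one node name is ever left in the list.
Full code related to the problem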
import csv
import re

from bs4 import BeautifulSoup

# s2 is a session object created earlier (not shown here)
webstringy = "mycompanysite.com/?NodeID="
webpage = "mycompanysite.com/?NetworkID=36"
r2 = s2.get(webpage)
bsobjswap = BeautifulSoup(r2.content)
gotopagenums = [re.findall(r"\d+", i.get('onclick')) for i in bsobjswap.findAll('tr', attrs={'onclick': True})]
#link = (len(gotopagenums))
print(gotopagenums)
results = open("niki2.csv", 'w', newline='')
wr2 = csv.writer(results, dialect='excel')
for i in gotopagenums:
    wr2.writerows([i])
for nodeno in gotopagenums:
    nodenojoin = "".join(nodeno)
    weblink = [webstringy+nodenojoin]
    for weblnky in weblink:
        r2 = s2.get(weblnky)
        bsobjswap2 = BeautifulSoup(r2.content)
        nodename = [(bsobjswap2.h1.span)]
        test = [nodename]
        test3 = '\n'.join(str(e) for e in test)
        #if test3.startswith("[<span"):
        #    if test3.endswith("</span>]"):
        test4 = (test3[72:])
        test5 = (test4[:-9])
        test5 = [test5]
        print(test5)
resultfile = open("niki.csv", 'w')
wr = csv.writer(resultfile, delimiter=',', dialect='excel')
for i in test5:
    wr.writerows([i])
    wr.writerows('\n')
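When I run this, the first CSV file (niki2.csv) comes out fine. I assume this is because all of the entries are in a single list (each list entry is written to its own row in the CSV, as I want).
Problem code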
for weblnky in weblink:
    r2 = s2.get(weblnky)
    bsobjswap2 = BeautifulSoup(r2.content)
    nodename = [(bsobjswap2.h1.span)]
    test = [nodename]
    test3 = '\n'.join(str(e) for e in test)
    #if test3.startswith("[<span"):
    #    if test3.endswith("</span>]"):
    test4 = (test3[72:])
    test5 = (test4[:-9])
    test5 = [test5]
    print(test5)
resultfile = open("niki.csv", 'w')
wr = csv.writer(resultfile, delimiter=',', dialect='excel')
for i in test5:
    wr.writerows([i])
    wr.writerows('\n')
I believe this is the part of the code that causes my problem. When I print the test5 list inside the for loop, I get
FOR LOOP OUTPUT
['GG Alperton']
['GG Angel']
['GG Ashford']
['GG Barking']
['GG Bedford']
['GG Birmingham']
['GG Bolton']
['GG Bothwell Street']
['GG Bournemouth']
['GG Bracknell']
['GG Brighton London road']
['GG Brighton Madeira']
['GG Bristol']
['GG Cardiff']
['GG Chadwell Heath']
['GG Charing Cross']
['GG Chelmsford']
['GG Colchester']
['GG Crawley']
['GG Croydon']
['GG Dartford']
['GG Derby']
['GG Ealing']
['GG East Croydon']
['GG Eastbourne']
When I print test5 outside the loop, I get
['GG Eastbourne']
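This is the last entry, so when I try to write the CSV it contains only this entry.
Could someone please tell me how to put all of the entries above into a single list, so that I can write them to the .csv correctly?
I have tried append, map, join, and more and more for loops, and I cannot figure it out.
Output from GAURAV DHAMA's code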
[['GG Alperton']]
[['GG Angel']]
[['GG Ashford']]
[['GG Barking']]
[['GG Bedford']]
[['GG Birmingham']]
[['GG Bolton']]
[['GG Bothwell Street']]
[['GG Bournemouth']]
[['GG Bracknell']]
[['GG Brighton London road']]
[['GG Brighton Madeira']]
[['GG Bristol']]
[['GG Cardiff']]
[['GG Chadwell Heath']]
[['GG Charing Cross']]
[['GG Chelmsford']]
[['GG Colchester']]
[['GG Crawley']]
[['GG Croydon']]
[['GG Dartford']]
[['GG Derby']]
[['GG Ealing']]
[['GG East Croydon']]
[['GG Eastbourne']]
Answer 0 (score: 0)
Change the code to:
mylist = []  # created once, before the loop, so entries accumulate
for nodeno in gotopagenums:
    nodenojoin = "".join(nodeno)
    weblink = webstringy+nodenojoin
    r2 = s2.get(weblink)
    bsobjswap2 = BeautifulSoup(r2.content)
    nodename = [(bsobjswap2.h1.span)]
    test = [nodename]
    test3 = '\n'.join(str(e) for e in test)
    #if test3.startswith("[<span"):
    #    if test3.endswith("</span>]"):
    test4 = (test3[72:])
    test5 = (test4[:-9])
    test5 = [test5]
    mylist.append(test5)  # keep every name instead of overwriting the previous one
print(mylist)
# newline='' avoids blank rows between entries on Windows
resultfile = open("niki.csv", 'w', newline='')
wr = csv.writer(resultfile, delimiter=',', dialect='excel')
for i in mylist:
    wr.writerow(i)
resultfile.close()
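There is no need to create the extra lists in your code, and the inner loop is unnecessary. In the original code test5 is reassigned on every iteration, so only the last name survives the loop; appending each name to a list created before the loop keeps all of them.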
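The same accumulate-then-write pattern can be sketched in isolation, without the network calls. This is only an illustration under stated assumptions: fetch_node_name, collect_node_names, write_names, and the ID-to-name mapping are hypothetical stand-ins for the s2.get(...) and BeautifulSoup lookup above, and the CSV is written to an in-memory buffer rather than niki.csv.

```python
import csv
import io


def fetch_node_name(node_id):
    # Hypothetical stand-in for the s2.get(...) + BeautifulSoup lookup;
    # the IDs and the mapping are made up for this sketch.
    fake_pages = {"101": "GG Alperton", "102": "GG Angel", "103": "GG Ashford"}
    return fake_pages[node_id]


def collect_node_names(node_ids):
    names = []  # created once, before the loop
    for node_id in node_ids:
        # append keeps earlier entries; assigning a new list here would discard them
        names.append(fetch_node_name(node_id))
    return names


def write_names(names, out):
    # One name per row: each row passed to the writer is a one-element list.
    writer = csv.writer(out, dialect="excel")
    writer.writerows([name] for name in names)


node_names = collect_node_names(["101", "102", "103"])
buffer = io.StringIO()
write_names(node_names, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of the buffer, pass a file opened with open("niki.csv", "w", newline="") to write_names; the csv module documentation recommends newline="" for files handed to csv.writer.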