This is part of my Python script:
with open("/tmp/IpList.txt", "r+") as f:
    for ip in f:
        page = 1
        ip = ip.strip("\r\n")
        print ip
        while True:
            url = 'http://www.bing.com/search?q=ip%3a' + ip + '&qs=n&pq=ip%3a' + ip + '&sc=0-0&sp=-1&sk=&first=' + str(page) + '&FORM=PERE'
            print url
            with open("/tmp/content.txt", "a") as out:
                content = ContentFunc()
                if "<cite>" in content and "</cite>" in content:
                    out.write(content)
                    with open("/tmp/result.txt", "a") as result:
                        res = CiteParser()
                        result.write(res.encode('utf-8'))
                    page += 10
                else:
                    break
Once it has gone through all the IPs in IpList.txt, it starts over at the first IP. Why?
Thanks...
Answer 0 (score: 1)
You don't need the while loop. Read all the IPs into a list first:
with open("a.txt", "r+") as f:
    page = 1
    ips = [ip.strip("\r\n") for ip in f]
    for ip in ips:
        url = 'http://www.bing.com/search?q=ip%3a' + ip + '&qs=n&pq=ip%3a' + ip + '&sc=0-0&sp=-1&sk=&first=' + str(page) + '&FORM=PERE'
        print url
        with open("/tmp/content.txt", "a") as out:
            content = ContentFunc()
            if "<cite>" in content and "</cite>" in content:
                out.write(content)
                with open("/tmp/result.txt", "a") as result:
                    res = CiteParser()
                    result.write(res.encode('utf-8'))
                page += 10
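To make the "read everything into a list first" idea concrete, here is a minimal sketch written for Python 3. The helper names `load_ips` and `build_url` are made up for illustration and are not from the original script. Skipping blank lines is my own addition, on the assumption that an empty trailing line in the file should not become an empty `ip:` query.

```python
def load_ips(path):
    """Return the non-empty lines of `path`, stripped of surrounding whitespace."""
    with open(path, "r") as f:
        # Reading the whole file up front means the file handle is only
        # iterated once, before any requests are made.
        return [line.strip() for line in f if line.strip()]


def build_url(ip, first):
    """Build the Bing search URL from the question for one IP and result offset."""
    return ('http://www.bing.com/search?q=ip%3a' + ip +
            '&qs=n&pq=ip%3a' + ip +
            '&sc=0-0&sp=-1&sk=&first=' + str(first) + '&FORM=PERE')
```

With these, the outer loop becomes `for ip in load_ips("/tmp/IpList.txt"):` and each URL is `build_url(ip, page)`.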
Answer 1 (score: 0)
I agree with the above. Alternatively, you could use a flag to control the loop:
with open("/tmp/IpList.txt", "r+") as f:
    for ip in f:
        page = 1
        ip = ip.strip("\r\n")
        print ip
        loop_going = True  # New var <---------------
        while loop_going:
            url = 'http://www.bing.com/search?q=ip%3a' + ip + '&qs=n&pq=ip%3a' + ip + '&sc=0-0&sp=-1&sk=&first=' + str(page) + '&FORM=PERE'
            print url
            with open("/tmp/content.txt", "a") as out:
                content = ContentFunc()
                if "<cite>" in content and "</cite>" in content:
                    out.write(content)
                    with open("/tmp/result.txt", "a") as result:
                        res = CiteParser()
                        result.write(res.encode('utf-8'))
                    page += 10
                else:
                    loop_going = False  # Set to false <---------------
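Note that in the snippets in this thread `ContentFunc()` and `CiteParser()` take no arguments, and their code is not shown, so it is not visible how they receive `url` or whether they touch the IP file themselves. As a rough sketch only (Python 3), the paging loop could take the fetch function as a parameter and cap the number of pages, so it stops even if every page contains `<cite>`. The names `crawl_ip`, `fetch` and `max_pages` are hypothetical. `fetch` stands in for whatever downloads a page and returns its text.

```python
def crawl_ip(ip, fetch, max_pages=50):
    """Collect result pages for one IP until a page has no <cite> element.

    `fetch` is a hypothetical callable: it takes a URL and returns page text.
    """
    pages = []
    first = 1
    for _ in range(max_pages):  # hard upper bound on requests per IP
        url = ('http://www.bing.com/search?q=ip%3a' + ip +
               '&qs=n&pq=ip%3a' + ip +
               '&sc=0-0&sp=-1&sk=&first=' + str(first) + '&FORM=PERE')
        content = fetch(url)
        # Same stop condition as the question: no <cite> pair means no results.
        if "<cite>" not in content or "</cite>" not in content:
            break
        pages.append(content)
        first += 10
    return pages
```

Passing `url` in explicitly keeps each request tied to the current IP and offset, and the `for ... in range(max_pages)` bound replaces both `while True` and the `loop_going` flag.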