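I'm writing a script that fetches HTTP responses and looks for the ones that return 200. I need to build URLs like this: http://exemple.com/string+number, for example http://exemple.com/hello123. The string hello comes from sites.txt, along with some other strings, and I want to check each of them with every 3-digit number from 100 to 999. Here is the code (I explain it in the comments :)):

import urllib2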
import string

# declare the range generator for my loop
def my_range(start, end, step):
    while start <= end:
        yield start
        start += step

# open the file that contains the strings
s = open("sites.txt","r+")
# assign a value to url
url = 'http://exemple.com/'
# initialize y to 100 because I only need string+(100,999)
y = 100
# for loop to read the file line by line
for line in s.readlines():
    # reset the y variable so it won't keep increasing by 1
    y = 100
    # assign the line value to site so I can use it in the 2nd for loop
    site = line
    for x in my_range(100, 999, 1):
        url+=str(site)
        # increase y by 1 to explore the whole range (100,999)
        y =y+1
        # url = url+site+y like http://exemple.com/hello123
        url+=str(y)
        # check which url I am requesting
        print url
        # all this to get a 200 response
        req = urllib2.Request(url)
        try:
            resp = urllib2.urlopen(req)
        except urllib2.URLError, e:
            if e.code == 404:
                pass
            else:
                pass
        else:
            print "200"
        # reset the url so it won't end up like http://exemple.com/hello123hello123hello123
        url = 'http://exemple.com/'
body = resp.read()

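The problem is that the script only picks up the first string in the txt file. The file has one string per line, like this:

STRING1
STRING2
STRING3
STRING4
...

I'm a beginner in Python, so some explanation or a rewrite of the code would be really useful. Thanks to anyone who can help :)
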
Answer 0 (score: 2)
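You make multiple requests and get a response from each of them, but you only read from the last one, because body = resp.read() sits outside the loop.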
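
A minimal sketch of that fix, assuming Python 3, where urllib2 was split into urllib.request and urllib.error. The helper names build_urls, fetch and find_ok are made up for illustration and are not part of the original script. Each response is read inside the loop, and every line from sites.txt is stripped first, since readlines() keeps the trailing newline and it would otherwise end up inside the URL.

```python
import urllib.error
import urllib.request

BASE_URL = "http://exemple.com/"


def build_urls(names, start=100, end=999, base=BASE_URL):
    """Yield base + name + number for every name and every number in [start, end]."""
    for name in names:
        name = name.strip()  # drop the newline that file iteration keeps
        if not name:
            continue  # skip blank lines
        for number in range(start, end + 1):
            yield base + name + str(number)


def fetch(url):
    """Return (status, body); body is None unless the request succeeded."""
    try:
        with urllib.request.urlopen(url) as resp:
            # Read here, while this particular response is still open.
            return resp.getcode(), resp.read()
    except urllib.error.HTTPError as e:
        # 404, 500, ...: the server answered with an error status.
        return e.code, None
    except urllib.error.URLError:
        # No HTTP answer at all (DNS failure, refused connection, ...).
        return None, None


def find_ok(names, fetch=fetch, **kwargs):
    """Collect (url, body) for every generated URL that answers 200."""
    hits = []
    for url in build_urls(names, **kwargs):
        status, body = fetch(url)
        if status == 200:
            hits.append((url, body))
    return hits


# Example use (this performs real network requests):
#     with open("sites.txt") as f:
#         for url, body in find_ok(f):
#             print("200", url)
```

Catching HTTPError before URLError matters because HTTPError is a subclass of URLError; a plain URLError has no code attribute, which is why the two cases are handled separately here.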