I can't figure out this error message. Can anyone help? The error occurs on the re.findall line.
import re, urllib.request
infile = open('phone_numbers.txt')
for line in infile:
    line = line.strip()
    area=line[0:3]
    area1=line[5:7]
    area2=line[8:12]
    xyz = 'http://usreversephonedirectory.com/results.php?areacode='+ area +'&phone1='+ area1 +'&phone2='+ area2 +'&imageField.x=193&imageField.y=16&type=phone&Search=Search&redir_page=results%2Fphone%2F'
    print(area + area1 + area2)
    page = urllib.request.urlopen(xyz)
    text = page.read()
    text = text.strip()
    location = re.findall('>Location:</strong>(.+)</span><br/> <span><strong>Line', text)
    print(line + '|' + location[0])
infile.close()
Answer 0 (score: 1)
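As @Ben said, your text is being treated as binary: page.read() returns bytes, not str. With his decoding approach, text.strip().decode('utf-8'), the error goes away.
The method I used is below. You may want to tidy up its output. Hope this helps!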
$ echo "1 (800) 233-2742" >> phone_numbers.txt # Put a random number into phone_numbers.txt
$ python lookup.py # Run the fixed program
1 (0)233- # Output line 1
1 (800) 233-2742| , # Output line 2
$ # Done
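The first output line shows why the result needs tidying: for the sample number the fixed slices pick up '1 (', '0)' and '233-', so the query is built from the wrong characters. One possible way to split the number is sketched below, assuming the file holds US numbers formatted like the sample. split_phone is a hypothetical helper and is not part of the original program.

```python
import re

def split_phone(line):
    """Hypothetical helper: split a US number into (area code, prefix, line number)."""
    digits = re.sub(r'\D', '', line)       # keep only the digits
    if len(digits) == 11 and digits.startswith('1'):
        digits = digits[1:]                # drop the leading country code
    if len(digits) != 10:
        raise ValueError('expected a 10-digit US number: %r' % line)
    return digits[0:3], digits[3:6], digits[6:10]

print(split_phone('1 (800) 233-2742'))     # ('800', '233', '2742')
```

The three returned strings could then take the place of area, area1 and area2 when building the URL.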
Code (updated):
import re, urllib.request
infile = open('phone_numbers.txt')
for line in infile:
    line = line.strip()
    area=line[0:3]
    area1=line[5:7]
    area2=line[8:12]
    xyz = 'http://usreversephonedirectory.com/results.php?areacode='+ area +'&phone1='+ area1 +'&phone2='+ area2 +'&imageField.x=193&imageField.y=16&type=phone&Search=Search&redir_page=results%2Fphone%2F'
    print(area + area1 + area2)
    page = urllib.request.urlopen(xyz)
    text = page.read()  # returns bytes, not str
    # Decode to str so the str pattern passed to re.findall can be applied
    text = text.strip().decode('utf-8')
    location = re.findall('>Location:</strong>(.+)</span><br/> <span><strong>Line', text)
    print(line + '|' + location[0])
infile.close()
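To see the failure on its own, without the network request, here is a minimal sketch. The raw value is a made-up stand-in for what page.read() returns, and the pattern is shortened for illustration. It assumes the page is UTF-8 encoded, as the updated code does.

```python
import re

# Made-up stand-in for the bytes returned by page.read()
raw = b'<strong>Location:</strong> Somewhere, US</span>'
pattern = '>Location:</strong>(.+)</span>'

error = None
try:
    re.findall(pattern, raw)        # str pattern applied to bytes
except TypeError as exc:
    error = str(exc)
    print('TypeError:', error)

text = raw.decode('utf-8')          # bytes -> str
matches = re.findall(pattern, text)
print(matches)                      # [' Somewhere, US']
```

Another option is to keep the data as bytes and use a bytes pattern (b'...') instead. Decoding is usually more convenient here, because the result is concatenated with str values when it is printed.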