I'm trying to extract keywords from XML output like the following:
http://clients1.google.com/complete/search?hl=en&output=toolbar&q=test+a
I've put together the code below, but I don't get any errors or any output. Any ideas?
import urllib2 as ur
import re

f = ur.urlopen(u'http://clients1.google.com/complete/search?hl=en&output=toolbar&q=test+a')
res = f.readlines()
for d in res:
    data = re.findall('<CompleteSuggestion><\/CompleteSuggestion>',d)
    for i in data:
        print i
        file = open("keywords.txt", "a")
        file.write(i + '\n')
        file.close()
Thanks,
Answer 0 (score: 1)
from urllib2 import urlopen
import re

xml_url = u'http://clients1.google.com/complete/search?hl=en&output=toolbar&q=test+a'
xml_file_contents = urlopen(xml_url).readlines()
keywords_file = open("keywords.txt", "a")
for entry in xml_file_contents:
    # Capture the value of every data="..." attribute on this line
    output = "\n".join(re.findall('data=\"([^\"]*)',entry))
    print output
    keywords_file.write(output + '\n')
keywords_file.close()
Output:
test anxiety
test america
test adobe flash
test automation
test act
test alternator
test and set
test adblock
test adobe shockwave
test automation tools
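To see why the pattern in the question printed nothing, it may help to run both patterns against a small sample. The snippet below is only a sketch in Python 3: the `sample` string is an assumed, shortened imitation of the response (each suggestion wrapped in `CompleteSuggestion` with the text in a `data` attribute), not the real payload. The question's pattern only matches an opening tag immediately followed by its closing tag, so `findall` returns an empty list and the loop body never runs.

```python
import re

# Assumed, shortened imitation of the response format (not real output).
sample = (
    '<toplevel>'
    '<CompleteSuggestion><suggestion data="test anxiety"/></CompleteSuggestion>'
    '<CompleteSuggestion><suggestion data="test america"/></CompleteSuggestion>'
    '</toplevel>'
)

# Pattern from the question: matches only an empty element pair,
# i.e. the opening tag directly followed by the closing tag.
question_matches = re.findall(r'<CompleteSuggestion><\/CompleteSuggestion>', sample)

# Pattern from the answer: captures whatever sits between data=" and the next quote.
answer_matches = re.findall(r'data="([^"]*)', sample)

print(question_matches)  # []
print(answer_matches)    # ['test anxiety', 'test america']
```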
Let me know if you have any questions.
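As an alternative to the regex, an XML parser could be used. This is a rough sketch for Python 3, not a tested drop-in replacement: `extract_keywords` and `fetch_suggestions_xml` are hypothetical helper names, the `sample` string is an assumed imitation of the response, and whether the endpoint still answers in this format is also an assumption. One practical difference is that the parser decodes entities such as `&amp;` in attribute values, which the regex leaves as-is.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET


def extract_keywords(xml_text):
    """Return the data attribute of every <suggestion> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return [
        node.get("data")
        for node in root.iter("suggestion")
        if node.get("data") is not None
    ]


def fetch_suggestions_xml(query):
    """Download the XML for a query (hypothetical helper; assumes the endpoint still responds)."""
    params = urllib.parse.urlencode({"hl": "en", "output": "toolbar", "q": query})
    url = "http://clients1.google.com/complete/search?" + params
    with urllib.request.urlopen(url) as response:
        return response.read()


# Assumed imitation of the response, used here so the sketch runs offline.
sample = (
    '<?xml version="1.0"?>'
    '<toplevel>'
    '<CompleteSuggestion><suggestion data="test anxiety"/></CompleteSuggestion>'
    '<CompleteSuggestion><suggestion data="test a &amp; b"/></CompleteSuggestion>'
    '</toplevel>'
)

keywords = extract_keywords(sample)
print(keywords)
```

With a live connection, something like `extract_keywords(fetch_suggestions_xml("test a"))` would replace the sample, and the resulting list could be appended to keywords.txt the same way as in the answer.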