All I want is to do some simple processing of the USPTO trademark site.
#!/usr/bin/python
import mechanize
import cookielib
br=mechanize.Browser()
cg = cookielib.LWPCookieJar()
br.set_cookiejar(cg)
#br.set_all_readonly(False)
br.set_handle_robots(False)
br.set_handle_refresh(False)
br.addheaders=[('User-agent', 'Firefox')]
response=br.open("http://uspto.gov/trademarks-application-process/search-trademark-database")
tess = 'TESS'
start_search = 'Basic Word Mark Search (New User)'
assert br.viewing_html()
print br.title()
for l in br.links(url_regex='tmsearch'):
    if l.text == tess:
        print l.url
        break
br.follow_link(l)
newlink=br.geturl()
print newlink
br.open(newlink)
for link in br.links():
    if link.text == start_search:
        print "Found Basic Search"
        print link.text
        print link.url
        break
# Why do we need the concatenation? Without it, this doesn't generate a full URL.
newurl="http://tmsearch.uspto.gov" + link.url
print newurl
response1 = br.open(newurl)
print response1.read()
#for form in br.forms():
#    print "Form Name", form.name
Two questions.
Answer 0 (score: 0)
OK;
Store the base URL in a variable and build the full address as newurl = oldurl + link.url.
You can then reuse it any time in br.open(oldurl + "w/e goes here").
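On why the concatenation is needed: link.url seems to hold the href as it is written in the page, so a site-relative link comes back without the scheme and host. Here is a minimal sketch of resolving it with the standard library's urljoin instead of gluing strings together. The resolve helper and the sample path are made up for illustration and are not taken from the TESS site:

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2, as used in the question


def resolve(base_url, href):
    """Return an absolute URL for an href found on the page at base_url."""
    return urljoin(base_url, href)


# Hypothetical values for illustration only.
base = "http://tmsearch.uspto.gov/some/page"
relative = "/path/from/link?x=1"

full = resolve(base, relative)
print(full)
```

An href that is already absolute passes through urljoin unchanged, so the same call is safe for both kinds of link. If I remember right, mechanize link objects also carry an absolute_url attribute, and br.follow_link(link) resolves the address for you, so either may make the manual step unnecessary.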
# forms() is a method of the browser, not of the response object
for i in br.forms():
    print "Form name:", i.name
You need to select the form, fill in the text, and then submit it. Here are some hints:
for form in br.forms():
    # .get() avoids a KeyError on forms that have no id attribute
    if form.attrs.get('id') == 'search':
        br.form = form
        break
br["search"] = "text_search"
br.submit()
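The select-by-id idea can be tried offline before pointing it at the real site. This is a rough sketch using only the standard library rather than mechanize. The HTML snippet, the form ids, the field names and the FormFinder class are all invented for illustration, and the real TESS form will almost certainly use different names:

```python
try:
    from html.parser import HTMLParser   # Python 3
    from urllib.parse import urlencode
except ImportError:
    from HTMLParser import HTMLParser    # Python 2
    from urllib import urlencode


class FormFinder(HTMLParser):
    """Collect each <form>'s attributes and the names of its <input> fields."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.forms = []        # one dict per form: {"attrs": ..., "fields": ...}
        self._current = None   # the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"attrs": attrs, "fields": []}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and "name" in attrs:
            self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Made-up page with two forms; only the second one is the search form.
SAMPLE_HTML = """
<form id="login" action="/login"><input name="user"></form>
<form id="search" action="/search"><input name="search"></form>
"""

finder = FormFinder()
finder.feed(SAMPLE_HTML)

# Same selection logic as the mechanize loop: pick the form by its id.
chosen = None
for form in finder.forms:
    if form["attrs"].get("id") == "search":
        chosen = form
        break

# The body a submit would send for this single text field.
payload = urlencode({"search": "text_search"})
print("%s %s %s" % (chosen["attrs"]["action"], chosen["fields"], payload))
```

Printing the ids and field names this way first makes it easier to see which id and which control name to use in the br.form and br[...] lines.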