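I am trying to use BS4 over Tor, following the To Russia With Love tutorial from the Stem project.
I rewrote the code a little, drawing on (among other things) this answer, and it now looks like this: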
import io

import pycurl
import stem.process
from stem.util import term

SOCKS_PORT = 7000


def query(url):
    # Fetch the URL through the local Tor SOCKS proxy using pycurl.
    output = io.BytesIO()

    query = pycurl.Curl()
    query.setopt(pycurl.URL, url)
    query.setopt(pycurl.PROXY, 'localhost')
    query.setopt(pycurl.PROXYPORT, SOCKS_PORT)
    query.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
    query.setopt(pycurl.WRITEFUNCTION, output.write)

    try:
        query.perform()
        return output.getvalue()
    except pycurl.error as exc:
        return "Unable to reach %s (%s)" % (url, exc)


def print_bootstrap_lines(line):
    # Print only the bootstrap progress lines emitted while Tor starts.
    if "Bootstrapped " in line:
        print(term.format(line, term.Color.BLUE))


print(term.format("Starting Tor:\n", term.Attr.BOLD))

tor_process = stem.process.launch_tor_with_config(
    tor_cmd = '/Applications/TorBrowser.app/Contents/MacOS/Tor/tor.real',
    config = {
        'SocksPort': str(SOCKS_PORT),
        'ExitNodes': '{ru}',
        'GeoIPFile': r'/Applications/TorBrowser.app/Contents/Resources/TorBrowser/Tor/geoip',
        'GeoIPv6File' : r'/Applications/TorBrowser.app/Contents/Resources/TorBrowser/Tor/geoip6'
    },
    init_msg_handler = print_bootstrap_lines,
)

print(term.format("\nChecking our endpoint:\n", term.Attr.BOLD))
print(term.format(query("https://www.atagar.com/echo.php"), term.Color.BLUE))
I am able to establish the Tor circuit, but at "Checking our endpoint" I get the following error:
Checking our endpoint:
Traceback (most recent call last):
  File "<ipython-input-804-68f8df2c050b>", line 40, in <module>
    print(term.format(query('https://www.atagar.com/echo.php'), term.Color.BLUE))
  File "/Applications/anaconda/lib/python3.6/site-packages/stem/util/term.py", line 139, in format
    if RESET in msg:
TypeError: a bytes-like object is required, not 'str'
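What should I change to be able to see the endpoint?
I worked around the problem for now by replacing the last line of the code above with
test = requests.get('https://www.atagar.com/echo.php')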
soup = BeautifulSoup(test.content, 'html.parser')
print(soup)
But I would like to know how to make the 'original' line work.
Answer 0 (score: 0)
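The code was written for Python 2, but you are running it on Python 3. In Python 2, str and bytes are the same type; in Python 3, str is what Python 2 called unicode. To get a byte string in Python 3, put b directly in front of the string literal, for example: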
b"this is a byte string"
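In the traceback above, the mismatch runs the other way: `msg` (the value returned by `query()`, taken from an `io.BytesIO` buffer) is bytes, while `RESET` is a str. A minimal sketch of converting the bytes to str before formatting, assuming the response body is UTF-8 encoded. The payload and the `RESET` value here are hypothetical stand-ins, not taken from the server or from stem:

```python
import io

# Hypothetical stand-in for the body that pycurl writes into the buffer.
output = io.BytesIO()
output.write("IP Address: 203.0.113.7\n".encode("utf-8"))

raw = output.getvalue()      # bytes, like the return value of query()
text = raw.decode("utf-8")   # str, assuming the server sent UTF-8

# Hypothetical stand-in for a str constant such as RESET in stem.util.term.
RESET = "\x1b[0m"


def contains_reset(msg):
    # Same kind of membership test that fails in the traceback.
    return RESET in msg


try:
    contains_reset(raw)      # str searched for inside bytes
    mixed_ok = True
except TypeError:
    mixed_ok = False         # Python 3 refuses to mix str and bytes

print(mixed_ok)              # False on Python 3
print(contains_reset(text))  # works once the bytes are decoded
```

Under the same assumption, returning `output.getvalue().decode("utf-8")` from `query()` would hand `term.format` a str, so the original last line could stay unchanged.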