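I need to scrape this URL with a Python script: https://sites.google.com/a/domain.com/sites/system/app/pages/meta/domainIndex.
How can I authorize access to this Google Sites URL using OAuth 2.0 with a service account?
With OAuth 1.0, we sent a request to https://www.google.com/accounts/ClientLogin, extracted the token from the response, and used it to authorize requests to the URL.
OAuth 1.0 authentication: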
import urllib  # Python 2; on Python 3 urlencode lives in urllib.parse

url = 'https://www.google.com/accounts/ClientLogin'
request = urllib.urlencode({
    'accountType': 'HOSTED',
    'Email': 'admin@domain.com',
    'Passwd': 'userPassword',
    'service': 'jotspot'})
# .. Fetch the url: https://www.google.com/accounts/ClientLogin and extract the token
headers = {
    'Authorization': 'GoogleLogin auth=' + token,
    'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15 ( .NET CLR 3.5.30729)',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Host': 'sites.google.com',
    'Connection': 'keep-alive'
}
# .. Fetch the Google Site url with the headers above
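The "extract the token" step is elided above. A minimal sketch of it, assuming the response body is the plain-text key=value format (one pair per line, with the token on the Auth line) that ClientLogin returned. The function name extract_auth_token and the sample body are made up for illustration:

```python
def extract_auth_token(body):
    """Return the value of the 'Auth' line from a key=value response body."""
    for line in body.splitlines():
        # Split on the first '=' only, since token values may contain '='.
        key, sep, value = line.partition('=')
        if sep and key.strip() == 'Auth':
            return value.strip()
    raise ValueError("no 'Auth' entry found in response body")


# Made-up sample body; real values are long opaque strings.
sample_body = "SID=aaa\nLSID=bbb\nAuth=ccc\n"
token = extract_auth_token(sample_body)
headers = {'Authorization': 'GoogleLogin auth=' + token}
```

The resulting token is what goes into the Authorization header shown in the snippet above.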
Answer 0 (score: 1)
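Some time ago I wrote a class for myself that handles OAuth2 authentication with the Google APIs. It can serve as an example for you. And yes, you do need to register an "application", but only to get a client ID and secret.
Some notes:
The class uses the "offline" access type to obtain credentials that can be stored and reused later.
The settings variable holds an instance of a class that stores all my settings, in this case the previously obtained Google API credentials.
GAuthDialog is a dialog that shows the user the Google user/password login and reads the code that Google generates.
The execute method wraps any method that needs API access and authentication.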
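The settings object itself is not shown in this answer. As a rough sketch of what it needs to provide, the stand-in below stores the credentials JSON in a file, keyed by service name. Only the method names get_gapi_credentials and set_gapi_credentials come from the class in this answer; the Settings class, the file layout and the sample values are assumptions for illustration:

```python
import json
import os
import tempfile


class Settings:
    """Hypothetical settings store: one JSON file mapping service name -> credentials JSON."""

    def __init__(self, path):
        self._path = path

    def _load(self):
        # A missing file simply means nothing has been stored yet.
        if not os.path.exists(self._path):
            return {}
        with open(self._path, 'r', encoding='utf-8') as fh:
            return json.load(fh)

    def get_gapi_credentials(self, service_name):
        # Returns the stored JSON string, or None if nothing is stored.
        return self._load().get(service_name)

    def set_gapi_credentials(self, service_name, credentials_json):
        data = self._load()
        if credentials_json is None:
            data.pop(service_name, None)  # None means "forget these credentials"
        else:
            data[service_name] = credentials_json
        with open(self._path, 'w', encoding='utf-8') as fh:
            json.dump(data, fh)


# Usage with a throwaway file and a made-up credentials string
settings = Settings(os.path.join(tempfile.mkdtemp(), 'settings.json'))
settings.set_gapi_credentials('drive', '{"access_token": "example"}')
stored = settings.get_gapi_credentials('drive')
settings.set_gapi_credentials('drive', None)
cleared = settings.get_gapi_credentials('drive')
```

Since offline credentials include a refresh token, a real settings file like this should be readable only by the user running the script.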
The GApi class can be used as follows, for example with Google Drive:
self._gapi = GApi(settings, 'https://www.googleapis.com/auth/drive.readonly', 'drive', 'v2')
self._gapi.execute(self.get_file_list)
Then we have:
def get_file_list(self):
    query = "query"
    children = self._gapi.service.files().list(q=query).execute()
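What execute does around a call such as get_file_list can be seen in isolation in the sketch below: call the method, and if the credentials turn out to be stale, authenticate again once and repeat the call. AuthExpired, Wrapper and flaky_call are made-up names for illustration; in the real class the exception is oauth2client's AccessTokenRefreshError and authentication runs the OAuth flow.

```python
class AuthExpired(Exception):
    """Stand-in for an 'access token could not be refreshed' error."""


class Wrapper:
    def __init__(self):
        self.auth_count = 0

    def _authenticate(self):
        # A real implementation would run the OAuth flow here.
        self.auth_count += 1

    def execute(self, method, *args, **kwargs):
        if self.auth_count == 0:
            self._authenticate()
        try:
            return method(*args, **kwargs)
        except AuthExpired:
            pass  # fall through and retry once
        self._authenticate()
        # A second failure propagates to the caller instead of looping.
        return method(*args, **kwargs)


calls = []


def flaky_call(name):
    """Fails with AuthExpired on the first call, succeeds afterwards."""
    calls.append(name)
    if len(calls) == 1:
        raise AuthExpired()
    return 'files for ' + name


wrapper = Wrapper()
result = wrapper.execute(flaky_call, 'query')
```

Retrying exactly once keeps a permanently broken login from turning into an endless loop of authentication prompts.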
Here is the code for the class:
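import httplib2
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from oauth2client.client import OAuth2WebServerFlow, Credentials, AccessTokenRefreshError
from PyQt5 import QtWidgets  # assumption: the original does not show where QtWidgets comes from
# GAuthDialog is my own dialog class and is not shown here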

class GApi:
    class CredentialsError(Exception):
        pass

    # App credentials from developers console
    __client_id = ''
    __client_secret = ''
    # Redirect URI for installed (non-web) apps
    __redirect_uri = 'urn:ietf:wg:oauth:2.0:oob'

    def __init__(self, settings, scopes, service_name, version):
        self.__settings = settings
        self.__scopes = scopes
        self.__service_name = service_name
        self.__version = version
        self.__service = None
        self.__credentials = None
        # Try restoring credentials from settings
        stored = self.__settings.get_gapi_credentials(self.__service_name)
        if stored:
            self.__credentials = Credentials.new_from_json(stored)

    @property
    def service(self):
        return self.__service

    def execute(self, method, *args, **kwargs):
        self.__setup()
        try:
            return method(*args, **kwargs)
        except AccessTokenRefreshError:
            pass  # Will re-authenticate below
        except HttpError as err:
            # Rethrow since HttpError has a bug in str()
            raise Exception("Response: %s, Content: %s" %
                            (str(err.resp), str(err.content)))
        # Try re-authenticating
        self.__reauthenticate()
        try:
            # Pass the positional arguments again as well on the retry
            return method(*args, **kwargs)
        except HttpError as err:
            # Rethrow since HttpError has a bug in str()
            raise Exception("Response: %s, Content: %s" %
                            (str(err.resp), str(err.content)))

    def __obtain_credentials(self):
        # Initialize the flow
        flow = OAuth2WebServerFlow(self.__client_id, self.__client_secret,
                                   self.__scopes, redirect_uri=self.__redirect_uri)
        flow.params['access_type'] = 'offline'
        # Run through the OAuth flow and retrieve credentials
        uri = flow.step1_get_authorize_url()
        # Get code from dialog
        dialog = GAuthDialog(uri)
        if dialog.exec() == QtWidgets.QDialog.Accepted and dialog.auth_code:
            # Get the new credentials
            self.__credentials = flow.step2_exchange(dialog.auth_code)
            # Set them in settings
            self.__settings.set_gapi_credentials(
                self.__service_name, self.__credentials.to_json())
        else:
            self.__credentials = None
            self.__settings.set_gapi_credentials(self.__service_name, None)

    def __reauthenticate(self):
        self.__credentials = None
        self.__service = None
        self.__setup()

    def __setup(self):
        # Do we have credentials?
        if not self.__credentials:
            self.__obtain_credentials()
        # Check if we got credentials
        if self.__credentials:
            # Do we have service?
            if not self.__service:
                # Create an httplib2.Http object and authorize it with our credentials
                http = httplib2.Http()
                http = self.__credentials.authorize(http)
                self.__service = build(self.__service_name,
                                       self.__version, http=http)
        else:
            raise GApi.CredentialsError
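The class above uses the interactive installed-application flow. For the service-account case the question asks about, no dialog is involved: the script signs a JWT with the service account's private key and exchanges it for an access token at Google's token endpoint. Below is a minimal sketch of the unsigned part of that JWT only. The function names, the service account email and the Sites scope are assumptions for illustration, and whether the domainIndex page itself accepts such a token is not something this sketch establishes.

```python
import base64
import json
import time

# Google's OAuth 2.0 token endpoint; also used as the JWT audience.
TOKEN_URI = 'https://oauth2.googleapis.com/token'


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b'=').decode('ascii')


def build_signing_input(service_account_email, scopes, subject=None,
                        now=None, lifetime=3600):
    """Return '<header>.<claims>', the string that must then be RS256-signed."""
    now = int(time.time()) if now is None else int(now)
    header = {'alg': 'RS256', 'typ': 'JWT'}
    claims = {
        'iss': service_account_email,
        'scope': ' '.join(scopes),
        'aud': TOKEN_URI,
        'iat': now,
        'exp': now + lifetime,  # Google allows at most one hour
    }
    if subject:
        # User to impersonate; requires domain-wide delegation.
        claims['sub'] = subject
    parts = [json.dumps(part, separators=(',', ':')).encode('utf-8')
             for part in (header, claims)]
    return '.'.join(b64url(part) for part in parts)


signing_input = build_signing_input(
    'my-service-account@my-project.iam.gserviceaccount.com',  # hypothetical
    ['https://sites.google.com/feeds/'],  # assumed scope for classic Sites
    subject='admin@domain.com',
    now=1700000000,
)
```

The signing input still has to be signed with RS256 using the private key from the service account's JSON key file, which needs an RSA library because the standard library has none. The finished JWT is then POSTed to the token endpoint as the assertion parameter, with grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer. In practice a library such as google-auth does all of these steps for you.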