Accessing secure information from Google Colaboratory

时间:2019-04-17 00:23:40

标签: api security authentication google-colaboratory

I am using Google Colaboratory as a platform for teaching web science (APIs, scraping, etc) in Python to very elementary programmers. I will be publishing the resulting notebooks openly and widely for use by other instructors. Colab has proven a great platform for working with open APIs, but the most popular APIs are closed, which presents difficulties.

I want my students up and running on, say, Twitter, with a minimum of friction. They can get personal keys easily enough (I guide them through it). The tough part is that I don't want their submitted solutions to me (full .ipynb files) to include private keys that they have pasted into their code. I also don't want to provide course-level keys, because that has its own risks (I know I've abused instructor provided keys before).

I am struggling to find a model for helping amateurs safely access closed APIs through Colab. My story is pedagogical, but this question is relevant for anyone who uses Colab to access closed APIs, and who might share their code, and who doesn't want to put private keys on Drive..

1 个答案:

答案 0 :(得分:0)

Writing the question out helped me solve it. My students have learned what JSON looks like, and how to play with nested dictionaries. They have already used my snippets calling colab code to upload files directly to the notebook's OS (without having to go through Drive):

import json
from google.colab import files

uploaded = files.upload()
data = json.loads( list( uploaded.values() ).pop().decode('utf-8') )

If I give them instructions for hand-authoring a .json file along these lines:

{
    'twitter' : { ... },
    'spotify' : { ... },
    'yelp'    : { ... }
}

and I tell them to keep it safe and keep it handy, then I can get them in the habit of uploading that file whenever they run a notebook that has to authenticate. I can write them other code that parses the file and sends the right key to the write platform.

This keeps me from seeing their data, it's not too unwieldy or complicated, it gives them more direct experience with JSON, it teaches them some security consciousness, it keeps them safe even if they don't care about security consciousness (I've had students who post all coursework to a public git repo; those students would get screwed quickly), and it leverages skills that are already taught or easy to teach.