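I want to write a script that collects all tweets from a specific location (a city, for example) and stores them in MongoDB using Python. I am a complete beginner at programming, but I have managed to track a specific hashtag on Twitter and store the matching tweets in MongoDB with the following code: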
import json

import pycurl
from pymongo import MongoClient

STREAM_URL = "https://stream.twitter.com/1/statuses/filter.json"
WORDS = "track=#occupywallstreet"
USER = "myuser"
PASS = "mypass"

# Connect to the local MongoDB server and select the target database.
# MongoClient replaces the older Connection class, which recent pymongo releases no longer ship.
connection = MongoClient()
db = connection.occupywallstreet

def on_tweet(data):
    # pycurl calls this for every chunk it receives from the stream.
    try:
        tweet = json.loads(data)
    except ValueError:
        # Skip keep-alive newlines and chunks that are not valid JSON.
        return
    db.posts.insert_one(tweet)
    print(tweet)

conn = pycurl.Curl()
conn.setopt(pycurl.POST, 1)
conn.setopt(pycurl.POSTFIELDS, WORDS)
conn.setopt(pycurl.HTTPHEADER, ["Connection: keep-alive", "Keep-Alive: 3000"])
conn.setopt(pycurl.USERPWD, "%s:%s" % (USER, PASS))
conn.setopt(pycurl.URL, STREAM_URL)
conn.setopt(pycurl.WRITEFUNCTION, on_tweet)
conn.perform()  # blocks and keeps streaming until the connection drops
How can I track geolocated tweets, that is, tweets coming from a specific city? Is there a way to change the code above to do what I need?
Thanks!
Answer 0 (score: 1)
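In that case, use the locations parameter instead of track. It takes a bounding box given as comma-separated longitude,latitude pairs, southwest corner first: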
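import json

import pycurl
from pymongo import MongoClient

STREAM_URL = "https://stream.twitter.com/1/statuses/filter.json"
LOCATIONS = "locations=-74,40,-73,41"  # New York: west,south,east,north
USER = "myuser"
PASS = "mypass"

# Connect to the local MongoDB server; the database name is kept from the question.
# MongoClient replaces the older Connection class, which recent pymongo releases no longer ship.
connection = MongoClient()
db = connection.occupywallstreet

def on_tweet(data):
    # pycurl calls this for every chunk it receives from the stream.
    try:
        tweet = json.loads(data)
    except ValueError:
        # Skip keep-alive newlines and chunks that are not valid JSON.
        return
    db.posts.insert_one(tweet)
    print(tweet)

conn = pycurl.Curl()
conn.setopt(pycurl.POST, 1)
conn.setopt(pycurl.POSTFIELDS, LOCATIONS)  # the only change from the question: filter by location
conn.setopt(pycurl.HTTPHEADER, ["Connection: keep-alive", "Keep-Alive: 3000"])
conn.setopt(pycurl.USERPWD, "%s:%s" % (USER, PASS))
conn.setopt(pycurl.URL, STREAM_URL)
conn.setopt(pycurl.WRITEFUNCTION, on_tweet)
conn.perform()  # blocks and keeps streaming until the connection drops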
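If you want to target other cities, it may help to build the locations string from named coordinates instead of typing it by hand. The sketch below is only an illustration: build_locations and in_bounding_box are hypothetical helper names, not part of pycurl or the Twitter API. It also assumes that a geotagged tweet carries a GeoJSON-style "coordinates" field ordered as [longitude, latitude], and that you may want to keep only tweets whose exact point lies inside the box.

```python
import json


def build_locations(west, south, east, north):
    """Return the POST body for one bounding box, southwest corner first."""
    if not (-180 <= west < east <= 180 and -90 <= south < north <= 90):
        raise ValueError("expected west < east and south < north, in degrees")
    return "locations=%s,%s,%s,%s" % (west, south, east, north)


def in_bounding_box(tweet, west, south, east, north):
    """Return True if the tweet has an exact point that lies inside the box."""
    point = tweet.get("coordinates")
    if not point:
        # No exact coordinates on this tweet (assumption: the field is null or missing).
        return False
    lon, lat = point["coordinates"]  # GeoJSON order: longitude first, then latitude
    return west <= lon <= east and south <= lat <= north


NEW_YORK = (-74, 40, -73, 41)  # same box as in the answer above
LOCATIONS = build_locations(*NEW_YORK)

# A hand-written sample document, not real API output.
sample = json.loads(
    '{"text": "hello", "coordinates": {"type": "Point", "coordinates": [-73.98, 40.75]}}'
)

print(LOCATIONS)                            # locations=-74,40,-73,41
print(in_bounding_box(sample, *NEW_YORK))   # True
```

You could call in_bounding_box inside on_tweet before inserting, so that only tweets with a point inside the box end up in MongoDB.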
Hope that helps.