I am trying to fetch tweets from a specific country, using the tweepy API. This is the code I have so far:
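import tweepy

# `auth` is an already-configured tweepy auth handler (not shown here)
api = tweepy.API(auth)
# Look up the place ID for India, then search for tweets tagged with it
places = api.geo_search(query="India", granularity="country")
place_id = places[0].id
public_tweets = api.search(q="place:%s" % place_id)
for one in public_tweets:
    print(one.place)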
This is the output I get from the snippet above:
None
None
Place(_api=<tweepy.api.API object at 0x1033f7690>, country_code=u'IN', url=u'https://api.twitter.com/1.1/geo/id/243cc16f6417a167.json', country=u'India', place_type=u'city', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x1033f7690>, type=u'Polygon', coordinates=[[[78.3897718, 17.3013989], [78.5404168, 17.3013989], [78.5404168, 17.4759], [78.3897718, 17.4759]]]), contained_within=[], full_name=u'Hyderabad, Andhra Pradesh', attributes={}, id=u'243cc16f6417a167', name=u'Hyderabad')
Place(_api=<tweepy.api.API object at 0x1033f7690>, country_code=u'IN', url=u'https://api.twitter.com/1.1/geo/id/1b8680cd52a711cb.json', country=u'India', place_type=u'city', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x1033f7690>, type=u'Polygon', coordinates=[[[77.3734736, 12.9190365], [77.7393706, 12.9190365], [77.7393706, 13.2313813], [77.3734736, 13.2313813]]]), contained_within=[], full_name=u'Bengaluru, Karnataka', attributes={}, id=u'1b8680cd52a711cb', name=u'Bengaluru')
None
None
None
None
None
None
Place(_api=<tweepy.api.API object at 0x1033f7690>, country_code=u'IN', url=u'https://api.twitter.com/1.1/geo/id/1dc2b546652c55dd.json', country=u'India', place_type=u'admin', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x1033f7690>, type=u'Polygon', coordinates=[[[73.8853747, 29.5438816], [76.9441213, 29.5438816], [76.9441213, 32.5763957], [73.8853747, 32.5763957]]]), contained_within=[], full_name=u'Punjab, India', attributes={}, id=u'1dc2b546652c55dd', name=u'Punjab')
Place(_api=<tweepy.api.API object at 0x1033f7690>, country_code=u'IN', url=u'https://api.twitter.com/1.1/geo/id/1dc2b546652c55dd.json', country=u'India', place_type=u'admin', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x1033f7690>, type=u'Polygon', coordinates=[[[73.8853747, 29.5438816], [76.9441213, 29.5438816], [76.9441213, 32.5763957], [73.8853747, 32.5763957]]]), contained_within=[], full_name=u'Punjab, India', attributes={}, id=u'1dc2b546652c55dd', name=u'Punjab')
None
None
Place(_api=<tweepy.api.API object at 0x1033f7690>, country_code=u'IN', url=u'https://api.twitter.com/1.1/geo/id/1b8680cd52a711cb.json', country=u'India', place_type=u'city', bounding_box=BoundingBox(_api=<tweepy.api.API object at 0x1033f7690>, type=u'Polygon', coordinates=[[[77.3734736, 12.9190365], [77.7393706, 12.9190365], [77.7393706, 13.2313813], [77.3734736, 13.2313813]]]), contained_within=[], full_name=u'Bengaluru, Karnataka', attributes={}, id=u'1b8680cd52a711cb', name=u'Bengaluru')
Most of the tweets are not geotagged. How can I make sure that only geotagged tweets appear in the results?
Answer 0 (score: 1)
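I ran into this problem too: the actual geocode on the tweets was always missing. However, you should not need an actual geocode on every tweet for what you are doing. Instead, you can search for tweets within a particular geographic area by specifying coordinates and a radius, like this: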
import csv
import tweepy

# Assumes `api` is an authenticated tweepy.API instance, as in the question
def wordsearch(word, max_tweets, lang, geocode, since, out):
    # Fetch up to max_tweets tweets containing `word` inside the geocode area
    searched_tweets = [status for status in tweepy.Cursor(api.search, q=word, lang=lang, geocode=geocode, since=since).items(max_tweets)]
    print("Number of Matches: %d\n" % len(searched_tweets))
    # Append one CSV row per tweet; the file is closed when the block exits
    with open(out, 'a') as csvfile:
        csvWriter = csv.writer(csvfile)
        for t in searched_tweets:
            # .encode('utf-8') is for Python 2; on Python 3 pass t.text directly
            csvWriter.writerow([t.created_at, t.text.encode('utf-8'), t.author.screen_name, t.place, t.retweeted, t.retweet_count, (not t.retweeted and 'RT @' not in t.text)])

wordsearch('dead', 100, "en", "37.9,91.8,1000mi", "2017-01-01", "result.csv")
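The geocode argument passed above is a single string of the form latitude,longitude,radius, where the radius carries a mi or km suffix. Here is a minimal sketch of a helper that builds that string and rejects obviously invalid input. The name make_geocode is hypothetical and not part of tweepy:

```python
def make_geocode(lat, lon, radius, unit="mi"):
    """Build a 'lat,lon,radius<unit>' string for the search geocode parameter.

    Hypothetical helper for illustration; not part of tweepy.
    """
    if unit not in ("mi", "km"):
        raise ValueError("unit must be 'mi' or 'km'")
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= lon <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if radius <= 0:
        raise ValueError("radius must be positive")
    return "%s,%s,%s%s" % (lat, lon, radius, unit)


# Same area as the call above: 1000 miles around (37.9, 91.8)
geocode = make_geocode(37.9, 91.8, 1000)
print(geocode)
```

The result can be passed straight through as the geocode argument of wordsearch.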
Answer 1 (score: 0)
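You are approaching this the wrong way. Those two functions do not work like that.
First, look at the Twitter documentation:
GET geo/search: This is the recommended method to use to find places that can be attached to statuses/update.
GET search/tweets: Returns a collection of relevant Tweets matching a specified query.
If you want geocoded tweets, you can use the geocode parameter of GET search/tweets to restrict the location the tweets come from. That gives you all the tweets from that location, and once you have them you can filter out the ones that are not geocoded.
The filtering has to be done on your end, not by Twitter.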
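As a rough sketch of that client-side filtering: keep only the statuses whose place attribute is set, and optionally only those whose place has a given country code. The function name geotagged is hypothetical, and the SimpleNamespace objects below are stand-ins for tweepy Status objects so the example runs without credentials. With real results you would pass in the statuses returned by the search.

```python
from types import SimpleNamespace


def geotagged(tweets, country_code=None):
    """Return the tweets that have a place, optionally limited to one country.

    Hypothetical helper; relies only on the `place` and `place.country_code`
    attributes visible in the output printed in the question.
    """
    kept = []
    for tweet in tweets:
        place = getattr(tweet, "place", None)
        if place is None:
            continue  # untagged tweet (shows up as None in the question's output)
        if country_code is not None and place.country_code != country_code:
            continue  # tagged, but in a different country
        kept.append(tweet)
    return kept


# Stand-ins for tweepy Status objects (assumption, for illustration only)
sample = [
    SimpleNamespace(text="no tag", place=None),
    SimpleNamespace(text="from Hyderabad",
                    place=SimpleNamespace(country_code="IN", name="Hyderabad")),
    SimpleNamespace(text="from elsewhere",
                    place=SimpleNamespace(country_code="US", name="Austin")),
]

india_only = geotagged(sample, country_code="IN")
print([t.text for t in india_only])
```

Expect this to keep only a small share of what the search returns, since most tweets have no place attached.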