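I am trying to consume streaming tweets with NiFi. After running into problems with the InvokeHTTP processor, I switched to the ExecuteScript processor with tweepy.
I want to receive tweets from the stream and pass them one by one to the rest of my NiFi flow.
To do that, I wrote the following script: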
from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
from org.apache.nifi.processors.script import ExecuteScript
# All API keys / access token
consumer_key = "something"
consumer_secret_key = "something"
access_token = "something"
access_token_secret = "something"
# Proxy configuration
proxies = {
    "http": "http_proxy",
    "https": "https_proxy"
}
# Listener class that handles the callbacks for streamed tweets.
class Listener(StreamListener):
    def __init__(self, nifi_session):
        # Override the constructor to keep a reference to the NiFi flowfile
        super(Listener, self).__init__()
        self.nifi_session = nifi_session

    def on_data(self, status):
        # Parse the raw JSON string
        data = json.loads(status)
        # Extract the relevant information; for example, only the user's description
        description = data['user']['description']
        # Pass the value to NiFi as a flowfile attribute.
        # putAttribute returns the updated flowfile reference, so keep it.
        self.nifi_session = session.putAttribute(self.nifi_session, 'data', description)
        session.transfer(self.nifi_session, ExecuteScript.REL_SUCCESS)
        session.commit()  # Commit so the flowfile moves on to the next processor

    def on_error(self, status):
        # 420 means the client is being rate limited; returning False stops the stream
        if status == 420:
            return False
# Get the incoming NiFi flowfile
flowFile = session.get()
# Set up OAuth with the keys and tokens
auth = OAuthHandler(consumer_key, consumer_secret_key)
auth.set_access_token(access_token, access_token_secret)
api = API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
# The listener constructor requires the flowfile
listener = Listener(flowFile)
twitterStream = Stream(api.auth, listener=listener, proxies=proxies)
# try/finally to make sure the stream is disconnected after an error
try:
    twitterStream.filter(track=['nasa'])
except Exception as e:
    print("Exception: %s" % e)
finally:
    print("...end")
    twitterStream.disconnect()
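With this script, nothing happens in NiFi. However, I tested the same logic in another environment, without the NiFi flowfile and session, and it works. I suspect the problem is in the syntax used to communicate with NiFi.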
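To make the "one tweet, one flowfile" intent concrete, here is a minimal sketch of what I think the NiFi side should look like. It is only an assumption about the right pattern, not a confirmed fix: `extract_description` and `emit_tweet` are hypothetical helper names, and `session` and the success relationship are passed in as parameters so the logic can be tried outside NiFi with a stand-in object. It uses only `session.create()`, `session.putAttribute()` and `session.transfer()`.

```python
import json


def extract_description(raw_message):
    """Return the user's description from a raw stream message, or None.

    The stream also delivers messages that are not tweets (for example
    delete or limit notices), so a missing 'user' key is not an error here.
    """
    try:
        data = json.loads(raw_message)
    except ValueError:
        # Not valid JSON (e.g. a keep-alive or truncated line): skip it
        return None
    if not isinstance(data, dict):
        return None
    user = data.get("user")
    if not isinstance(user, dict):
        return None
    # May itself be None when the user has no description
    return user.get("description")


def emit_tweet(session, success_relationship, raw_message):
    """Create one new flowfile for one stream message and route it.

    Returns True when a flowfile was transferred, False when the
    message was skipped.
    """
    description = extract_description(raw_message)
    if description is None:
        return False
    # A fresh flowfile per tweet, instead of reusing a single one
    flow_file = session.create()
    # putAttribute returns the updated flowfile reference; keep it
    flow_file = session.putAttribute(flow_file, "data", description)
    session.transfer(flow_file, success_relationship)
    return True
```

Inside the listener this would be called as `emit_tweet(session, ExecuteScript.REL_SUCCESS, status)` from `on_data`. Whether a `session.commit()` is needed after each call, or whether a script that blocks on the stream can deliver flowfiles at all, is exactly the part I am unsure about.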
Edit: I cannot use the GetTwitter processor because it does not support proxies, and I have to work behind one.
Thanks for your help!