What is the correct Python syntax for NiFi's ExecuteScript?

Time: 2019-11-12 10:13:06

Tags: python apache-nifi tweepy twitter-streaming-api

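I am trying to consume streaming tweets through NiFi. After running into problems with the InvokeHTTP processor, I decided to use the ExecuteScript processor with tweepy instead.

I want to receive tweets from the stream and pass them one by one to the rest of my NiFi flow.

To do this, I wrote the following script: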

from tweepy import Stream, API
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json

import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
from org.apache.nifi.processors.script import ExecuteScript

# All API keys / access token
consumer_key = "something"
consumer_secret_key = "something"
access_token = "something"
access_token_secret = "something"

# Proxy configuration
proxies = {
    "http": "http_proxy",
    "https": "https_proxy"
}


# Listener class that contains all API functions for streaming tweets.
class Listener(StreamListener):

    def __init__(self, nifi_session):
        super(Listener, self).__init__()  # Override the constructor to keep a reference to the NiFi flowfile
        self.nifi_session = nifi_session

    def on_data(self, status):
        # Convert string to json
        data = json.loads(status)

        # extract relevant information, for example, we use user's description only
        description = data['user']['description']

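        # Pass the extracted value to NiFi as a flowfile attribute.
        # putAttribute returns an updated flowfile reference, so keep it.
        flowfile = session.putAttribute(self.nifi_session, 'data', description)
        session.transfer(flowfile, ExecuteScript.REL_SUCCESS)
        session.commit()  # Commit so the flowfile moves on to the next processor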

    def on_error(self, status):
        if status == 420:
            return False


# Get the incoming flowfile from the NiFi session
flowFile = session.get()

# Set OAuth with keys and tokens
auth = OAuthHandler(consumer_key, consumer_secret_key)
auth.set_access_token(access_token, access_token_secret)
api = API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

listener = Listener(flowFile)  # the constructor requires the NiFi flowfile
twitterStream = Stream(api.auth, listener=listener, proxies=proxies)

# Use try/finally to make sure the stream is disconnected, even after an error
try:
    twitterStream.filter(track=['nasa'])
except Exception as e:
    print("Exception: %s" % e)
finally:
    print("...end")
    twitterStream.disconnect()

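With this script, nothing happens in NiFi. However, I tested it in another environment, without the NiFi flowfile and session, and it works. I suspect the problem lies in the syntax used to communicate with NiFi.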
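To make the NiFi side of the question concrete, here is a minimal sketch of per-tweet flowfile handling that can be run outside NiFi. `FakeSession`, `FakeFlowFile`, and `emit_description` are hypothetical names introduced only for illustration. They imitate one property of the real `session` variable that ExecuteScript binds: `putAttribute` returns a new flowfile reference instead of modifying the one passed in. The sketch also assumes each tweet should get its own flowfile from `session.create()`. That is a guess at the intended behavior, not something the script above does.

```python
import json


class FakeFlowFile(object):
    """Hypothetical stand-in for a NiFi FlowFile (attributes only)."""

    def __init__(self, attributes=None):
        self.attributes = dict(attributes or {})


class FakeSession(object):
    """Hypothetical stand-in for the `session` variable bound by ExecuteScript."""

    def __init__(self):
        self.transferred = []  # (flowfile, relationship) pairs

    def create(self):
        return FakeFlowFile()

    def putAttribute(self, flowfile, key, value):
        # Like the real ProcessSession, return a NEW flowfile reference
        # rather than mutating the one that was passed in.
        updated = dict(flowfile.attributes)
        updated[key] = value
        return FakeFlowFile(updated)

    def transfer(self, flowfile, relationship):
        self.transferred.append((flowfile, relationship))


# Placeholder value; inside NiFi, ExecuteScript binds the real REL_SUCCESS.
REL_SUCCESS = "success"


def emit_description(session, raw_message):
    """Turn one raw streaming message into one flowfile; return True if emitted."""
    data = json.loads(raw_message)
    user = data.get("user")
    if not user:
        # Not every streaming message is a tweet (e.g. delete or limit notices).
        return False
    flowfile = session.create()  # assumption: one new flowfile per tweet
    # Keep the reference returned by putAttribute; the old one is stale.
    flowfile = session.putAttribute(flowfile, "data", user.get("description") or "")
    session.transfer(flowfile, REL_SUCCESS)
    return True


session = FakeSession()
emit_description(session, '{"user": {"description": "space fan"}}')
emit_description(session, '{"limit": {"track": 5}}')
```

Inside NiFi the two stand-in classes and the `REL_SUCCESS` placeholder would be dropped, and `on_data` would call `emit_description(session, status)` with the bound `session`. Whether a long-running tweepy stream behaves well inside a single ExecuteScript invocation is a separate question that this sketch does not address.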

Edit: I cannot use the GetTwitter processor because it does not support proxies, and I have to work behind one.

Thanks for your help!

0 answers:

No answers