我正在使用python脚本抓取一个页面(例如,facebook页面),并希望传递每个帖子以写入文件(类似于gettwitter进程)。我正在使用Apache Nifi ExecuteScript处理器来调用python脚本。
但是,我得到了SyntaxError: no viable alternative at input
我可以在处理器之间成功传输数据但是当我尝试添加报废代码时,我不断收到错误。我使用的是python版本2.7.8。据我所知,executeScript在内部使用jython,jython可以转换python代码。
如果我们删除与nifi相关的代码(内部类和流文件),下面是在nifi外部使用python的代码。
import urllib2
import json
import datetime
import csv
import time
import sys
import traceback
from org.apache.nifi.processor.io import OutputStreamCallback
from org.python.core.util import StringUtil
class WriteContentCallback(OutputStreamCallback):
def __init__(self, content):
self.content_text = content
def process(self, outputStream):
try:
outputStream.write(StringUtil.toBytes(self.content_text))
except:
traceback.print_exc(file=sys.stdout)
raise
#app_id = "<FILL IN>"
#app_secret = "<FILL IN>" # DO NOT SHARE WITH ANYONE!
page_id = "dsssssss"
#page_id = raw_input("Please Paste Public Page Name:")
#access_token = app_id + "|" + app_secret
access_token = "sdfsdfsf%sdfsdf"
#access_token = raw_input("Please Paste Your Access Token:")
def scrapeFacebookPageFeedStatus(page_id, access_token):
flowFile = session.create()
flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")
#flowFile = session.write()
#session.transfer(flowFile, REL_SUCCESS)
has_next_page = False
num_processed = 0 # keep a count on how many we've processed
scrape_starttime = datetime.datetime.now()
while has_next_page:
print "Scraping %s Facebook Page: %s\n" % (page_id, scrape_starttime)
has_next_page = False
print "\nDone!\n%s Statuses Processed in %s" % \
(num_processed, datetime.datetime.now() - scrape_starttime)
if __name__ == '__main__':
#scrapeFacebookPageFeedStatus(page_id, access_token)
flowFile = session.create()
flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")
session.transfer(flowFile, REL_SUCCESS)
以下是错误的完整描述
> 2017-04-02 16:19:22,817 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=a62f4b97-8fd7-15cd-95b9-505e1b960805]
> ExecuteScript[id=a62f4b97-8fd7-15cd-95b9-505e1b960805] failed to
> process due to org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: SyntaxError: no viable alternative at
> input 'has_next_page' in <script> at line number 178 at column number
> 8; rolling back session:
> org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: SyntaxError: no viable alternative at
> input 'has_next_page' in <script> at line number 178 at column number
> 8 2017-04-02 16:19:22,819 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> org.apache.nifi.processor.exception.ProcessException:
> javax.script.ScriptException: SyntaxError: no viable alternative at
> input 'has_next_page' in <script> at line number 178 at column number
> 8
> at org.apache.nifi.processors.script.ExecuteScript.onTrigger(ExecuteScript.java:214)
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099)
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136)
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
> [nifi-framework-core-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> [na:1.8.0_77]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> [na:1.8.0_77]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> [na:1.8.0_77]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_77]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] Caused by: javax.script.ScriptException: SyntaxError: no viable alternative
> at input 'has_next_page' in <script> at line number 178 at column
> number 8
> at org.python.jsr223.PyScriptEngine.scriptException(PyScriptEngine.java:190)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:75)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.jsr223.PyScriptEngine.eval(PyScriptEngine.java:31)
> ~[jython-standalone-2.7.0.jar:na]
> at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
> ~[na:1.8.0_77]
> at org.apache.nifi.processors.script.impl.JythonScriptEngineConfigurator.eval(JythonScriptEngineConfigurator.java:59)
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> at org.apache.nifi.processors.script.ExecuteScript.onTrigger(ExecuteScript.java:204)
> ~[nifi-scripting-processors-1.1.0.2.1.1.0-2.jar:1.1.0.2.1.1.0-2]
> ... 11 common frames omitted Caused by: org.python.core.PySyntaxError: null
> at org.python.core.ParserFacade.fixParseError(ParserFacade.java:95)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.core.ParserFacade.parseExpressionOrModule(ParserFacade.java:136)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:320)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:316)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.util.PythonInterpreter.compile(PythonInterpreter.java:308)
> ~[jython-standalone-2.7.0.jar:na]
> at org.python.jsr223.PyScriptEngine.compileScript(PyScriptEngine.java:70)
> ~[jython-standalone-2.7.0.jar:na]
> ... 15 common frames omitted
感谢您的指导。感谢。
答案 0 :(得分:1)
从源代码中可以看出,您在以下行中缺少右括号:
flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data")
如果在此之后仍然遇到错误,则其中一个导入的模块可能包含(或依赖于)本机(CPython)模块而不是纯Python。前者在Jython中不受支持,导入的模块必须是纯Python代码。有关详细信息,请参阅this related SO post。