Gremlin user-defined steps in a Faunus ScriptMap

Time: 2014-07-18 17:05:32

Tags: user-defined-functions gremlin faunus

Background: I have a few months of experience using Gremlin and Faunus, including the ScriptMap step.

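Problem: User-defined Gremlin steps work fine when loaded into the shell as part of a script. However, when defined in a Faunus ScriptMap script, the same steps apparently have no effect.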

 /***********Faunus Driver*************/

//usage gremlin -e <this file> NOTE: to run in gremlin remove .submit() at end of pipe
import java.io.Console;
//get args
console = System.console()
mapperpath=console.readLine('> <map script path>: ')
refns=console.readLine('> <reference namespace>: ')
refinterestkey=console.readLine('> <interest field>: ')
//currently not in use
refinterestval=console.readLine('> <interest value>: ')
mainpropkey=console.readLine('> <main field>: ')
delim=console.readLine('> <main delimiter>: ')
args=[]
args[0]=refns
args[1]=refinterestkey
args[2]=refinterestval
args[3]=mainpropkey
args[4]=delim
args=(String[]) args.toArray()
f=FaunusFactory.open('propertyfile')
f.V().filter('{it.getProperty("_namespace")=="streamernamespace" && it.getProperty("_entity")=="selector"}').script(mapperpath, args).submit()
f.shutdown()

/***********Script Mapper*************/

Gremlin.defineStep("findMatch", [Vertex, Pipe],
    {streamer, interestindicator, fieldofinterest, fun ->
    _().has(interestindicator, true).has(fieldofinterest,
                                fun(streamer))
    }
)
Gremlin.defineStep("connectMatch", [Vertex, Pipe], {streamer ->
// copy and link streaming vertices to matching vertices in main graph
_().transform({main -> if(main != null) {
        mylog.info("reference vertex " + main.id +
                               " & streaming vertex " + streamer.id + " match on main " + main.getProperty(fieldofinterest));
        clone=g.addVertex(null);
        ElementHelper.copyProperties(streamer, clone);
        clone.setProperty("_namespace", main.getProperty("_namespace"));
        mylog.info("create clone "+clone.id+" in "+clone.getProperty("_namespace"));
        e=g.addEdge(main, clone, streamer.getProperty("source"));
        mylog.info("created edge "+ e);
        g.commit()
    }})
})

def g
def refns
def refinterestkey
def refinterestval
def mainpropkey
def delim
def normValue

def setup(args) {
    refns=args[0]
    refinterestkey=args[1]
    refinterestval=args[2]
    mainpropkey=args[3]
    delim=args[4]
    // normalize a selector vertex to "<TYPE><delim><number>"
    normValue = {obj-> seltype=obj.getProperty("type");
            seltypenorm=seltype.trim().toUpperCase();
            desc=obj.getProperty("description");
            if(desc.contains(delim)) {
                selnum=desc.split(delim)[1].trim()
            } else selnum=desc.trim();
            selnorm=seltypenorm.concat(delim).concat(selnum);
            mylog.info("streamer selector (" + seltype + ", " + desc + ") normalized as " + selnorm);
            return selnorm
    }
    mylog=java.util.logging.Logger.getLogger("script_map")
    mylog.info("configuring connection to reference graph")
    conf=new BaseConfiguration()
    conf.setProperty("storage.backend", "cassandra")
    conf.setProperty("storage.keyspace", "titan")
    conf.setProperty("storage.index.index-name", "titan")
    conf.setProperty("storage.hostname", "localhost")
    g=TitanFactory.open(conf)
    isstepsloaded = Gremlin.getStepNames().contains("findMatch") &&
    Gremlin.getStepNames().contains("connectMatch")
    mylog.info("custom steps available?: "+isstepsloaded)
}
def map(v, args) {
    try{
    incoming=g.v(v.id)
    mylog.info("current streamer id: "+incoming.id)
    if(incoming.getProperty("_entity")=="selector") {
                    mylog.info("process incoming vertex "+incoming.id)
                    g.V("_namespace", refns).findMatch(incoming, refinterestkey, mainpropkey, normValue).connectMatch(incoming).iterate()
    }
    }catch(Exception e) {
            mylog.info("map method exception raised");
            mylog.severe(e.getMessage())
    }
            g.commit()
}
def cleanup(args) { g.shutdown()}

2 Answers:

Answer 0 (score: 1)

I just tested Faunus with a user-defined step on The Graph of the Gods, and it seems to work fine. Here's what I did:

father.groovy

Gremlin.defineStep('father', [Vertex, Pipe], {_().out('father')})

def g

def setup(args) {
    conf = new org.apache.commons.configuration.BaseConfiguration()
    conf.setProperty('storage.backend', 'cassandrathrift')
    conf.setProperty('storage.hostname', '192.168.2.110')
    g = com.thinkaurelius.titan.core.TitanFactory.open(conf)
}

def map(v, args) {
    u = g.v(v.id)
    pipe = u.father().name
    if (pipe.hasNext()) u.fathersName = pipe.next()
    u.name + "'s father's name is " + u.fathersName
}

def cleanup(args) {
    g.shutdown()
}

In Faunus' Gremlin REPL:

gremlin> g.V.has('type','demigod','god').script('father.groovy')
...
==>jupiter's father's name is saturn
==>hercules's father's name is jupiter
==>neptune's father's name is null
==>pluto's father's name is null

If this doesn't solve your problem, please provide more details so that we can reproduce the error you're seeing.

Cheers, Daniel

Answer 1 (score: 0)

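The underlying problem was that I had set an outdated value for the "storage.index.index-name" property (see the Titan graph config under setup()). Disregard the discussion regarding the getOrCreate method/Blueprints: apparently, mutations of existing graphs can be achieved at scale using custom Gremlin steps defined in the script referenced by a Faunus script step, with the Faunus format NoOpOutputFormat.

Lesson learned: rather than configuring Titan graphs inline in the script, use a (centrally maintained) graph properties file for reference when configuring Titan graphs. CDH5 has simplified distributed cache management.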
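As a rough illustration of that lesson, a ScriptMap script could open the graph from a properties file instead of building a BaseConfiguration inline. This is only a sketch: the file path is hypothetical, the file is assumed to be readable on every node that runs the map tasks, and the keys shown in the comment are simply the ones from the question's setup().

```groovy
import com.thinkaurelius.titan.core.TitanFactory

// Assumed contents of the (hypothetical) shared properties file:
//   storage.backend=cassandra
//   storage.hostname=localhost
//   storage.keyspace=titan

def g

def setup(args) {
    // TitanFactory.open also accepts the path of a configuration file;
    // the path below is a placeholder, not a real location
    g = TitanFactory.open('/path/to/titan-graph.properties')
}

def map(v, args) {
    // look up the corresponding vertex in the Titan graph
    u = g.v(v.id)
    u.name
}

def cleanup(args) {
    g.shutdown()
}
```

With this layout, a stale setting such as the index name would be corrected once in the shared file rather than in each script.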