Creating edges and vertices from the same CSV file in DataStax Graph

Time: 2016-09-15 18:21:30

Tags: datastax datastax-enterprise datastax-startup datastax-enterprise-graph

I am loading CSV files into DataStax Graph with the data loader.

My CSV files are structured as follows:

First file (Year_2015.txt)

YearID

Second file (BaseVehicle_2005.txt)

BaseVehicleID | YearID | MakeID | ModelID

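For the first file I create vertices with the label Year and the key YearID. For the second file I create vertices with the label BaseVehicle and the key BaseVehicleID, ignoring YearID, MakeID, and ModelID. Now I want to create an edge with the label Year and the property YearID from the second vertex (BaseVehicle) to the first (Year), but nothing I have tried works. What do I need to change?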

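1 Answer:

Answer 0: (score: 0)

The documentation contains an example: https://docs.datastax.com/en/latest-dse/datastax_enterprise/graph/dgl/dglCSV.html

Below is a sample loader script that processes JSON data, but it shows how to load both vertices and edges from the same records, using mappers to decide which fields each element uses.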

config load_threads: 8
config batch_size: 1000
config load_new: true
config driver_retry_attempts: 10
config driver_retry_delay: 1000

/** SAMPLE INPUT
  {"actor_name":"'Bear'Boyd, Steven","title_name":"The Replacements","year":"2000","role":"Defensive Tackle - Washington Sentinels","episode":"The Replacements"}
 */
input = File.json(filename)

//Defines the mapping from input record to graph elements
actorMapper = {
    key "actor_name"           // the unique id for an actor
    label "actor"
    ignore "title_name"
    ignore "role"
    ignore "year"
    ignore "episode"
}

titleMapper = {
    key "title_name"           // the unique id for a title
    label "title"
    ignore "sex"
    ignore "actor_name"
    ignore "role"
    ignore "episode"
}

castMapper = {
    label "cast"
    outV "actor_name", {
        label "actor"
        key "actor_name"
    }
    inV "title_name", {
        label "title"
        key "title_name"
    }
    ignore "year"
    ignore "sex"
    // remaining should be edge properties
    // pickup role as property
    // pickup episode as property
}

//Load vertex and edge records from the same input according to the mappings
load(input).asVertices(actorMapper)
load(input).asVertices(titleMapper)
load(input).asEdges(castMapper)
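
Applied to the two files in the question, the same pattern might look like the sketch below. It is untested and rests on several assumptions: that the files are pipe-delimited with a header row, that they sit in the directory the loader is run from, and that the edge should point from BaseVehicle to Year. The edge label hasYear is a placeholder, so substitute the label you want. In this sketch the YearID column is consumed by inV to look up the Year vertex, in the same way actor_name and title_name are consumed in castMapper above.

```groovy
// Assumption: both files are pipe-delimited and start with a header line.
yearInput = File.csv("Year_2015.txt").delimiter('|')
baseVehicleInput = File.csv("BaseVehicle_2005.txt").delimiter('|')

// Year vertices, keyed on YearID
load(yearInput).asVertices {
    label "Year"
    key "YearID"
}

// BaseVehicle vertices, keyed on BaseVehicleID; the other columns are not stored
load(baseVehicleInput).asVertices {
    label "BaseVehicle"
    key "BaseVehicleID"
    ignore "YearID"
    ignore "MakeID"
    ignore "ModelID"
}

// Edges read from the same BaseVehicle file: BaseVehicle -> Year
load(baseVehicleInput).asEdges {
    label "hasYear"                 // placeholder edge label
    outV "BaseVehicleID", {
        label "BaseVehicle"
        key "BaseVehicleID"
    }
    inV "YearID", {
        label "Year"
        key "YearID"
    }
    ignore "MakeID"
    ignore "ModelID"
}
```

The point is that the second file is loaded twice: once with a vertex mapping that ignores YearID, and once with an edge mapping that uses YearID to find the matching Year vertex.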