Loading CSV data from AWS S3 with the DSE Graph Loader

Time: 2016-10-28 06:16:51

Tags: datastax datastax-enterprise datastax-startup datastax-enterprise-graph

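I have data stored on AWS S3 (in CSV format), and I want to load it into DSE Graph using the Graph Loader. I searched but could not find anything on this topic. Is this possible with the DSE Graph Loader?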

1 answer:

Answer 0: (score: 0)

Here is the documentation on mapping CSV input for the Graph Loader:

https://docs.datastax.com/en/latest-dse/datastax_enterprise/graph/dgl/dglCSV.html

Here is an HDFS example (also using CSV files). S3 should be similar (just swap the dfs_uri):

// Configures the data loader to create the schema
config create_schema: true, load_new: true, preparation: true
// Define the data input sources
// dfs_uri specifies the URI to the HDFS directory in which the files are stored.
dfs_uri = 'hdfs://host:port/path/'
authorInput = File.csv(dfs_uri + 'author.csv.gz').gzip().delimiter('|')
// Specifies which data source to load and which mapper to use (defined inline)
load(authorInput).asVertices {
    label "author"
    key "name"
}
// graphloader call
./graphloader myMap.groovy -graph testHDFS -address localhost
// start gremlin console and check the data
bin/dse gremlin-console
:remote config reset g testHDFS.g
schema.config().option('graph.schema_mode').set('Development')
g.V().hasLabel('author')
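
For S3, the mapping script might look like the sketch below. It is the HDFS example above with only dfs_uri changed, as suggested; it has not been tested against S3. The bucket name and path are placeholders, and the exact URI scheme and the way AWS credentials are supplied are assumptions that depend on your DSE Graph Loader version, so check the documentation for your release.

```groovy
// Sketch only: same mapping as the HDFS example, with the URI swapped for S3.
config create_schema: true, load_new: true, preparation: true

// ASSUMPTION: 'my-bucket' and 'path/' are hypothetical placeholders; the URI
// scheme and credential handling depend on your DSE Graph Loader version.
dfs_uri = 's3://my-bucket/path/'
authorInput = File.csv(dfs_uri + 'author.csv.gz').gzip().delimiter('|')

// Load each CSV row as an "author" vertex keyed on the "name" column
load(authorInput).asVertices {
    label "author"
    key "name"
}
```

The graphloader call and the Gremlin console check stay the same as in the HDFS example; only the graph name passed to -graph would change if you use a different graph.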