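I am trying to use the ETL loader to import data into OrientDB. I have set up OrientDB (version 3.0.10) in distributed mode on a Kubernetes cluster. When I run the import on one of the pods with
kubectl exec -it orientdbservice-0 -- /orientdb/bin/oetl.sh /orientdb/bin/data/import_json/venue.json
it gets stuck on this message and then terminates: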
OrientDB etl v.3.0.10 - Veloce (build eac0654847df662ca03b45a6a5efa5eadd229ca5, branch 3.0.x) https://www.orientdb.com
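As a result, the database folder is created, but the data is not fully imported. Since there is no error message about the file location, I assume the source and extractor sections work correctly, but I am not sure about the loader section of the JSON configuration file. What could be causing this behavior? Is there another way I can import the data?
Here is my destination.json file: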
{
  "source": {
    "file": {
      "path": "data/dataset_csv/venues.csv"
    }
  },
  "extractor": {
    "csv": {
      "columns": ["venue_id:string", "address_1:string", "city:string", "country:string", "distance:float", "lat:float", "localized_country_name:string", "lon:float", "venue_name:string", "rating:float", "rating_count:float", "state:string", "zip:integer", "normalised_rating:float"],
      "columnsOnFirstLine": true
    }
  },
  "transformers": [
    { "vertex": { "class": "Venue" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/orientdb/databases/MeetupCluster",
      "dbAutoCreateProperties": true,
      "dbType": "graph",
      "wal": false,
      "batchCommit": 1000,
      "tx": true,
      "txUseLog": false,
      "useLightweightEdges": true,
      "classes": [
        { "name": "Venue", "extends": "V" }
      ],
      "indexes": [
        { "class": "Venue", "fields": ["venue_id:string"], "type": "UNIQUE" }
      ]
    }
  }
}
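To rule out a malformed configuration file as the cause, a quick check like the one below could be run before calling oetl.sh. This is only a minimal sketch and not part of my setup: the function names check_etl_config and column_names and the script name check_config.py are made up for illustration, and the list of expected sections is simply the four top-level keys used in the file above, not an official OrientDB requirement.

```python
import json
import sys

# Top-level sections used by the configuration above (an assumption taken
# from this file, not an official list of required ETL sections).
EXPECTED_SECTIONS = ("source", "extractor", "transformers", "loader")


def check_etl_config(text):
    """Parse an ETL config and return the expected sections that are missing.

    Raises json.JSONDecodeError (a subclass of ValueError) if the text is not
    valid JSON, for example when a comma between two keys is missing.
    """
    config = json.loads(text)
    return [name for name in EXPECTED_SECTIONS if name not in config]


def column_names(config):
    """Return the CSV column names declared in the extractor, without types."""
    columns = config["extractor"]["csv"]["columns"]
    return [column.split(":", 1)[0] for column in columns]


# Hypothetical usage: python check_config.py venue.json
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as handle:
        text = handle.read()
    print("missing sections:", check_etl_config(text))
    print("declared columns:", column_names(json.loads(text)))
```

If the file parses and no section is reported missing, the configuration syntax itself is probably not what makes the import stop; the declared column names can then be compared by hand with the header line of venues.csv.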
Note: the ETL loader works fine on a local cluster created with docker-compose. This file is also one of the smallest in the data set, so it should load quickly.
Update: I have set the loader to use a remote database URL like this:
"dbURL": "remote:localhost/MeetupCluster",
"serverUser": "root",
"serverPassword": "pwd",
"dbUser": "admin",
"dbPassword": "admin"
Now I get this error:
Exception in thread "main" com.orientechnologies.orient.core.exception.OConfigurationException: Error on creating ETL processor
Caused by: com.orientechnologies.orient.core.exception.ODatabaseException: Cannot open database 'MeetupCluster'
Suppressed: com.orientechnologies.orient.core.exception.ODatabaseException: Cannot open database 'MeetupCluster'
Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: No active nodes found to execute command: sql.select from OUser where name = ? limit 1