I am trying to load a simple JSON data file into OrientDB using the oetl.sh utility.
This is my input data file (/tmp/databases/test_data1/database.json).
[
  {
    "id": 1,
    "name": "xyz"
  },
  {
    "id": 2,
    "name": "pqr"
  },
  {
    "id": 3,
    "name": "abc"
  }
]
This is my configuration JSON file (/tmp/json_import_config.json).
{
  "config": {
    "log": "debug"
  },
  "source": {
    "file": { "path": "/tmp/databases/test_data1/database.json" }
  },
  "extractor": {
    "json": {}
  },
  "transformers": [
    {
      "log": {}
    }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/opt/orientdb/databases/example3",
      "dbUser": "admin",
      "dbPassword": "admin",
      "dbAutoDropIfExists": true,
      "dbAutoCreate": true,
      "standardElementConstraints": false,
      "tx": false,
      "wal": false,
      "batchCommit": 1000,
      "dbType": "document",
      "classes": [{"name": "Account"}]
    }
  },
  "end": []
}
This is the command I am using.
./oetl.sh /tmp/json_import_config.json
This is the output:
OrientDB etl v.2.2.20 (build 76ab59e72943d0ba196188ed100c882be4315139) https://www.orientdb.com
[file] INFO Load from file /tmp/databases/test_data1/database.json
[orientdb] INFO Dropping existent database 'plocal:/opt/orientdb/databases/example3'...
BEGIN ETL PROCESSOR
[file] INFO Reading from file /tmp/databases/test_data1/database.json with encoding UTF-8
Started execution with 1 worker threads
+ extracted 0 entries (0 entries/sec) - 0 entries -> loaded 0 documents (0 documents/sec) Total time: 1000ms [0 warnings, 0 errors]
[orientdb] DEBUG - OrientDBLoader: created class 'Account'
[orientdb] DEBUG orientdb: found 0 documents in class 'null'
Start extracting
[0:log] DEBUG Transformer input: {id:1,name:xyz}
Extraction completed
[0:log] INFO {id:1,name:xyz}
[0:log] DEBUG Transformer output: {id:1,name:xyz}
Pipeline execution halted
2018-12-06 13:47:41:386 SEVER {db=example3} ETL process halted: com.orientechnologies.orient.etl.OETLProcessHaltedException: Cannot insert new document {id:1,name:xyz} because it has not class [OETLProcessor$OETLPipelineWorker]
[orientdb] INFO committing
Pipeline worker done without errors: false
END ETL PROCESSOR
+ extracted 3 entries (15 entries/sec) - 3 entries -> loaded 0 documents (0 documents/sec) Total time: 1190ms [0 warnings, 1 errors]
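I need help resolving this problem. I would also like to know whether OrientDB is a good choice when used purely as a document store, since I could not find many use cases for that; most of the use cases I found are about graphs.
Answer 0 (score: 1)
Your configuration is almost correct. You need to assign a class to every document the pipeline processes. As the log shows, the "classes" entry in the loader creates the Account class ("created class 'Account'"), but the incoming records are still rejected because they have no class. Add a field transformer that sets the class name: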
"transformers": [
  {
    "log": {}
  },
  {
    "field": {
      "fieldName": "@class",
      "value": "Account"
    }
  }
],
I tested this locally; this is the console output:
orientdb {db=docDb}> select from Account
+----+-----+-------+----+----+
|# |@RID |@CLASS |id |name|
+----+-----+-------+----+----+
|0 |#25:0|Account|1 |xyz |
|1 |#26:0|Account|2 |pqr |
|2 |#27:0|Account|3 |abc |
+----+-----+-------+----+----+
3 item(s) found. Query executed in 0.006 sec(s).
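If the ETL config is generated or maintained by a script, the transformer change described above could also be applied programmatically. The following is only a minimal sketch in Python, not part of OrientDB: the helper name `add_class_transformer` is hypothetical, and it assumes the config has the same shape as the one in the question (a top-level "transformers" list of single-key objects).

```python
import json


def add_class_transformer(config: dict, class_name: str) -> dict:
    """Return a copy of an ETL config whose pipeline sets @class on each record.

    Hypothetical helper for illustration; it only edits the JSON structure
    and does not talk to OrientDB.
    """
    # Deep copy via a JSON round-trip so the caller's dict is left untouched.
    patched = json.loads(json.dumps(config))
    transformers = patched.setdefault("transformers", [])

    # Do nothing if a field transformer already targets @class.
    for transformer in transformers:
        if transformer.get("field", {}).get("fieldName") == "@class":
            return patched

    # Same fragment as in the answer above, appended after existing transformers.
    transformers.append({"field": {"fieldName": "@class", "value": class_name}})
    return patched


if __name__ == "__main__":
    # Reduced example config; a real one would also carry source, extractor and loader.
    example = {"transformers": [{"log": {}}]}
    print(json.dumps(add_class_transformer(example, "Account"), indent=2))
```

The patched dictionary can then be written back with `json.dump` and passed to oetl.sh as before. Appending the transformer after the existing ones mirrors the order used in the answer, where the log transformer comes first.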