OrientDB 2.2.7 ETL for CSV not loading DateTime field?

Date: 2016-08-17 00:57:23

Tags: orientdb orientdb2.2 orientdb-etl

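I'm trying to load a simple example with the ETL loader, but I must be missing something. I have followed various threads on Stack Overflow and worked through the documentation on extractors, but my attempts keep coming up short.

Here is my data: vertices.csv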

label,data,Date
v01,0.1234,2015-01-01 02:30
v02,0.5678,2015-02-20 15:32
v03,0.9012,2015-03-30 11:00

I am setting up two JSON files to try to load it into a PLOCAL database:

vertices.json

{
    "config": {
        "log": "debug",
        "fileDirectory": "./",
        "fileName": "vertices.csv"
    }
}

commonVertices.json

{
    "begin": [ { "let": { "name": "$filePath",  "expression": "$fileDirectory.append($fileName )" } } ],
    "config": { "log": "info" },
    "source": { "file": { "path": "$filePath" } },
    "extractor": { "csv": { "ignoreEmptyLines": true,
                            "nullValue": "N/A",
                            "columnsOnFirstLine": true,
                            "dateFormat": "yyyy-mm-dd HH:MM",
                            "columns": ["label:string","weight:float","Date:datetime"]
                          }
                 },
    "transformers": [
            { "vertex": { "class": "myVertex" } },
            { "code":   { "language": "Javascript", "code": "print('    Current record: ' + record); record;" } }
        ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:test.orientdb",
            "dbType": "graph",
            "batchCommit": 1000,
            "classes": [ { "name": "myVertex", "extends": "V" } ],
            "indexes": [ { "class": "myVertex", "fields":["label:string","Date:datetime"], "type":"NOTUNIQUE" } ]
        }
    }
}

I load it with oetl.sh using the following command:

$ oetl.sh commonVertices.json vertices.json

The output, with debug information, is:

> oetl.sh commonVertices.json vertices.json
OrientDB etl v.2.2.7 (build 2.2.x@rdcab5af4dce4b538bdb4b372abba46e3fc9f19b7; 2016-08-11 15:17:33+0000) www.orientdb.com
[csv] INFO column types: {weight=FLOAT, Date=DATETIME, label=STRING}
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./vertices.csv with encoding UTF-8
Started execution with 1 worker threads
[orientdb] DEBUG orientdb: found 9 vertices in class 'null'
[orientdb] DEBUG orientdb: found metadata field 'null'
Start extracting
[csv] DEBUG document={weight:0.1234,Date:null,label:v01}
[csv] DEBUG document={weight:0.5678,Date:null,label:v02}
[1:vertex] DEBUG Transformer input: {weight:0.1234,Date:null,label:v01}
[csv] DEBUG document={weight:0.9012,Date:null,label:v03}
[1:vertex] DEBUG Transformer output: v(myVertex)[#25:3]
[1:code] DEBUG Transformer input: v(myVertex)[#25:3]
    Current record: myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[1:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[1:code] DEBUG Transformer output: myVertex#25:3{weight:0.1234,Date:null,label:v01} v1
[2:vertex] DEBUG Transformer input: {weight:0.5678,Date:null,label:v02}
[2:vertex] DEBUG Transformer output: v(myVertex)[#26:3]
[2:code] DEBUG Transformer input: v(myVertex)[#26:3]
    Current record: myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[2:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[2:code] DEBUG Transformer output: myVertex#26:3{weight:0.5678,Date:null,label:v02} v1
[3:vertex] DEBUG Transformer input: {weight:0.9012,Date:null,label:v03}
[3:vertex] DEBUG Transformer output: v(myVertex)[#27:3]
[3:code] DEBUG Transformer input: v(myVertex)[#27:3]
    Current record: myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[3:code] DEBUG executed code=OCommandExecutorScript [text=print('    Current record: ' + record); record;], result=myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[3:code] DEBUG Transformer output: myVertex#27:3{weight:0.9012,Date:null,label:v03} v1
[orientdb] INFO committing
Pipeline worker done without errors:: true
all items extracted
END ETL PROCESSOR
+ extracted 3 rows (0 rows/sec) - 3 rows -> loaded 3 vertices (0 vertices/sec) Total time: 149ms [0 warnings, 0 errors]

After loading, the Date field is not populated with any data, as this query shows:

orientdb {db=test.orientdb}> SELECT FROM myVertex

+----+-----+--------+------+----+-----+
|#   |@RID |@CLASS  |weight|Date|label|
+----+-----+--------+------+----+-----+
|0   |#25:0|myVertex|0.1234|    |v01  |
|1   |#26:0|myVertex|0.5678|    |v02  |
|2   |#27:0|myVertex|0.9012|    |v03  |
+----+-----+--------+------+----+-----+

3 item(s) found. Query executed in 0.003 sec(s).

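So far, while tinkering, I have found that if I leave the "dateFormat" and "columns" fields out of commonVertices.json, the ETL does seem to import the date, but only as a DATE: the time of day is not imported.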
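One possible reading of the date-only import, offered as a sketch rather than a confirmed diagnosis: if the extractor falls back to a date-only pattern such as "yyyy-MM-dd" and parses cells with java.text.SimpleDateFormat (both are assumptions, the thread does not confirm either), the parser stops as soon as the pattern is satisfied and silently ignores the trailing " 02:30". The class name DateOnlyPatternDemo and the helper timeOfDay are made up for this illustration.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateOnlyPatternDemo {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Parses the cell with the given pattern (in UTC) and returns the
    // time of day of the result as "HH:mm".
    // Returns null when the text does not match the pattern at all.
    static String timeOfDay(String pattern, String cell) {
        SimpleDateFormat parser = new SimpleDateFormat(pattern, Locale.ROOT);
        parser.setTimeZone(UTC);
        // Parsing stops once the pattern is satisfied, so any text left
        // over after the pattern (such as " 02:30") is ignored.
        Date parsed = parser.parse(cell, new ParsePosition(0));
        if (parsed == null) {
            return null;
        }
        SimpleDateFormat printer = new SimpleDateFormat("HH:mm", Locale.ROOT);
        printer.setTimeZone(UTC);
        return printer.format(parsed);
    }

    public static void main(String[] args) {
        String cell = "2015-01-01 02:30"; // first data row of vertices.csv

        // Date-only pattern (assumed fallback): the time of day is dropped.
        System.out.println(timeOfDay("yyyy-MM-dd", cell));
        // Date-and-time pattern: the time of day survives.
        System.out.println(timeOfDay("yyyy-MM-dd HH:mm", cell));
    }
}
```

If that assumption holds, a pattern that covers both the date and the time is needed for the hours and minutes to be kept.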

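I'm a bit stuck at this point. It looks like a bug, but I'm new to OrientDB, so I hope it is just a user error with a simple solution.

As always, any help is much appreciated!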

1 Answer:

Answer 0: (score: 1)

I tried

"extractor": { "csv": { "ignoreEmptyLines": true,
                            "nullValue": "N/A",
                            "columnsOnFirstLine": true,
                            "dateFormat": "yyyy-MM-dd hh:mm"
                          }
                 },

and it works.

Hope it helps.
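The pattern letters are case-sensitive, which may be what separates the two configurations. Assuming "dateFormat" follows java.text.SimpleDateFormat conventions (an assumption, the thread does not confirm which parser the ETL uses), lowercase "mm" means minutes and uppercase "MM" means month, while "hh" is the 12-hour clock and "HH" the 24-hour clock. The sketch below runs the three sample rows through the question's pattern, the answer's pattern, and a 24-hour variant with strict parsing. The class name PatternLetterCheck and the helper parsesStrictly are hypothetical.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class PatternLetterCheck {
    // Returns true when the whole cell parses under the pattern with
    // lenient parsing turned off.
    static boolean parsesStrictly(String pattern, String cell) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        // Reject out-of-range fields (month 30, hour 15 on a 12-hour clock)
        // instead of rolling them over into the next unit.
        format.setLenient(false);
        ParsePosition position = new ParsePosition(0);
        return format.parse(cell, position) != null
                && position.getIndex() == cell.length();
    }

    public static void main(String[] args) {
        // The three Date values from vertices.csv.
        String[] cells = {"2015-01-01 02:30", "2015-02-20 15:32", "2015-03-30 11:00"};
        String[] patterns = {
            "yyyy-mm-dd HH:MM", // pattern from the question: minutes and month swapped
            "yyyy-MM-dd hh:mm", // pattern from the answer: 12-hour clock
            "yyyy-MM-dd HH:mm"  // 24-hour variant
        };
        for (String pattern : patterns) {
            for (String cell : cells) {
                System.out.println(pattern + " | " + cell + " | " + parsesStrictly(pattern, cell));
            }
        }
    }
}
```

Under these assumptions the question's pattern rejects every row, because the minute digits land in the month field. The answer's "hh" rejects 15:32 only under strict parsing; lenient parsing, the SimpleDateFormat default, accepts it, so "HH" is the safer choice for 24-hour data like this.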