Importing a large CSV file into OrientDB

Date: 2015-10-28 12:03:37

Tags: csv import orientdb

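I am new to OrientDB and I am building a flight-search graph database with it. I have millions of rows of real flight data. I created a JSON configuration file to import the CSV file, but importing all of the rows takes hours: it only imports about 500 rows per second.

I am using the ETL tool to import the CSV file.

Here is my JSON file: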

{
"source": {
    "file": {
        "path": "C:/Users/sams/Desktop/OrientDB2/flights.csv"
    }
},
"extractor": {
    "csv": {}
},
"transformers": [
    {
        "vertex": {
            "class": "Flight"
        }
    },
    {
        "edge":  
            {
                "class": "Has_Flight",
                "joinFieldName": "depart_airport_id",
                "lookup": "Airport.airport_id",
                "direction": "in"
            } 
    },
    {
        "edge":
        {
                "class": "Flying_To",
                "joinFieldName": "arrive_airport_id",
                "lookup": "Airport.airport_id",
                "direction": "out"
        }
    }

],
"loader": {
    "orientdb": {
        "dbURL": "plocal:C:/Users/sams/Desktop/OrientDB2/database/dataflight",
        "dbType": "graph",
        "dbAutoCreate": true,
        "classes": [
            {
                "name": "Airport",
                "extends": "V"
            },
            {
                "name": "Flight",
                "extends": "V"
            },
            {
                "name": "Has_Flight",
                "extends": "E"
            },
            {
                "name": "Flying_To",
                "extends": "E"
            }
        ],
        "indexes": [
            {
                "class": "Airport",
                "fields": [
                    "airport_id:integer"
                ],
                "type": "UNIQUE"
            }
        ]
    }
}
}

So my question is: is there any other mechanism for importing large datasets into OrientDB?

Thanks in advance!

1 Answer:

Answer 0: (score: 4)

You can try disabling the WAL, enabling txUseLog, and using batch commits.

Try these settings in the "orientdb" loader section:

"wal": false,
"batchCommit": 1000,
"txUseLog": true
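
For context, these keys go inside the "orientdb" object of the "loader" section. The fragment below is only a sketch, assuming the rest of the configuration from the question stays unchanged; the "classes" and "indexes" entries are left out here for brevity and should be kept as they are. As I read the loader documentation, "batchCommit" and "txUseLog" relate to transactional loading, so they may only take effect when "tx" is also set to true; please check that against the documentation for your version.

```json
{
  "loader": {
    "orientdb": {
      "dbURL": "plocal:C:/Users/sams/Desktop/OrientDB2/database/dataflight",
      "dbType": "graph",
      "dbAutoCreate": true,
      "wal": false,
      "batchCommit": 1000,
      "txUseLog": true
    }
  }
}
```

The value 1000 for "batchCommit" is just a starting point from this answer, not a measured optimum. It is worth trying a few values and comparing the rows-per-second figure.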

Documentation for the OrientDB loader: http://orientdb.com/docs/2.1/Loader.html#orientdb

If you find a combination that improves performance, please let me know.