I am trying to import the KDD-CUP-99 dataset (found here: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html) into MongoDB. On one machine I did this with the following command:
mongoimport --db dbName --collection colName --type csv --file kddcup.data.corrected --fieldFile kddcup99header
When I inspect the result with findOne(), everything looks fine; the output is as follows:
> db.colName.findOne()
{
"_id" : ObjectId("547c33e376945996ed878f81"),
"duration" : 0,
"protocol_type" : "tcp",
"service" : "http",
"flag" : "SF",
"src_bytes" : 215,
"dst_bytes" : 45076,
"land" : 0,
"wrong_fragment" : 0,
"urgent" : 0,
"hot" : 0,
"num_failed_logins" : 0,
"logged_in" : 1,
"num_compromised" : 0,
"root_shell" : 0,
"su_attempted" : 0,
"num_root" : 0,
"num_file_creations" : 0,
"num_shells" : 0,
"num_access_files" : 0,
"num_outbound_cmds" : 0,
"is_host_login" : 0,
"is_guest_login" : 0,
"count" : 1,
"srv_count" : 1,
"serror_rate" : 0,
"srv_serror_rate" : 0,
"rerror_rate" : 0,
"srv_rerror_rate" : 0,
"same_srv_rate" : 1,
"diff_srv_rate" : 0,
"srv_diff_host_rate" : 0,
"dst_host_count" : 0,
"dst_host_srv_count" : 0,
"dst_host_same_srv_rate" : 0,
"dst_host_diff_srv_rate" : 0,
"dst_host_same_src_port_rate" : 0,
"dst_host_srv_diff_host_rate" : 0,
"dst_host_serror_rate" : 0,
"dst_host_srv_serror_rate" : 0,
"dst_host_rerror_rate" : 0,
"dst_host_srv_rerror_rate" : 0,
"unknown" : "normal."
}
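Now I am running the same import on another machine, using the same files and the same command, but something is not working correctly. The result of the import looks like this: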
> db.colName.findOne()
{
"_id" : ObjectId("547d8f94facff0761ae10688"),
" : 0, "duration
" : "tcp",rotocol_type
" : "http",rvice
" : "SF",flag
" : 215,"src_bytes
" : 45076,st_bytes
" : 0, "land
" : 0, "wrong_fragment
" : 0, "urgent
" : 0, "hot
" : 0, "num_failed_logins
" : 1, "logged_in
" : 0, "num_compromised
" : 0, "root_shell
" : 0, "su_attempted
" : 0, "num_root
" : 0, "num_file_creations
" : 0, "num_shells
" : 0, "num_access_files
" : 0, "num_outbound_cmds
" : 0, "is_host_login
" : 0, "is_guest_login
" : 1, "count
" : 1, "srv_count
" : 0, "serror_rate
" : 0, "srv_serror_rate
" : 0, "rerror_rate
" : 0, "srv_rerror_rate
" : 1, "same_srv_rate
" : 0, "diff_srv_rate
" : 0, "srv_diff_host_rate
" : 0, "dst_host_count
" : 0, "dst_host_srv_count
" : 0, "dst_host_same_srv_rate
" : 0, "dst_host_diff_srv_rate
" : 0, "dst_host_same_src_port_rate
" : 0, "dst_host_srv_diff_host_rate
" : 0, "dst_host_serror_rate
" : 0, "dst_host_srv_serror_rate
" : 0, "dst_host_rerror_rate
" : 0, "dst_host_srv_rerror_rate
"unknown" : "normal."
}
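Since I am using the same data file and the same command, I assume it must be something in the environment. The system locales are identical, but the import still does not work correctly. Has anyone seen this behavior?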
Edit: I should add that both machines are running the same version of MongoDB: 2.6.5
Answer 0: (score: 1)
I suggest you verify that the files on the two machines are really identical:
md5sum kddcup.data.corrected kddcup99header
And verify the version of the mongoimport tool:
mongoimport --version
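The file check suggested above can also be scripted, which makes it easy to run the same test on both machines. The sketch below is only an illustration: the helper names `md5_of` and `count_crlf_lines` are hypothetical, and the line-ending count rests on an assumption that the thread does not confirm, namely that the header file on the failing machine may have picked up Windows-style line endings. Field names ending in a carriage return would be one possible explanation for the overwritten-looking shell output in the question.

```python
import hashlib


def md5_of(path):
    """Return the hex MD5 digest of a file, read in binary mode."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so the large data file is never fully in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_crlf_lines(path):
    """Count lines terminated by carriage return + line feed."""
    with open(path, "rb") as handle:
        # Iterating a binary file yields lines split on b"\n".
        return sum(1 for line in handle if line.endswith(b"\r\n"))


# Example use (assumes both files are in the current directory):
#   for name in ("kddcup.data.corrected", "kddcup99header"):
#       print(name, md5_of(name), count_crlf_lines(name))
```

If the digests differ between the machines, or the count is non-zero on only one of them, the files are not byte-for-byte identical even though they look the same in an editor.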
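Answer 1: (score: 0)
In the end, I took the long way round, based on @helmy's answer. I exported the collection from the working Mongo instance and imported it into the non-working one.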
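The export-and-reimport workaround might look roughly like the following. This is a sketch under assumptions: the answer does not say which commands were actually used, so the choice of `mongoexport` with its default JSON output, the helper name `build_transfer_commands`, and the dump file name are all illustrative. Connection options such as `--host` are omitted, and copying the dump file between machines is left out.

```python
import subprocess


def build_transfer_commands(db, collection, dump_path):
    """Return (export_cmd, import_cmd) argument lists for one collection."""
    # mongoexport writes JSON by default, so field names come from the
    # stored documents instead of being re-read from a header file.
    export_cmd = ["mongoexport", "--db", db, "--collection", collection,
                  "--out", dump_path]
    # mongoimport reads JSON by default, so no --type or --fieldFile is needed.
    import_cmd = ["mongoimport", "--db", db, "--collection", collection,
                  "--file", dump_path]
    return export_cmd, import_cmd


def run(cmd):
    """Run one command; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(cmd)


# Example use (hypothetical dump file name):
#   export_cmd, import_cmd = build_transfer_commands("dbName", "colName", "colName.json")
#   run(export_cmd)   # on the machine where the import worked
#   run(import_cmd)   # on the other machine, after copying colName.json over
```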