Loading the Adult dataset from a file into a database

Time: 2019-05-20 10:45:15

Tags: sql postgresql dataset plpgsql

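I have created some basic aggregates and I want to test them. I want to use the Adult dataset from https://archive.ics.uci.edu/ml/datasets/adult. I have created a table that can hold the data, but I cannot load the file into it (the file is adult.test). Is there a way to do this?

I opened the file in Notepad++ and put quotes around the string values, but there are 39K rows. I cannot type INSERT INTO 39K times.

Any help?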

1 Answer:

Answer 0 (score: 2)

In Notepad++, where you already have the file open, use a regex replace to build the statement. Keep in mind that running a single INSERT is faster than creating a new INSERT statement for each of the 32k+ rows.

Using the first 5 rows of adult.data as an example:

39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
28, Private, 338409, Bachelors, 13, Married-civ-spouse, Prof-specialty, Wife, Black, Female, 0, 0, 40, Cuba, <=50K

Replace -> Search Mode: Regular expression

Find what:

^([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+), ([^,]+)$

Replace with:

\($1, '$2', $3, '$4', $5, '$6', '$7', '$8', '$9', '$10', $11, $12, $13, '$14', '$15'\),

Replace All

This produces:

(39, 'State-gov', 77516, 'Bachelors', 13, 'Never-married', 'Adm-clerical', 'Not-in-family', 'White', 'Male', 2174, 0, 40, 'United-States', '<=50K'),
(50, 'Self-emp-not-inc', 83311, 'Bachelors', 13, 'Married-civ-spouse', 'Exec-managerial', 'Husband', 'White', 'Male', 0, 0, 13, 'United-States', '<=50K'),
(38, 'Private', 215646, 'HS-grad', 9, 'Divorced', 'Handlers-cleaners', 'Not-in-family', 'White', 'Male', 0, 0, 40, 'United-States', '<=50K'),
(53, 'Private', 234721, '11th', 7, 'Married-civ-spouse', 'Handlers-cleaners', 'Husband', 'Black', 'Male', 0, 0, 40, 'United-States', '<=50K'),
(28, 'Private', 338409, 'Bachelors', 13, 'Married-civ-spouse', 'Prof-specialty', 'Wife', 'Black', 'Female', 0, 0, 40, 'Cuba', '<=50K'),

Now you just need to put an INSERT INTO ... VALUES line for your table at the top of the file and remove the trailing comma at the very end of the file, and you're sorted.
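If editing 32k+ rows in an editor feels fragile, the same regex substitution can also be scripted. The following is only a minimal sketch in Python, not part of the original answer: the table name `adult` and the helper names `to_tuple` and `build_insert` are hypothetical, and it assumes every data row has exactly 15 fields separated by a comma and a space. Rows that do not match (blank lines, for instance) are skipped.

```python
import re

# Same search pattern as the Notepad++ replace: 15 fields separated by ", ".
ROW = re.compile(r"^" + ", ".join([r"([^,]+)"] * 15) + r"$")

# Fields left unquoted by the replacement pattern (1-based positions).
NUMERIC = {1, 3, 5, 11, 12, 13}


def to_tuple(line):
    """Turn one data row into a SQL VALUES tuple, or None if it does not match."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    fields = [
        # Numeric fields pass through; text fields are single-quoted,
        # with any embedded single quote doubled (an extra safety step).
        value if index in NUMERIC else "'" + value.replace("'", "''") + "'"
        for index, value in enumerate(match.groups(), start=1)
    ]
    return "(" + ", ".join(fields) + ")"


def build_insert(lines, table="adult"):
    """Build one multi-row INSERT; `table` is a hypothetical table name."""
    tuples = [t for t in map(to_tuple, lines) if t is not None]
    return "INSERT INTO " + table + " VALUES\n" + ",\n".join(tuples) + ";"


sample = [
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K",
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K",
    "",  # blank line, skipped
]
print(build_insert(sample))
```

To process the whole file, pass an open file object instead of `sample` (for example `build_insert(open("adult.data"))`) and write the result to a `.sql` file that can be run against the database.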