Import ONE_2_MANY data from a flat CSV file into two PostgreSQL tables

Asked: 2016-12-09 09:54:29

Tags: sql postgresql csv

I have the following PostgreSQL schema:

CREATE SEQUENCE user_question_rev_user_question_rev_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1;         

CREATE TABLE user_question_rev (
  user_question_rev_id integer NOT NULL DEFAULT nextval('user_question_rev_user_question_rev_id_seq'::regclass),
  email_address character varying(100),
  entry_id integer NOT NULL,
  create_date timestamp without time zone,
  CONSTRAINT user_question_rev_pkey PRIMARY KEY (user_question_rev_id)
);

CREATE SEQUENCE user_question_user_question_id_seq
  INCREMENT 1
  MINVALUE 1
  MAXVALUE 9223372036854775807
  START 1
  CACHE 1; 

CREATE TABLE user_question (
  user_question_id integer NOT NULL DEFAULT nextval('user_question_user_question_id_seq'::regclass),
  user_question_rev_id integer NOT NULL,
  question text NULL,
  answer text NULL,
  create_date timestamp without time zone,
  CONSTRAINT user_question_pkey PRIMARY KEY (user_question_id),              
  CONSTRAINT user_question_user_question_rev_id_fkey FOREIGN KEY (user_question_rev_id)
    REFERENCES user_question_rev (user_question_rev_id) MATCH SIMPLE
    ON UPDATE NO ACTION ON DELETE NO ACTION
);

I also have a CSV file with the following structure:

Email Address        |   Question 1  |  Question 2   |  Question 3   |  Entry Id  |  Entry Date
--------------------------------------------------------------------------------------------------------
user1@example.com    |   Answer 1_1  |  Answer 2_1   |  Answer 3_1   |  667       |  2016-12-02 06:15:13
user2@example.com    |   Answer 1_2  |  Answer 2_2   |  Answer 3_2   |  666       |  2016-12-02 05:15:59
user3@example.com    |   Answer 1_3  |  Answer 2_3   |  Answer 3_3   |  665       |  2016-12-01 05:20:22
user4@example.com    |   Answer 1_4  |  Answer 2_4   |  Answer 3_4   |  662       |  2016-11-29 15:16:58
user5@example.com    |   Answer 1_5  |  Answer 2_5   |  Answer 3_5   |  651       |  2016-11-28 16:14:52
user2@example.com    |   Answer 1_22 |  Answer 2_22  |  Answer 3_22  |  681       |  2016-12-03 02:11:01

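I need to load the data from this CSV file into the PostgreSQL schema above.

For each record in the CSV, I need to create a corresponding new record in the user_question_rev table, create new records in the user_question table (ONE_2_MANY), and populate them as follows: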

(CSV) Email Address -> (user_question_rev) email_address
(CSV) Entry Id      -> (user_question_rev) entry_id
(CSV) Entry Date    -> (user_question_rev) create_date

And, for the corresponding ONE_2_MANY link through user_question.user_question_rev_id:

user_question_rev.user_question_rev_id -> user_question.user_question_rev_id
(CSV) Question 1                       -> (user_question) question 
(CSV) Entry Date                       -> (user_question) create_date
(CSV) Question 2                       -> (user_question) question 
(CSV) Entry Date                       -> (user_question) create_date
(CSV) Question 3                       -> (user_question) question 
(CSV) Entry Date                       -> (user_question) create_date

Please show how to do this with PostgreSQL and SQL.

1 Answer:

Answer 0: (score: 1)

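  • Create an import table whose structure mirrors the CSV, as shown below. I recommend making every column type text so that the import itself cannot fail on a bad value. CSV files sometimes contain very strange values...

    create table my_import (
      EmailAddress text,
      Question1 text,
      Question2 text,
      Question3 text,
      EntryId text,
      EntryDate text
    );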
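  • Import the file using psql with the COPY command (if the file is on the server) or "\copy" (if the file is on your local machine); see the documentation.
  • Then select from this import table in the same way you described in your question. Just translate your rules into simple SQL statements. In a case like this, simplicity is your friend :-) And of course do not forget to cast the values to the target types.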
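
The three steps above could be sketched roughly as follows. This is only an illustration under several assumptions, not a definitive solution: the file name answers.csv and the CSV options are placeholders; Entry Id is assumed to be unique within the file and not yet present in user_question_rev, so it can be used to find the parent row again; and storing the column label in question and the cell value in answer is one interpretation of the mapping in the question.

```sql
-- Staging table: all columns are text so the load itself cannot fail on odd values.
CREATE TABLE my_import (
  EmailAddress text,
  Question1    text,
  Question2    text,
  Question3    text,
  EntryId      text,
  EntryDate    text
);

-- Run this from psql for a client-side file. The file name and options are
-- assumptions; add DELIMITER '|' if the file really is pipe-separated.
-- \copy my_import FROM 'answers.csv' WITH (FORMAT csv, HEADER true)

BEGIN;

-- One parent row per CSV row, cast to the target types.
INSERT INTO user_question_rev (email_address, entry_id, create_date)
SELECT trim(EmailAddress),
       trim(EntryId)::integer,
       trim(EntryDate)::timestamp
FROM my_import;

-- Three child rows per CSV row. The parent is looked up through entry_id,
-- which assumes entry_id identifies exactly one user_question_rev row.
-- Assumption: the column label goes to "question", the cell value to "answer".
INSERT INTO user_question (user_question_rev_id, question, answer, create_date)
SELECT r.user_question_rev_id,
       q.question,
       trim(q.answer),
       r.create_date
FROM my_import AS i
JOIN user_question_rev AS r
  ON r.entry_id = trim(i.EntryId)::integer
CROSS JOIN LATERAL (VALUES
  ('Question 1', i.Question1),
  ('Question 2', i.Question2),
  ('Question 3', i.Question3)
) AS q(question, answer);

COMMIT;
```

If a cast fails, the whole transaction rolls back and the offending rows can be inspected in my_import before retrying. Once the data has been verified, the staging table can be dropped.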