使用来自不相关表的数据填充交集表

时间:2017-04-28 16:33:33

标签: python postgresql psycopg2

我正在从网上抓取的json数据构建一个关系数据库(多对多)。我有两个主数据库,一个有909行,另一个有13行。我需要将大表的id与交集表中小表的id匹配,对应于它们在json文件中的链接方式。我发现的问题是这些表没有任何关联,但我需要填充交集表。没有任何我尝试或想到的东西会填充它,除了手工完成它需要几天。下面是json文件的一小部分示例。大表有课程信息,小表使用履行密钥。

[
{"number": "CHIN 242", "subject": "Chinese", "title": "Chinese Cinema and Chinese Modernity", "description": "From the fall of the Clestial Empire to the rise of China's economy today, Chinese cinema has witnessed many social changes in the modern era. This course will focus on the interaction between Chinese cinema and the process of modernization.", "fulfills": ["Human Expression\u2014Primary Texts", "Intercultural"]},

{"number": "CLAS 240", "subject": "Classics", "title": "Classical Mythology", "description": "A survey of the major myths and legends of ancient Greece and Rome.", "fulfills": ["Human Expression\u2014Primary Texts", "Quantitative"]},

{"number": "CLAS 250", "subject": "Classics", "title": "The World of Ancient Greece", "description": "A historical survey of ancient Greek culture from the Trojan War to the rise of Rome.", "fulfills": ["Human Expression\u2014Primary Texts", "Religion"]},

{"number": "CLAS 255", "subject": "Classics", "title": "Ancient Roman Culture", "description": "This course explores various cultural institutions and practices of the ancient Romans.", "fulfills": ["Human Expression\u2014Primary Texts", "Human Behavior"]},

{"number": "CLAS 265", "subject": "Classics", "title": "Greece and Rome on Film", "description": "This course explores the ways in which various events and episodes from Greek and Roman myth and history have been adapted for modern film and television.", "fulfills": ["Human Expression\u2014Primary Texts"]},

{"number": "CLAS 270", "subject": "Classics", "title": "Archaeology of Ancient Greece", "description": "An in-depth study of the archaeology of ancient Greece, with a focus on the high points of Greek civilization and material culture.", "fulfills": ["Historical", "Human Expression\u2014Primary Texts"]},

{"number": "CLAS 275", "subject": "Classics", "title": "Archaeology of Ancient Rome", "description": "This course explores the archaeology of ancient Rome from its early beginnings to its rapid growth into one of the world's largest empires.", "fulfills": ["Historical", "Human Expression\u2014Primary Texts"]},

{"number": "CLAS 300", "subject": "Classics", "title": "Classics and Culture", "description": "Using texts in translation, this course explores select aspects or themes from the cultures of ancient Greece and Rome.", "fulfills": ["Human Expression\u2014Primary Texts"]},
]

1 个答案:

答案 0 :(得分:0)

我找到了解决方法。要填充交集表,请对id运行查询并在小表上执行,id和简单的数字,然后为每个查询运行myDict = dict(map(reversed,cur.fetchall()))。从那里,使用for循环迭代json数据,然后使用嵌套for循环来迭代履行中的列表项cur.execute(insert into ...)。它看起来像这样:

cur.execute('select id, fulfills from reqs;')
dict1 = dict(map(reversed,cur.fetchall()))
cur.execute('select id, number from course;')
dict2 = dict(map(reversed,cur.fetchall()))
for x in geneds:
    for y in x:
        cur.execute('insert into table(course, req) values(%s, %s);', (dict2[x['number']], dict1[y]))