Python & Sqlite3 - Uploading a text file to a SQL database keeps crashing

Time: 2017-07-04 19:09:39

Tags: python sqlite

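I have a text file containing roughly 662,000 lines. I want to use sqlite3 to move every line of this text file into my database. Each line has two components, a key and a company name. The company name goes in one column and the key in another.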

The code is as follows:

def input_txt_to_db():
    with open(our_txt_file) as f:
        for line in f:
            # Format each line 
            curr_comp_name = str(line.rsplit(':')[0])
            curr_comp_key = str(line.rsplit(':')[-2])
            # Create object of the line
            curr_comp = Company(curr_comp_name, curr_comp_key)
            # Insert company is a self-made method, listed below
            insert_company(curr_comp)

def insert_company(comp):
    """

    :param comp: Company (object)
    :return: None
    """
    with conn:
        conn_cursor.execute("INSERT INTO companies VALUES "
                            "(:name, :key)",
                            {'name': comp.name,
                             'key': comp.key
                             })

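Everything works at first: I checked the database and the rows are uploaded correctly. However, once it reaches about 60k rows, it crashes. It gives me some error, such as an OS error or something similar. Note also that I have enough disk space to store this database.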

1 answer:

Answer 0: (score: 1)

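This does not look like the most efficient way to upload the data: every row gets its own transaction. How about using executemany to upload the data in batches?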

def insert_companies(comps):
    with conn:
        conn_cursor.executemany("INSERT INTO companies VALUES (?, ?)", comps)

We have to redefine the main function slightly. Let's get rid of the Company objects; we don't need them now, right?

def input_txt_to_db():
    with open(our_txt_file) as f:
        batch = list()
        # How many companies do we dump to db at once?
        batch_size = 2000
        for line in f:
            # Format each line 
            curr_comp_name = str(line.rsplit(':')[0])  # why str? it should be string as it is
            curr_comp_key = str(line.rsplit(':')[-2])
            # Collect the row as a plain (name, key) tuple instead of a Company object
            batch.append((curr_comp_name, curr_comp_key))
            # Flush the batch to the database once it is full
            if len(batch) == batch_size:
                insert_companies(batch)
                batch = list()
        # something may be still pending
        if batch:
            insert_companies(batch)
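
For anyone who wants to try the batching idea in isolation, here is a minimal self-contained sketch. It makes several assumptions that are not in the question: an in-memory database, a two-column `companies (name, key)` table, lines of the form `name:key`, and the hypothetical helpers `parse_line` and `load_lines`. Adjust the parsing to your real file layout.

```python
import sqlite3
from itertools import islice


def parse_line(line):
    # Assumption: each line looks like "name:key"; rpartition splits on the LAST colon,
    # so colons inside the company name are kept in the name.
    name, _, key = line.rstrip("\n").rpartition(":")
    return name, key


def load_lines(conn, lines, batch_size=2000):
    """Insert parsed lines in batches; return the number of rows inserted."""
    # Lazily parse non-empty lines so the whole file is never held in memory
    rows = (parse_line(line) for line in lines if line.strip())
    total = 0
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        # One transaction per batch: commits on success, rolls back on error
        with conn:
            conn.executemany("INSERT INTO companies VALUES (?, ?)", batch)
        total += len(batch)
    return total


# Demo with generated data instead of the real text file
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, key TEXT)")
sample = [f"Company {i}:K{i}\n" for i in range(5000)]
inserted = load_lines(conn, sample, batch_size=2000)
row_count = conn.execute("SELECT COUNT(*) FROM companies").fetchone()[0]
```

With a real file you would pass the open file object as `lines`, since iterating over it yields one line at a time.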

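Give it a try; it should work. It would also help if you posted more information about the error that occurs, because right now there is not enough context to answer your question definitively.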
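
One possible way to collect that information is to catch the exception around each batch and keep the full traceback text. This is only a sketch: the helper name `insert_batch_reporting` and the demo table are assumptions, and the failure below is provoked deliberately (a row with too few values), so it is not the error from the question.

```python
import sqlite3
import traceback


def insert_batch_reporting(conn, batch):
    """Insert a batch; return None on success or the traceback text on failure."""
    try:
        with conn:  # rolls the batch back if anything raises
            conn.executemany("INSERT INTO companies VALUES (?, ?)", batch)
    except (sqlite3.Error, OSError):
        # format_exc() gives the exception type, message and call stack as one string
        return traceback.format_exc()
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, key TEXT)")
ok_report = insert_batch_reporting(conn, [("Acme", "1")])
# Deliberately malformed row: one value for two placeholders
bad_report = insert_batch_reporting(conn, [("only one value",)])
rows_after = conn.execute("SELECT COUNT(*) FROM companies").fetchone()[0]
```

Posting the returned text (exception class and message) along with the approximate row number makes the problem much easier to diagnose.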