Question

I am reading a file in the following format:

#*title
#tyear
#cVenue
#index0
#!abstract

#*title
#tyear
#cVenue
#index1
#!abstract

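There are thousands of blocks. Blocks are separated by a blank line, and each block is one row of the table. I want to insert into my table after reading each block, and then clear the variables so that the next block can be read and inserted. This is my code so far:

import MySQLdb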

conn = MySQLdb.connect(host="localhost", user="root", db="Literature")
db1 = conn.cursor()

with open("path\to\my\file.txt", "rb") as f:
    for line in f:
        if line.startswith("#*"):
            title = line[2:]

        elif line.startswith("#t"):
            year = line[2:]

        elif line.startswith("#c"):
            Venue = line[2:]

        elif line.startswith("#index"):
            ID = line[6:]

        elif line.startswith("#!"):
            abstract = line[2:]

        elif line == '\n':

            db1.execute('''INSERT INTO my_table(
                ID, TITLE, YEAR, Venue, ABSTRACT)
                VALUES (%s,%s,%s,%s,%s)'''(ID, title, year, Venue, abstract))
            conn.commit()
            conn.close()

            title = None
            year = None
            Venue = None
            ID = None
            abstract = None

        else:
            continue

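The code runs without errors, but my table is empty. Can someone point out where I am going wrong? Is there a different way I can check whether I have reached the end of a block?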

Answer 1

You can check whether the line is empty like this:

elif line.strip() == '':
    # your code
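
To see why this check can succeed where line == '\n' does not, here is a minimal sketch. It assumes the file may have Windows line endings, which is a guess based on the backslash path in the question; is_block_end is a hypothetical helper name, not part of any library.

```python
def is_block_end(line):
    """Return True for a separator line, whatever its line ending."""
    # strip() removes '\n', '\r\n', spaces and tabs alike, so an
    # "empty" line is one with nothing left after stripping.
    return not line.strip()


# Compare the exact-match test from the question with the stripped test.
for raw in ['\n', '\r\n', '   \n', '#*title\n']:
    print(repr(raw), raw == '\n', is_block_end(raw))
```

A file opened in "rb" mode keeps any '\r' before the '\n', so on such a file the exact comparison never matches and the insert branch is never reached.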

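Alternatively, you could insert a special character that marks the end of a block when the file is generated. You can also use a regular expression like this: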

import re
# some code
elif re.match(r'^[\s\t]*$', line):
    # your code
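
Putting the blank-line check together with the rest of the loop, one possible shape is sketched below. It is only a sketch under assumptions: lines are read as text (str), the table and column names are the ones from the question, and parse_blocks, insert_blocks and PREFIXES are hypothetical names introduced here. Note that the query and its parameters go to execute as two separate arguments, and that the connection is committed and closed once, after the loop, not inside it.

```python
# Map each line prefix from the file format to a field name.
PREFIXES = [
    ('#index', 'ID'),
    ('#*', 'title'),
    ('#t', 'year'),
    ('#c', 'venue'),
    ('#!', 'abstract'),
]

INSERT_SQL = ("INSERT INTO my_table (ID, TITLE, YEAR, Venue, ABSTRACT) "
              "VALUES (%s, %s, %s, %s, %s)")


def parse_blocks(lines):
    """Yield one dict per block; blocks are separated by blank lines."""
    record = {}
    for line in lines:
        line = line.strip()          # drops '\n' and '\r\n' alike
        if not line:                 # blank line: the block is complete
            if record:
                yield record
                record = {}
            continue
        for prefix, field in PREFIXES:
            if line.startswith(prefix):
                record[field] = line[len(prefix):]
                break
    if record:                       # last block may lack a trailing blank line
        yield record


def insert_blocks(cursor, lines):
    """Insert every parsed block via a DB-API cursor; return the row count."""
    count = 0
    for rec in parse_blocks(lines):
        params = tuple(rec.get(key)
                       for key in ('ID', 'title', 'year', 'venue', 'abstract'))
        cursor.execute(INSERT_SQL, params)   # SQL and params are separate arguments
        count += 1
    return count


# Intended use with the connection from the question (not executed here):
#     with open(path) as f:
#         insert_blocks(db1, f)
#     conn.commit()
#     conn.close()
```

Starting each block from a fresh dict replaces the manual resetting of five variables to None, and a field missing from a block is passed as None for that column.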
