Question

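I wrote a function that extracts the "date" of each article in my text file (it is on line 4 or line 5 of each article). The challenge now is to create a text file that contains only the month and the year. This is my main function: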

def main():
    for i in range(len(sections)): 
        print(sections[i].split("\n")[4])       
        print(sections[i].split("\n")[5])
main()

This gives me the following text:

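As you can see, the date is not stored on the same line in every article. The date formats also differ: dates stored on line 4 of the original text include the day (the first six entries), while dates stored on line 5 do not.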

Ideally, the text file would look like this:

December 2005
December 2005
December 2005
December 2005
December 2005
November 2005
...

Thanks a lot!

Answer 1

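Keeping your code structure as much as possible, this should be the solution you are looking for. It is meant to be easy to read and understand. It is not the best possible solution, though, because for that we would need to know what the sections look like or, even better, what the input file looks like.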

def main():
    with open('output.txt', 'w') as f:
        for i in range(len(sections)):
            date_row4 = sections[i].split("\n")[4].split(" ")
            date_row5 = sections[i].split("\n")[5].split(" ")

            print(date_row4)
            print(date_row5)

            month_row4 = date_row4[1]
            year_row4 = date_row4[3]

            month_row5 = date_row5[1]
            year_row5 = date_row5[3]

            if len(month_row4):  # avoid writing empty lines to the output
                f.write("{} {}\n".format(month_row4, year_row4))
            if len(month_row5):  # check row 5 here, not row 4 again
                f.write("{} {}\n".format(month_row5, year_row5))
main()

Answer 2

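What you can do is build a regular expression that extracts the different parts of the string: the day, the month, and the year. Once you have the separate components, it is easy to arrange them in the format you want, and writing them to a text file is trivial at that point.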
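A minimal sketch of the regular-expression approach might look like the following. The question does not show the actual data, so the date formats here are assumptions: "6 December 2005" (with a day) and "December 2005" (without one), using English month names. The names `DATE_RE`, `month_and_year`, and `write_dates` are hypothetical helpers, not part of the original code.

```python
import re

# Assumed formats (the real data is not shown in the question):
#   "6 December 2005"  -> day, month, year
#   "December 2005"    -> month, year
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DATE_RE = re.compile(
    r"(?:(?P<day>\d{1,2})\s+)?(?P<month>" + MONTHS + r")\s+(?P<year>\d{4})"
)


def month_and_year(line):
    """Return 'Month YYYY' if the line contains a date, else None."""
    match = DATE_RE.search(line)
    if match is None:
        return None
    return "{} {}".format(match.group("month"), match.group("year"))


def write_dates(sections, path):
    """Write one 'Month YYYY' line per article, taken from index 4 or 5."""
    with open(path, "w") as f:
        for section in sections:
            lines = section.split("\n")
            # The slice is safe even if a section has fewer than six lines.
            for candidate in lines[4:6]:
                result = month_and_year(candidate)
                if result:
                    f.write(result + "\n")
                    break  # at most one date per article
```

Calling `write_dates(sections, "output.txt")` with the question's `sections` list would then produce one line per article. Because the pattern makes the day optional, both line formats go through the same code path; if the real dates use abbreviated or non-English month names, the `MONTHS` alternation would need to be adjusted.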

Answer 3

I think it could be:

def main():

    for s in sections:

        lines = s.split("\n")

        if lines[4]:
            parts = lines[4].split(' ')
            print(parts[0], parts[2]) 

        if lines[5]:
            parts = lines[5].split(' ')
            print(parts[0], parts[2])

Edit: with numbers

def main():

    for number, s in enumerate(sections, 1):

        lines = s.split("\n")

        if lines[4]:
            parts = lines[4].split(' ')
            print(number, parts[0], parts[2]) 

        if lines[5]:
            parts = lines[5].split(' ')
            print(number, parts[0], parts[2])
