Question

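I'm writing a script that splits a text file into multiple lists based on the number of words on each line. I also need to generate a dictionary, but don't worry about that; I'm having trouble splitting this text:

Say I have: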

word1:
word word

more words
word2:
another word
word3:
word4:

I want:

[[[word:], [word word], [more words]],[[word2:], [another word]], 
[[word3:]], [[word4:]]]

Here's the code:

from typing import List, Dict, TextIO, Tuple
def read_file(TextIO) -> Dict[str, List[tuple]]:

text = open('text_file.txt', 'r')
data = []
indexes = []

for line in text.readlines():
    l =  line.strip().split(',')
    data.append(l)
    for lists in data:
        if lists == ['']:
            data.remove(lists)

for elements in data:
    if len(elements) == 1:
        if ':' in elements[0][-1]:
            indexes.append(data.index(elements))

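How can I use the indexes to cut the data at the parts I need? Or how else could I cut the text file into the parts I need without using modules?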

Answer 1

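You are performing a series of operations that make no sense, possibly left over from earlier attempts. None of your data contains commas, so .split(',') is unnecessary. I also can't see what you intend to do with indexes afterwards.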
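To see why the split does nothing useful here, a quick check on one of the sample lines may help (the variable names are only for illustration):

```python
# A line from the sample input; it contains no comma.
line = "word word\n"

# Splitting on a separator that never occurs returns the whole string
# wrapped in a one-element list.
parts = line.strip().split(',')
print(parts)   # ['word word']

# A blank line becomes [''], which is why the original loop had to filter it out.
blank = "\n".strip().split(',')
print(blank)   # ['']
```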

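Take this approach: a line that ends with : starts a new list; every other phrase is appended to the last list. The only exception is a blank line. It seems these should be discarded, otherwise they would add '' to one of the lists.

So all that's needed is this short piece of code: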

data = []
with open('text.txt', 'r') as text:
    for line in text:
        line = line.strip()
        if line:
            if line.endswith(':'):
                data.append([line])
            else:
                data[-1].append(line)
print(data)

Output, as requested:

[['word1:', 'word word', 'more words'], ['word2:', 'another word'], ['word3:'], ['word4:']]
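One caveat: data[-1] raises an IndexError if a phrase appears before the first line ending in ':'. Below is a minimal sketch of the same approach wrapped in a function that guards against that case. The name group_lines and the choice to start an unlabeled group (rather than fail) are my assumptions, not something the question specifies:

```python
def group_lines(lines):
    """Group stripped, non-blank lines under the preceding line that ends with ':'."""
    data = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # drop blank lines
        if line.endswith(':') or not data:
            # A header starts a new group; so does a phrase that appears
            # before any header (assumption: keep it rather than fail).
            data.append([line])
        else:
            data[-1].append(line)
    return data


# The sample input from the question, as a list of raw lines.
sample = ["word1:\n", "word word\n", "\n", "more words\n",
          "word2:\n", "another word\n", "word3:\n", "word4:\n"]
result = group_lines(sample)
print(result)
```

Because a file object iterates over its lines, passing the open file works the same way: with open('text.txt') as text: data = group_lines(text).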
