Question

I have a .txt file that contains some words, for example:

bye

bicycle

bi
cyc
le

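I want to return a list containing all the words in the file. I have tried some code that actually works, but I think it would take a long time to run on larger files. Is there a way to make this code more efficient?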

lst1 = []
with open('file.txt', 'r') as f:
    for line in f:
        if line == '\n':  # blank line
            lst1.append(line)
        else:
            # the way I found most efficient to concatenate the letters of a specific word
            lst1.append(line.replace('\n', ''))
    str1 = ''.join(lst1)
    lst_fin = str1.split()

Expected output:

lst_fin = ['bye', 'bicycle', 'bicycle']

Answer 1

I don't know whether this is more efficient, but at least it's an alternative... :)

with open('file.txt') as f:
    words = f.read().replace('\n\n', '|').replace('\n', '').split('|')
print(words)

...or, if you don't want to insert a character like '|' into the data (it might already occur there), you can also do this:

with open('file.txt') as f:
    words = f.read().split('\n\n')
    words = [w.replace('\n', '') for w in words]
print(words)

The result is the same in both cases:

# ['bye', 'bicycle', 'bicycle']
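Since the question is about speed, here is a rough sketch of how the two variants above could be timed against each other with `timeit`. The function names, the generated sample data, and the repeat count are my own assumptions for illustration; real numbers will depend on your file and machine.

```python
import os
import tempfile
import timeit

# Hypothetical test data: the sample from the question repeated many times.
SAMPLE = "\n\n".join(["bye", "bicycle", "bi\ncyc\nle"] * 10_000)


def with_placeholder(path):
    # First variant: mark word boundaries with '|', then drop line breaks.
    with open(path) as f:
        return f.read().replace('\n\n', '|').replace('\n', '').split('|')


def with_split(path):
    # Second variant: split on blank lines, then drop line breaks per word.
    with open(path) as f:
        return [w.replace('\n', '') for w in f.read().split('\n\n')]


def main():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'file.txt')
        with open(path, 'w') as f:
            f.write(SAMPLE)
        # Both variants should agree before their timings are compared.
        assert with_placeholder(path) == with_split(path)
        for fn in (with_placeholder, with_split):
            seconds = timeit.timeit(lambda: fn(path), number=20)
            print(f'{fn.__name__}: {seconds:.3f} s for 20 runs')


if __name__ == '__main__':
    main()
```

Both variants read the whole file into memory, so for very large files a line-by-line approach may be the safer choice even if it turns out slower.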

Edit:

I think I have yet another approach. However, IIUC, it requires that the file does not start with a blank line.

with open('file.txt') as f:
    res = []
    current_elmnt = next(f).strip()
    for line in f:
        if line.strip():
            current_elmnt += line.strip()
        else:
            res.append(current_elmnt)
            current_elmnt = ''
    if current_elmnt:  # the last word has no blank line after it
        res.append(current_elmnt)
print(res)

Maybe you want to give it a try...

Answer 2

You can use the two-argument form of the iter function with '' as the sentinel:

with open('file.txt') as f:
    lst_fin = list(iter(lambda: ''.join(iter(map(str.strip, f).__next__, '')), ''))

Demo: https://repl.it/@blhsing/TalkativeCostlyUpgrades
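In case the one-liner is hard to read, here is an unrolled sketch of the same idea. The function name `read_words` and the helper `next_word` are names I made up for illustration. `iter(callable, sentinel)` keeps calling `callable` until it returns the sentinel, and it also stops if the callable raises `StopIteration`, which is what happens at the end of the file.

```python
def read_words(path):
    """Join consecutive non-blank lines into words; blank lines separate words."""
    with open(path) as f:
        stripped = map(str.strip, f)  # lazily strips each line as it is read

        def next_word():
            # Inner iter(): pull lines until a blank one ('' after strip) or EOF,
            # then glue the collected pieces together.
            return ''.join(iter(stripped.__next__, ''))

        # Outer iter(): keep calling next_word() until it returns '',
        # which happens once the file is exhausted.
        return list(iter(next_word, ''))
```

Note that, like the one-liner, this stops early if the file starts with a blank line or contains two blank lines in a row, because an empty word looks the same as the end-of-file sentinel.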

Answer 3

You can use this (I don't know how efficient it is):

lst = []
s = ''
with open('tp.txt', 'r') as file:
    for line in file:
        if line == '\n':
            # a blank line ends the current word
            lst.append(s)
            s = ''
        else:
            s += line.rstrip()
if s:  # the last word is not followed by a blank line
    lst.append(s)
print(lst)
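A different technique, not the one used in the code above: if the file may contain several blank lines in a row, or leading and trailing blank lines, `itertools.groupby` can group the lines into runs of blank and non-blank ones. This is only a sketch; the function name `read_words` is an assumption for illustration, and I have not measured whether it is faster.

```python
from itertools import groupby


def read_words(path):
    """Return one word per run of consecutive non-blank lines."""
    with open(path) as f:
        stripped = (line.strip() for line in f)
        # bool('') is False, so groupby alternates between runs of blank
        # lines (key False) and runs of non-blank lines (key True).
        return [
            ''.join(group)
            for has_text, group in groupby(stripped, key=bool)
            if has_text
        ]
```

It reads the file line by line, so it does not need to hold the whole file in memory at once, only the resulting list of words.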

Fastest way to convert a file to a list?
