Question

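How can I iterate over each line of a text file and copy the author names into a list using Python? The text file I'm working with contains the following quotes, each with the author's name at the end: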

Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton
No man means all he says, and yet very few say all they mean, for words are slippery and thought is viscous. --- Henry B. Adams
One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams

Answer 1

Try this:

authors_list = []
with open('file.txt', 'r') as f:
    for line in f:
        text = line.rstrip('\n').split(" --- ")
        if len(text) > 1:
            authors_list.append(text[1])
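
A variant of the same idea, as a minimal sketch: the loop is wrapped in a function that accepts any iterable of lines, so it can be tried on an in-memory list as well as on an open file. The name `extract_authors` is hypothetical, and using `str.rpartition` (which splits on the last occurrence of the separator) is an assumption made here so that a quote containing " --- " itself would still work; neither comes from the answer above.

```python
def extract_authors(lines, sep=" --- "):
    """Collect the text after the last `sep` on each line (hypothetical helper)."""
    authors = []
    for line in lines:
        # rpartition returns (before, sep, after); the middle item is '' when sep is absent
        _, found, author = line.rstrip("\n").rpartition(sep)
        if found:
            authors.append(author.strip())
    return authors


sample = [
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n",
    "A line without an author\n",
    "One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n",
]
print(extract_authors(sample))
```

With a real file, the open file object can be passed directly, for example `with open('file.txt') as f: authors_list = extract_authors(f)`.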

Answer 2

Using a regular expression, you can do it as follows:

import re

with open('text.txt') as f:
    txt = f.read()

# '.' does not match a newline, so '.*' runs from just after '---' to the end of the line
authors = [a.strip() for a in re.findall(r'(?<=---).*', txt)]
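
As a sketch of a slightly different pattern, the whitespace trimming can be folded into the regular expression by capturing only the author and anchoring on the end of each line with `re.MULTILINE`. The `text` variable below is a stand-in for the file contents, and the pattern is an illustration rather than the one used in the answer.

```python
import re

# Sample text standing in for the contents of the file
text = (
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n"
    "One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams"
)

# '---', optional spaces or tabs, then the author up to the end of the line.
# With re.MULTILINE, '$' matches at the end of every line, not only at the end of the text.
pattern = re.compile(r"---[ \t]*(.+?)[ \t]*$", re.MULTILINE)
authors = pattern.findall(text)
print(authors)
```

Because the pattern has a single capturing group, `findall` returns just the captured author strings.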

Answer 3

Here is a generator-based solution, just for a bit of fun:

# Generate stream manipulators
def strip(stream):
    """Strips whitespace from stream entries"""

    for entry in stream:
        yield entry.strip()

def index(i, stream):
    """Takes the i-th element from the stream entries"""

    for entry in stream:
        yield entry[i]

def split(token, stream):
    """Splits the entries in the stream based on the token"""

    for entry in stream:
        yield entry.split(token)

# Actual function to do the work
def authors(filename):
    """Yields the authors from the file format"""

    # Open the file in a with block so it is closed when the generator finishes
    with open(filename) as f:
        for entry in strip(index(1, split('---', f))):
            yield entry

print(list(authors('file.txt')))

Generator/filter/pipeline-based solutions work nicely for this kind of task.
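
The same pipeline idea can be sketched more compactly with generator expressions, where each stage lazily consumes the previous one. The function name `author_stream` is hypothetical, and the extra filter stage (skipping lines without the token) and taking the last piece instead of index 1 are assumptions added here.

```python
def author_stream(lines, token="---"):
    """Lazily yield the author from each line that contains the token."""
    parts = (line.split(token) for line in lines)    # split stage
    with_author = (p for p in parts if len(p) > 1)   # filter stage: drop lines without the token
    return (p[-1].strip() for p in with_author)      # take the last piece and strip whitespace


sample = [
    "Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n",
    "\n",
    "One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n",
]
print(list(author_stream(sample)))
```

Nothing is computed until the result is iterated, so the stages run one line at a time.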

Answer 4

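The following should also work. readlines() reads the complete file and loads it into memory, so be careful using it with large files. For smaller ones, this should be perfectly fine.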

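n = []
with open('test1.txt') as fd:
    lines = fd.readlines()  # loads every line into memory at once
    for line in lines:
        parts = line.split('---')
        if len(parts) > 1:  # skip lines without an author
            n.append(parts[1].strip())

print(n)

Output: ['Lord Acton', 'Henry B. Adams', 'Henry B. Adams']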
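
For large files, one way to avoid the memory cost mentioned above is to loop over the file object itself, which yields one line at a time. The following is a minimal sketch under that assumption; `read_authors` is a hypothetical helper, and the temporary file only exists so the example can be run as-is.

```python
import os
import tempfile


def read_authors(path, sep="---"):
    """Read authors line by line without loading the whole file (hypothetical helper)."""
    found = []
    with open(path, encoding="utf-8") as fd:
        for line in fd:  # the file object yields one line at a time
            if sep in line:
                found.append(line.split(sep)[1].strip())
    return found


# Write a small throwaway file containing two of the quotes from the question
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("Power tends to corrupt and absolute power corrupts absolutely. --- Lord Acton\n")
    tmp.write("One friend in a lifetime is much; two are many; three are hardly possible. --- Henry B. Adams\n")
    path = tmp.name

result = read_authors(path)
os.remove(path)  # clean up the throwaway file
print(result)
```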
