How do I create a new list of all words, split by word and by line?

Time: 2021-04-12 15:16:56

Tags: python python-3.x string list

I have a text file:

sample.txt

Hi I am student
I am from 

This is what I tried:

import string
import re

def read_to_list1(filepath):
    text_as_string = open(filepath, 'r').read()
    x = re.sub('['+string.punctuation+']', '', text_as_string).split("\n")
    
    for i in x:
        x_as_string = re.sub('['+string.punctuation+']', '', i).split()
        print(x_as_string)

read_to_list1('sample.txt')

This prints:

['Hi','I','am','student']
['I','am','from']

I want the result to be:

[['Hi','I','am','student'],['I','am','from']]

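2 Answers:

Answer 0 (score: 1)

After opening the file, you can iterate over its lines with a list comprehension and call str.split on each line. With no arguments, str.split splits on whitespace, so each line becomes one sublist of tokens.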

def read_to_list1(filepath):
    with open(filepath, 'r') as f_in:
        return [line.split() for line in f_in]
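
A minimal sketch of the same comprehension, run against an in-memory file so it can be tried without sample.txt. The helper name lines_to_word_lists and the skip_blank flag are hypothetical and not part of the answer above; they only illustrate that a blank line produces an empty sublist unless it is filtered out.

```python
import io

def lines_to_word_lists(lines, skip_blank=True):
    """Split each line on whitespace; optionally drop lines with no words."""
    rows = [line.split() for line in lines]
    if skip_blank:
        # A blank line splits into [], which is falsy, so this filters it out.
        rows = [row for row in rows if row]
    return rows

# io.StringIO stands in for the opened file (assumed contents of sample.txt
# plus one trailing blank line).
sample = io.StringIO("Hi I am student\nI am from \n\n")
result = lines_to_word_lists(sample)
print(result)  # [['Hi', 'I', 'am', 'student'], ['I', 'am', 'from']]
```

Note that str.split alone does not remove punctuation, so a token such as 'student.' would keep its trailing period.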

Answer 1 (score: 1)

For the specific example sample.txt, this should also work:

import re
import string

def read_to_list1(filepath):
    # Character class matching any single ASCII punctuation character
    punctuation = '[' + re.escape(string.punctuation) + ']'
    with open(filepath, 'r') as f_in:
        text_as_string = f_in.read()
    final_array = []
    # splitlines() avoids a trailing empty entry when the file ends with a newline
    for line in text_as_string.splitlines():
        final_array.append(re.sub(punctuation, '', line).split())
    return final_array

print(read_to_list1('sample.txt'))
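
As an alternative to the regular expression, punctuation can probably be removed with str.translate instead. This is only a sketch of a swapped-in technique, not the answerer's method; the names _STRIP_PUNCT and text_to_word_lists are hypothetical, and the function takes the text directly rather than a file path so the example is self-contained.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans('', '', string.punctuation)

def text_to_word_lists(text):
    """Return one list of punctuation-free words per non-empty line."""
    rows = []
    for line in text.splitlines():
        words = line.translate(_STRIP_PUNCT).split()
        if words:  # skip lines that contain no words after stripping
            rows.append(words)
    return rows

demo = text_to_word_lists("Hi, I am student.\nI am from\n")
print(demo)  # [['Hi', 'I', 'am', 'student'], ['I', 'am', 'from']]
```

This version only handles the ASCII characters in string.punctuation; other punctuation, such as curly quotes, would pass through unchanged.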