Question

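The issue with reading the file contents is that, when read into a list, each line comes in as one big string. Students need to be able to use this "read in" data from the file, isolate the ID number, and return the details of a Student (for example).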

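I'm aware there are several ways to do this, e.g. regular expressions, converting to a string, and using the split method, but for teaching purposes we are interested in the simplest, most elegant method (and by elegant, I mean one that avoids multiple unnecessary steps). Ideally, is there a way to read it directly from the text file into a list in the desired format: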

For example, instead of the current format (which also includes the \n that I need to remove):

['001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n', '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n']

Desired format: a 1D or 2D list, as shown below

[['001','Joe','Bloggs','Test1:99','Test2:100','Test3:33'],['002','Ash','Smith','Test1:22','Test2:63','Test3:99']]

I'm happy for people to post solutions that use regex and string splitting, as they will help others, but is there a simpler way to do this?

Full code listing, including the text file (online repl):

https://repl.it/J8jB/2

Code:

f = open("studentinfo.txt","r") 
myList = []
for line in f:
    myList.append(line)
print(myList)
print()
print()
print(myList[0])
myList.split(",")
print(myList)

#split the list where all the individual elements in the current string (in the list) are split up at the ","

Text file

001,Joe,Bloggs,Test1:99,Test2:100,Test3:33
002,Ash,Smith,Test1:22,Test2:63,Test3:99

Answer 1

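After building the list (or by setting l directly to the file handle, with no need to store the list first), I would simply rstrip and split in a list comprehension, like this: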

l = ['001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n', '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n']

newl = [v.rstrip().split(",") for v in l]

print(newl)

Result:

[['001', 'Joe', 'Bloggs', 'Test1:99', 'Test2:100', 'Test3:33'], ['002', 'Ash', 'Smith', 'Test1:22', 'Test2:63', 'Test3:99']]
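The remark about using the file handle directly can be sketched as follows. This is a minimal illustration, not part of the original answer: the function name read_rows is hypothetical, and the demo writes the question's sample data to a temporary file so that the snippet runs on its own.

```python
import os
import tempfile


def read_rows(path):
    """Return one list of fields per line of a comma-separated file."""
    with open(path, "r") as f:
        # Iterating over the file handle yields one line at a time;
        # rstrip() drops the trailing newline before the split.
        return [line.rstrip().split(",") for line in f]


# Demo: write the question's sample data to a temporary file, then read it back.
sample = (
    "001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n"
    "002,Ash,Smith,Test1:22,Test2:63,Test3:99\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

rows = read_rows(tmp.name)
os.remove(tmp.name)
print(rows)
```

With the real file, the call would presumably be read_rows("studentinfo.txt").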

For a flat list, use a double loop instead (or itertools.chain.from_iterable; there are many ways to do this):

newl = [x for v in l for x in v.rstrip().split(",")]

Without a list comprehension (just for "readability" while you are not yet used to list comprehensions; switch to them afterwards :)):

newl = []
for v in l:
    newl.append(v.rstrip().split(","))

(Use extend instead of append to get a flat list.)
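For completeness, here is a small sketch of the extend variant, reusing the same sample list l as above (the name flat is only illustrative):

```python
l = ['001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n',
     '002,Ash,Smith,Test1:22,Test2:63,Test3:99\n']

flat = []
for v in l:
    # extend adds each field individually, so no nested lists are created
    flat.extend(v.rstrip().split(","))

print(flat)
```

The only difference from the append version is that the fields of every line end up in one single list.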

And of course I always forget to mention csv, whose default delimiter is the comma and which strips the newlines:

import csv
newl = list(csv.reader(l))

Flat (this time using itertools):

import itertools
newl = list(itertools.chain.from_iterable(csv.reader(l)))

(For the csv module, l can be either a file handle or a list of lines.)

Answer 2

This is a good use case for the csv module:

import csv

with open("studentinfo.txt","r") as f:
    rd = csv.reader(f)
    lst = list(rd)    # lst is a list of lists in expected format
    ...               # further processing on lst

Alternatively, processing the file line by line is trivial:

with open("studentinfo.txt","r") as f:
    rd = csv.reader(f)
    for row in rd:          # row is list of fields
        ...                 # further processing on row
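To connect this back to the question's goal of isolating an ID number and returning that student's details, the row-by-row loop could be used roughly as below. This is only a sketch under stated assumptions: find_student is a hypothetical helper, the ID is assumed to be the first field and is compared as a string (so leading zeros are kept), and io.StringIO stands in for the open file.

```python
import csv
import io


def find_student(lines, student_id):
    """Return the row whose first field equals student_id, or None."""
    for row in csv.reader(lines):
        # Skip blank lines (csv.reader yields [] for them)
        if row and row[0] == student_id:
            return row
    return None


# io.StringIO stands in for open("studentinfo.txt", "r") in this demo.
data = io.StringIO(
    "001,Joe,Bloggs,Test1:99,Test2:100,Test3:33\n"
    "002,Ash,Smith,Test1:22,Test2:63,Test3:99\n"
)
student = find_student(data, "002")
print(student)

# Optionally turn the "TestN:score" fields into a dictionary of strings.
scores = dict(field.split(":") for field in student[3:])
print(scores)
```

Returning None for an unknown ID is a design choice made here for simplicity; raising an exception would work just as well.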

Splitting a list read in from a file at the commas into a list of individual elements
