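I have the following code, which successfully strips end-of-line characters when reading from a file, but does not do so for any leading and trailing whitespace (I want the spaces in the middle to be left alone!).
What is the best way to achieve this? (Note: this is a specific example, so it is not a duplicate of questions about general methods for stripping strings.)
My code (try it with the test data "Mr Moose", which is not found; if you try "Mr Moose " instead, that is, with a space after Moose, it works):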
#A COMMON ERROR is leaving in blank spaces and then finding you cannot work with the data in the way you want!
"""Try the following program with the input: Mr Moose
...it doesn't work..........
but if you try "Mr Moose " (that is a space after Moose..."), it will work!
So how to remove both new lines AND leading and trailing spaces when reading from a file into a list. Note, the middle spaces between words must remain?
"""
alldata=[]
col_num=0
teacher_names=[]
delimiter=":"
with open("teacherbook.txt") as f:
    for line in f.readlines():
        alldata.append((line.strip()))
print(alldata)
print()
print()
for x in alldata:
    teacher_names.append(x.split(delimiter)[col_num])
teacher=input("Enter teacher you are looking for:")
if teacher in teacher_names:
    print("found")
else:
    print("No")
Desired output when the list alldata is generated:
['Mr Moose:Maths', 'Mr Goose:History', 'Mrs Congenelipilling:English']
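That is, remove all leading and trailing whitespace at the start of each line and before or after the delimiter. The spaces between words, as in Mr Moose, must remain.
Contents of teacherbook.txt: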
Mr Moose : Maths
Mr Goose: History
Mrs Congenelipilling: English
Thanks in advance.
Answer 0 (score: 7)
You can use a regular expression:
import re
txt='''\
Mr Moose : Maths
Mr Goose: History
Mrs Congenelipilling: English'''
>>> [re.sub(r'\s*:\s*', ':', line).strip() for line in txt.splitlines()]
['Mr Moose:Maths', 'Mr Goose:History', 'Mrs Congenelipilling:English']
So your code becomes:
import re
col_num=0
teacher_names=[]
delimiter=":"
with open("teacherbook.txt") as f:
    alldata=[re.sub(r'\s*{}\s*'.format(delimiter), delimiter, line).rstrip() for line in f]
print(alldata)
for x in alldata:
    teacher_names.append(x.split(delimiter)[col_num])
print(teacher_names)
Prints:
['Mr Moose:Maths', 'Mr Goose:History', 'Mrs Congenelipilling:English']
['Mr Moose', 'Mr Goose', 'Mrs Congenelipilling']
The key part is the regular expression:
re.sub(r'\s*{}\s*'.format(delimiter), delimiter, line).rstrip()
         ^^^        zero or more whitespace characters before the delimiter
            ^^      placeholder for the delimiter
              ^^^   zero or more whitespace characters after the delimiter
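One caveat with building the pattern via format(): a delimiter such as "|" or "." has a special meaning in a regular expression. The following is a minimal sketch, assuming you may want other delimiters; the helper name normalize_line is hypothetical and not part of the answer above. It uses re.escape so the delimiter is matched literally, and strip() so leading whitespace is removed as well.

```python
import re

def normalize_line(line, delimiter=":"):
    """Remove whitespace around the delimiter and at both ends of the line."""
    # re.escape makes delimiters such as "|" or "." match literally
    # instead of being interpreted as regex syntax.
    pattern = r"\s*{}\s*".format(re.escape(delimiter))
    return re.sub(pattern, delimiter, line).strip()

print(normalize_line("  Mr Moose : Maths\n"))        # Mr Moose:Maths
print(normalize_line("Mr Goose |  History\n", "|"))  # Mr Goose|History
```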
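For an all-Python solution (no regex), I would use str.partition to get the left-hand and right-hand sides of the delimiter, then strip whitespace as needed:
alldata=[]
with open("teacherbook.txt") as f:
    for line in f:
        lh,sep,rh=line.rstrip().partition(delimiter)
        alldata.append(lh.rstrip() + sep + rh.lstrip())
Same output.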
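To try the partition approach without a file, here is a small sketch on in-memory lines. The function name clean_line and the sample list are assumptions made for illustration. It strips both sides of each part, so leading whitespace on the line is removed too, and a line without the delimiter is simply stripped.

```python
def clean_line(line, delimiter=":"):
    """Strip whitespace at both ends and around the first delimiter."""
    # partition splits at the first occurrence only; if the delimiter is
    # missing, sep and right are empty strings.
    left, sep, right = line.partition(delimiter)
    return left.strip() + sep + right.strip()

sample = ["Mr Moose : Maths\n", " Mr Goose: History\n", "No delimiter here \n"]
cleaned = [clean_line(line) for line in sample]
print(cleaned)
# ['Mr Moose:Maths', 'Mr Goose:History', 'No delimiter here']
```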
Another suggestion: your data is better suited to a dict than a list.
You can do:
di={}
with open("teacherbook.txt") as f:
    for line in f:
        lh,sep,rh=line.rstrip().partition(delimiter)
        di[lh.rstrip()]=rh.lstrip()
Or the comprehension version:
with open("teacherbook.txt") as f:
    di={lh.rstrip():rh.lstrip()
          for lh,_,rh in (line.rstrip().partition(delimiter) for line in f)}
Then access it like this:
>>> di['Mr Moose']
'Maths'
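With a dict, the lookup from the question can also tolerate stray spaces in what the user types. This is a sketch under stated assumptions: the in-memory lines stand in for the file, and find_subject is a hypothetical helper name.

```python
lines = ["Mr Moose : Maths\n", "Mr Goose: History\n", "Mrs Congenelipilling: English\n"]

# Build the name -> subject mapping, stripping whitespace on both sides.
subjects = {}
for line in lines:
    name, _, subject = line.partition(":")
    subjects[name.strip()] = subject.strip()

def find_subject(query):
    """Return the subject for a teacher name, or None if the name is unknown."""
    # Strip the query as well, so " Mr Moose " still matches the key "Mr Moose".
    return subjects.get(query.strip())

print(find_subject(" Mr Moose "))  # Maths
print(find_subject("Mr Nobody"))   # None
```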
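Answer 1 (score: 3)
There is no need to use readlines(); you can simply iterate over the file object to get each line, and use strip() to remove the \n and the whitespace. So you can use this list comprehension:
with open('teacherbook.txt') as f:
    alldata = [':'.join([value.strip() for value in line.split(':')])
               for line in f]
print(alldata)
Output: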
['Mr Moose:Maths', 'Mr Goose:History', 'Mrs Congenelipilling:English']
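Answer 2 (score: 2)
Change:
teacher_names.append(x.split(delimiter)[col_num])
to:
teacher_names.append(x.split(delimiter)[col_num].strip())
Answer 3 (score: 2)
"Remove all leading and trailing whitespace at the start of each line and before or after the delimiter. The spaces between words, as in Mr Moose, must remain."
You can split the string at the delimiter, strip the whitespace from each piece, and then join the pieces back together:
for line in f.readlines():
    new_line = ':'.join([s.strip() for s in line.split(':')])
    alldata.append(new_line)
Example: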
>>> lines = [' Mr Moose : Maths', ' Mr Goose : History ']
>>> lines
[' Mr Moose : Maths', ' Mr Goose : History ']
>>> data = []
>>> for line in lines:
...     new_line = ':'.join([s.strip() for s in line.split(':')])
...     data.append(new_line)
...
>>> data
['Mr Moose:Maths', 'Mr Goose:History']
Answer 4 (score: 1)
You can do this easily with a regex, using re.sub:
import re
re.sub(r"[\n \t]+$", "", "aaa \t asd \n ")
Out[17]: 'aaa \t asd'
The first argument is the pattern:
[...] - all the characters you want to remove
+ - one or more matches
$ - end of the string
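The pattern above only touches the end of the string. A possible extension that also handles the start is sketched below, assuming the same character class; the names EDGE_WHITESPACE and strip_edges are hypothetical. For this character set the result matches what text.strip("\n \t") returns.

```python
import re

# Match a run of the listed characters at the start OR at the end of the string.
EDGE_WHITESPACE = re.compile(r"^[\n \t]+|[\n \t]+$")

def strip_edges(text):
    """Remove newlines, spaces and tabs from both ends, keeping inner whitespace."""
    return EDGE_WHITESPACE.sub("", text)

print(repr(strip_edges("  aaa \t asd \n ")))  # 'aaa \t asd'
```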
Answer 5 (score: -2)
string.rstrip('something') removes from the right end of the string any characters that appear in 'something', like so:
a = 'Mr Moose \n'
print(repr(a.rstrip(' \n')))  # prints 'Mr Moose' rather than 'Mr Moose \n'
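It is worth checking how rstrip interprets its argument before relying on it. The short sketch below uses illustrative strings only and shows that the argument acts as a set of characters, not as a suffix, and that calling rstrip() with no argument removes all trailing whitespace.

```python
name = "Mr Moose \n"

# Explicit character set: trailing spaces and newlines are removed.
print(repr(name.rstrip(" \n")))  # 'Mr Moose'

# No argument: all trailing whitespace is removed.
print(repr(name.rstrip()))       # 'Mr Moose'

# The argument is a set of characters, so every trailing "e", "s" or "o" goes.
print(repr("Mr Moose".rstrip("eso")))  # 'Mr M'
```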