Split in Python returns an extra empty element

Date: 2016-09-08 17:57:53

Tags: python regex split

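I have a file containing some data that I read, split on spaces, commas, and \n, and put into a matrix. But my code returns an extra empty string at the end of every row of the matrix. Can anyone help me find the bug? Thanks. Code: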

import re
lines = [re.split('[,\n ]',line) for line in open('lines.txt')]
print lines

Input:

395,0 398,100
398,100 488,196
488,196 544,233
544,233 506,301
506,301 425,344
425,344 336,355
336,355 271,319
271,319 293,264
293,264 328,232
328,232 329,170
329,170 267,175
267,175 228,199
228,199 214,220
214,220 80,268
80,268 0,273
0,183 96,176
96,176 168,92
168,92 252,124
252,124 300,88
300,88 303,40
303,40 309,0

Output (the fifth column is the extra one):

[['395', '0', '398', '100', ''], ['398', '100', '488', '196', ''], ['488', '196', '544', '233', ''], ['544', '233', '506', '301', ''], ['506', '301', '425', '344', ''], ['425', '344', '336', '355', ''], ['336', '355', '271', '319', ''], ['271', '319', '293', '264', ''], ['293', '264', '328', '232', ''], ['328', '232', '329', '170', ''], ['329', '170', '267', '175', ''], ['267', '175', '228', '199', ''], ['228', '199', '214', '220', ''], ['214', '220', '80', '268', ''], ['80', '268', '0', '273', ''], ['0', '183', '96', '176', ''], ['96', '176', '168', '92', ''], ['168', '92', '252', '124', ''], ['252', '124', '300', '88', ''], ['300', '88', '303', '40', ''], ['303', '40', '309', '0', '']]

1 answer:

Answer 0 (score: 2)

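Lines read from a text file usually have a newline character at the end (unless they are the last line, in which case they might not). Because '\n' is one of the separators in your pattern, re.split also splits at that trailing newline, and the empty text after it becomes the extra '' element. It is very common to strip the newline first (e.g. using str.rstrip):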

import re
lines = [re.split('[,\n ]', line.rstrip('\n')) for line in open('lines.txt')]
print lines
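
To see where the empty string comes from, here is a minimal sketch in Python 3 syntax. The sample line is hard-coded for illustration and stands in for one line of lines.txt; the variable names are made up for this example.

```python
import re

# One line as it would come out of the file: note the trailing newline.
# (Hard-coded here for illustration; it stands in for a line of lines.txt.)
line = '395,0 398,100\n'

# '\n' is in the character class, so the trailing newline is a split point
# and the empty text after it becomes the last element.
with_newline = re.split('[,\n ]', line)

# Stripping the newline first removes that final split point.
stripped = re.split('[,\n ]', line.rstrip('\n'))

print(with_newline)  # ['395', '0', '398', '100', '']
print(stripped)      # ['395', '0', '398', '100']
```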

By the way, it is better to use a context manager to manage the open file:

with open('lines.txt') as input_file:
    lines = [re.split('[,\n ]', line.rstrip('\n')) for line in input_file]
print lines
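
For anyone who wants to try this without creating lines.txt by hand, the following is a self-contained sketch in Python 3 syntax. It assumes a temporary file holding the first two rows of the input above; the temporary file and the names sample, tmp, and path are illustrative, not part of the original code.

```python
import os
import re
import tempfile

# Assumption for illustration: the first two rows of the input,
# written to a temporary file instead of lines.txt.
sample = '395,0 398,100\n398,100 488,196\n'
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # The with block closes the file as soon as the list has been built.
    with open(path) as input_file:
        lines = [re.split('[,\n ]', line.rstrip('\n')) for line in input_file]
finally:
    # Clean up the temporary file.
    os.remove(path)

print(lines)  # [['395', '0', '398', '100'], ['398', '100', '488', '196']]
```

Once the with block exits, input_file.closed is True, even if an exception is raised while reading.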